Projects (2)
NVIDIA/Model-Optimizer
A unified library of SOTA model optimization techniques like quantization, distillation, pruning, neural architecture search, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.
3.0k · 455 · Python · 3mo ago · 1 Claude commit
NVIDIA/NVFlare
NVIDIA Federated Learning Application Runtime Environment
943 · 263 · Python · 3mo ago · 1 Claude commit