Torch-TensorRT
In-framework compilation of PyTorch inference code for NVIDIA GPUs
Torch-TensorRT is an inference compiler for PyTorch, targeting NVIDIA GPUs via NVIDIA’s TensorRT Deep Learning Optimizer and Runtime.
It supports both just-in-time (JIT) compilation workflows via the torch.compile
interface and ahead-of-time (AOT) workflows.
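As a minimal sketch of the JIT path (assuming torch_tensorrt is installed on a machine with a CUDA-capable GPU; the toy model and input shape are placeholders), importing torch_tensorrt registers a TensorRT backend that torch.compile can target:

    # A minimal JIT sketch; the model and input shape are placeholders.
    import torch
    import torch_tensorrt  # registers the "tensorrt" backend with torch.compile

    model = torch.nn.Sequential(
        torch.nn.Conv2d(3, 16, 3, padding=1),
        torch.nn.ReLU(),
    ).eval().cuda()

    x = torch.randn(1, 3, 224, 224, device="cuda")

    # Compilation happens lazily: TensorRT engines are built on the first call.
    optimized = torch.compile(model, backend="tensorrt")
    print(optimized(x).shape)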
Torch-TensorRT integrates seamlessly into the PyTorch ecosystem, supporting hybrid execution of optimized TensorRT code alongside standard PyTorch code.
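The AOT path and the hybrid fallback can be sketched with torch_tensorrt.compile; the toy model here is illustrative, and the serialization call at the end reflects the dynamo workflow in recent releases and may differ by version:

    # A minimal AOT sketch; the model and shapes are illustrative placeholders.
    import torch
    import torch_tensorrt

    model = torch.nn.Sequential(
        torch.nn.Linear(64, 32),
        torch.nn.ReLU(),
    ).eval().cuda()

    inputs = [torch.randn(8, 64, device="cuda")]

    # Compile ahead of time; operators TensorRT cannot handle fall back to
    # PyTorch, giving the hybrid execution described above.
    trt_module = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
    print(trt_module(*inputs).shape)

    # Recent releases can serialize the compiled module for later deployment
    # (API may vary by version).
    torch_tensorrt.save(trt_module, "trt_model.ep", inputs=inputs)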
More Information / System Architecture:
Getting Started
User Guide
Tutorials
Overloading Torch-TensorRT Converters with Custom Converters
Using Custom Kernels within TensorRT Engines with Torch-TensorRT
Dynamo Frontend
TorchScript Frontend
FX Frontend
Model Zoo
Compiling ResNet with dynamic shapes using the torch.compile backend
Compiling Stable Diffusion model using the torch.compile backend