Torch-TensorRT
In-framework compilation of PyTorch inference code for NVIDIA GPUs
Torch-TensorRT is an inference compiler for PyTorch, targeting NVIDIA GPUs via NVIDIA’s TensorRT Deep Learning Optimizer and Runtime.
It supports both just-in-time (JIT) compilation workflows via the torch.compile
interface and ahead-of-time (AOT) workflows.
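A minimal sketch contrasting the two workflows (the ResNet model and input shape are placeholders, not prescribed by this page):

import torch
import torchvision.models as models

import torch_tensorrt  # importing registers the "tensorrt" torch.compile backend

# Placeholder model and input; substitute your own.
model = models.resnet18(weights=None).eval().cuda()
x = torch.randn(1, 3, 224, 224).cuda()

# JIT workflow: TensorRT engines are built lazily on the first call.
jit_model = torch.compile(model, backend="tensorrt")
jit_model(x)  # compilation is triggered here

# AOT workflow: the model is compiled up front via the dynamo IR
# and can then be saved for later deployment.
aot_model = torch_tensorrt.compile(model, ir="dynamo", inputs=[x])
aot_model(x)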
Torch-TensorRT integrates seamlessly into the PyTorch ecosystem, supporting hybrid execution of optimized TensorRT code alongside standard PyTorch code.
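With hybrid execution, the graph is partitioned so that unsupported or explicitly excluded operators run in eager PyTorch between TensorRT engines. A hedged sketch using the dynamo frontend’s torch_executed_ops option (the excluded op chosen here is arbitrary):

import torch
import torch_tensorrt

# model and x as in the previous sketch
hybrid_model = torch_tensorrt.compile(
    model,
    ir="dynamo",
    inputs=[x],
    # Keep this op in PyTorch; the surrounding subgraphs still compile to TensorRT.
    torch_executed_ops={"torch.ops.aten.max_pool2d"},
)
hybrid_model(x)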
More Information / System Architecture:
Getting Started
User Guide
Tutorials
Overloading Torch-TensorRT Converters with Custom Converters
Using Custom Kernels within TensorRT Engines with Torch-TensorRT
Dynamo Frontend
TorchScript Frontend
FX Frontend
Model Zoo
Compiling ResNet with dynamic shapes using the torch.compile backend (a minimal sketch follows this list)
Compiling Stable Diffusion model using the torch.compile backend
Compiling GPT2 using the Torch-TensorRT torch.compile frontend
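As a minimal sketch of the dynamic-shape workflow from the first entry above (shapes and bounds are illustrative, not taken from the tutorial):

import torch
import torchvision.models as models

import torch_tensorrt  # registers the "tensorrt" backend

model = models.resnet18(weights=None).eval().cuda()

# Mark the batch dimension dynamic so changing it does not force a recompile.
x = torch.randn(8, 3, 224, 224).cuda()
torch._dynamo.mark_dynamic(x, 0, min=1, max=32)

compiled = torch.compile(model, backend="tensorrt")
compiled(x)                                   # engine built for the dynamic range
compiled(torch.randn(4, 3, 224, 224).cuda())  # different batch size, no rebuild expected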