In-framework compilation of PyTorch inference code for NVIDIA GPUs
Torch-TensorRT is an inference compiler for PyTorch, targeting NVIDIA GPUs via NVIDIA’s TensorRT Deep Learning Optimizer and Runtime.
It supports both just-in-time (JIT) compilation workflows via the
torch.compile interface and ahead-of-time (AOT) workflows.
Torch-TensorRT integrates seamlessly into the PyTorch ecosystem, supporting hybrid execution of optimized TensorRT code alongside standard PyTorch code.
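As a quick illustration of the two workflows, here is a minimal sketch; the model and input shapes are placeholders, and it assumes torch_tensorrt is installed and a CUDA-capable NVIDIA GPU is available:

```python
import torch
import torch_tensorrt  # registers the "tensorrt" backend for torch.compile

# Placeholder model and input for illustration only
model = torch.nn.Linear(64, 32).eval().cuda()
example_input = torch.randn(8, 64).cuda()

# JIT workflow: compile lazily through the torch.compile interface
jit_model = torch.compile(model, backend="tensorrt")
jit_out = jit_model(example_input)

# AOT workflow: compile ahead of time with torch_tensorrt.compile;
# unsupported operators fall back to PyTorch, enabling hybrid execution
aot_model = torch_tensorrt.compile(model, inputs=[example_input])
aot_out = aot_model(example_input)
```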
More Information / System Architecture: