Dynamo Converters¶
The dynamo converter library in Torch-TensorRT is located in TensorRT/py/torch_tensorrt/dynamo/conversion.
Steps¶
Operation Set¶
The converters in dynamo are produced by aten_trace and fall under aten_ops_converters (FX earlier had acc_ops_converters, aten_ops_converters, or nn_ops_converters, depending on the trace through which it was produced). The converters are registered for dynamo using the dynamo_tensorrt_converter decorator. The decorated function has the arguments network, target, args, kwargs, name, which are common across all operator schemas. These functions are mapped in the aten converter registry dictionary (at present a compilation of FX and dynamo converters; FX will be deprecated soon), with the function target name as the key.
aten_trace is produced by torch_tensorrt.dynamo.trace(..) for the export path and by torch_tensorrt.compile(ir="dynamo") for the compile path. The export path makes use of aten_tracer, whereas the alternate trace in compile is produced by the AOT Autograd library. Both simplify the torch operators to a reduced set of ATen operations.
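For orientation, here is a minimal sketch of the two entry points. The exact keyword layout of torch_tensorrt.dynamo.trace is an assumption here; check the current API signature:

```python
import torch
import torch_tensorrt

class TinyModel(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.leaky_relu(x, 0.1)

model = TinyModel().eval().cuda()
inputs = [torch.randn(1, 16).cuda()]

# Export path: produce the ATen-level trace directly
exported = torch_tensorrt.dynamo.trace(model, inputs)

# Compile path: the trace is produced internally via AOT Autograd
trt_model = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
```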
As mentioned above, if you would like to add a new converter, its implementation will be included in TensorRT/py/torch_tensorrt/dynamo/conversion/impl. Although there is a corresponding implementation of the converters in the common implementation library at TensorRT/py/torch_tensorrt/fx/impl for FX converters, this documentation focuses on the implementation of the aten_ops converters in dynamo.
Converter implementation¶
In this section, we illustrate the steps required to write a converter. We divide them according to activation, operator, lowering pass, or evaluator implementation. Each is detailed with the help of an example.
Registration
The converter needs to be registered with the appropriate op code in dynamo_tensorrt_converter.
Activation type
Example: leaky_relu
aten_ops_converters: Dynamo_converters
Define in py/torch_tensorrt/dynamo/conversion/aten_ops_converters. One needs to register the opcode generated in the trace with the dynamo_tensorrt_converter decorator. The op code to be used for the registration, i.e., the converter registry key, in this case is torch.ops.aten.leaky_relu.default.

```python
@dynamo_tensorrt_converter(torch.ops.aten.leaky_relu.default)
def aten_ops_leaky_relu(
    network: TRTNetwork,
    target: Target,
    args: Tuple[Argument, ...],
    kwargs: Dict[str, Argument],
    name: str,
) -> Union[TRTTensor, Sequence[TRTTensor]]:
    return activation.leaky_relu(
        network, target, SourceIR.ATEN, name, args[0], args[1]
    )
```
The tensorrt_converter (used for FX registration) and dynamo_tensorrt_converter are similar decorator functions with some differences.

- Both register the converters in registries (Python dictionaries): CONVERTERS and DYNAMO_CONVERTERS respectively. These two dictionaries are concatenated to form the overall converter registry.
- The dictionary is keyed on the OpOverload, which is discussed in more detail below with examples.
- Both return the decorated converter implementation.
- The CONVERTERS registry directly registers the decorated converter_implementation function, while DYNAMO_CONVERTERS takes additional arguments and registers a ConverterSupport object. The additional arguments, listed below, are shown together in the sketch after this list:
  - key: Node target for which the converter is implemented (for example, torch.ops.aten.leaky_relu.default)
  - enabled: Whether the converter should be enabled/cached or not
  - capability_validator: Function which evaluates whether a node is valid for conversion by the decorated converter. It defaults to None, implying the capability check always passes. This means all nodes of the "key" kind are supported by this converter by default. See the embedding example for more details.
  - priority: The converter's level of priority relative to other converters with the same target
- The ConverterSupport object is a compilation of converter_implementation and capability_validator.
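Putting those arguments together, here is a hedged sketch of a registration that passes them explicitly. ConverterPriority as the name of the priority enum is an assumption, and the defaults shown may differ in the source:

```python
@dynamo_tensorrt_converter(
    torch.ops.aten.leaky_relu.default,    # key: the OpOverload this converter handles
    enabled=True,                         # enabled: include the converter in the registry
    capability_validator=None,            # capability_validator: None means always valid
    priority=ConverterPriority.STANDARD,  # priority: relative to converters with the same key
)
def aten_ops_leaky_relu(network, target, args, kwargs, name):
    return activation.leaky_relu(
        network, target, SourceIR.ATEN, name, args[0], args[1]
    )
```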
The function decorated by tensorrt_converter and dynamo_tensorrt_converter has the following arguments, which are automatically generated by the trace functions mentioned above:
- network: The TensorRT network being built (the TRTNetwork in the signature above), to which the converter adds layers. The node itself arrives as a call_module or call_function with the target as its key.
- target: Target key in the call_module or call_function above. eg: torch.ops.aten.leaky_relu.default. Note that torch.ops.aten.leaky_relu is the OpOverloadPacket while torch.ops.aten.leaky_relu.default is the OpOverload.
- args: The arguments passed in the call_module or call_function above
- kwargs: The kwargs passed in the call_module or call_function above
- name: String containing the name of the target
As a user writing new converters, one just needs to take care that the appropriate arguments are extracted from the generated trace and passed to the implementation function in the implementation library, here activation.leaky_relu (which we discuss below in detail).

Operation type
Example: fmod
It follows the same steps as the above converter. In this case the opcode is torch.ops.aten.fmod.Scalar or torch.ops.aten.fmod.Tensor, so both opcodes are registered in py/torch_tensorrt/dynamo/conversion/aten_ops_converters. Note that torch.ops.aten.fmod is the OpOverloadPacket while the registry is keyed on torch.ops.aten.fmod.Scalar or torch.ops.aten.fmod.Tensor, each of which is an OpOverload.
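A hedged sketch of registering one converter for both overloads by stacking the decorator; the impl.elementwise.fmod call shown is illustrative of the pattern rather than a guaranteed signature:

```python
@dynamo_tensorrt_converter(torch.ops.aten.fmod.Scalar)
@dynamo_tensorrt_converter(torch.ops.aten.fmod.Tensor)
def aten_ops_fmod(
    network: TRTNetwork,
    target: Target,
    args: Tuple[Argument, ...],
    kwargs: Dict[str, Argument],
    name: str,
) -> Union[TRTTensor, Sequence[TRTTensor]]:
    # Both overloads share a single implementation in the impl library
    return impl.elementwise.fmod(
        network, target, SourceIR.ATEN, name, args[0], args[1]
    )
```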
Example: embedding
It follows the same steps as the above converter. In this case the opcode is torch.ops.aten.embedding.default. Some converters have special cases to be accounted for; in those cases, one should use capability_validators to register the converter with @dynamo_tensorrt_converter. We illustrate this through torch.ops.aten.embedding.default. It has the parameters scale_grad_by_freq and sparse, which are not currently supported by the implementation. In such cases we can write a validator embedding_param_validator, which reports that the converter is unsupported when those parameters are set, and register the converter with it, as sketched below.
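A hedged sketch of this validator-based registration. The argument positions follow the aten.embedding schema (weight, indices, padding_idx, scale_grad_by_freq, sparse); the impl call and exact helper layout are illustrative:

```python
from torch.fx.node import Node

def embedding_param_validator(embedding_node: Node) -> bool:
    # Reject nodes that request the unsupported options
    # (kwargs handling omitted for brevity).
    scale_grad_by_freq = (
        embedding_node.args[3] if len(embedding_node.args) > 3 else False
    )
    sparse = embedding_node.args[4] if len(embedding_node.args) > 4 else False
    return not (scale_grad_by_freq or sparse)

@dynamo_tensorrt_converter(
    torch.ops.aten.embedding.default,
    capability_validator=embedding_param_validator,
)
def aten_ops_embedding(
    network: TRTNetwork,
    target: Target,
    args: Tuple[Argument, ...],
    kwargs: Dict[str, Argument],
    name: str,
) -> Union[TRTTensor, Sequence[TRTTensor]]:
    # For aten.embedding, args[0] is the weight tensor and args[1] the indices
    return impl.embedding.embedding(
        network, target, SourceIR.ATEN, name, args[1], args[0]
    )
```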
So if there is a new converter for which certain special cases are not to be supported, they can be specified in the capability_validator.

Evaluator type
Example: operator.getitem
Evaluators are categorized as such because they do not make any modification to the graph. They are implemented in py/torch_tensorrt/dynamo/conversion/op_evaluators.py, with the corresponding capability_validator. The opcode is operator.getitem.
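A hedged sketch of what such an evaluator looks like: rather than adding TensorRT layers, it simply executes the Python operation on the already-converted arguments (the capability_validator is omitted here for brevity):

```python
import operator

@dynamo_tensorrt_converter(operator.getitem)
def generic_evaluator(
    network: TRTNetwork,
    target: Target,
    args: Tuple[Argument, ...],
    kwargs: Dict[str, Argument],
    name: str,
) -> Union[TRTTensor, Sequence[TRTTensor]]:
    # Evaluators run the target directly instead of building network layers
    return target(*args, **kwargs)
```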
Implementation Library
The dynamo converters are located in py/torch_tensorrt/dynamo/conversion/impl.
Activation
Example: leaky_relu
The implementation is to be placed in py/torch_tensorrt/dynamo/conversion/impl/activation.py. This is where all the activation functions are defined and implemented.

```python
def leaky_relu(
    network: TRTNetwork,
    target: Target,
    source_ir: Optional[SourceIR],
    name: str,
    input_val: TRTTensor,
    alpha: Optional[Any],
):
    # implementation
```

The implementation function has the following arguments.
- network: network passed from the decorated function registration
- target: target passed from the decorated function registration
- source_ir: Enum attribute. The SourceIR enum is defined in py/torch_tensorrt/dynamo/conversion/impl/converter_utils
- name: name passed from the decorated function registration
- input_val: Appropriate argument extracted from the decorated function's args or kwargs
- alpha: Appropriate argument extracted from the decorated function's args or kwargs. If not None, it will set the alpha attribute of the created TensorRT activation layer. eg: used in leaky_relu, elu, hardtanh
- beta: Appropriate argument extracted from the decorated function's args or kwargs. If not None, it will set the beta attribute of the created TensorRT activation layer. eg: used in hardtanh
- dyn_range_fn: An optional function which takes the dynamic range of a TensorRT Tensor and returns the output dynamic range
The implementation functions call the convert_activation function in py/torch_tensorrt/dynamo/conversion/impl/activation.py. This function will add the appropriate activation layer via network.add_activation, as the sketch below illustrates.
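A hedged sketch of such an implementation, assuming convert_activation accepts the operation type plus the optional alpha and dyn_range_fn arguments described above (exact signatures may differ in the source):

```python
import tensorrt as trt

def leaky_relu(
    network: TRTNetwork,
    target: Target,
    source_ir: Optional[SourceIR],
    name: str,
    input_val: TRTTensor,
    alpha: Optional[Any],
):
    # Mirror the activation's math so quantization dynamic ranges
    # can be propagated through the layer.
    def leaky_relu_dyn_range_fn(dyn_range):
        return (
            max(0, dyn_range[0]) + alpha * min(0, dyn_range[0]),
            max(0, dyn_range[1]) + alpha * min(0, dyn_range[1]),
        )

    return convert_activation(
        network,
        target,
        source_ir,
        name,
        trt.ActivationType.LEAKY_RELU,  # maps to network.add_activation
        input_val,
        alpha=alpha,
        dyn_range_fn=leaky_relu_dyn_range_fn,
    )
```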
Operator
For dynamo, the implementation is to be placed in py/torch_tensorrt/dynamo/conversion/impl/elementwise/ops.py. This is where all the elementwise functions are defined and implemented. For a new operator, one should identify the category to which it belongs. Following are some examples:
- Elementwise operators like fmod are present in py/torch_tensorrt/dynamo/conversion/impl/elementwise. The py/torch_tensorrt/dynamo/conversion/impl/elementwise/base module contains base functions for elementwise operators.
- Unary operators like sqrt will be present in py/torch_tensorrt/dynamo/conversion/impl/unary. The py/torch_tensorrt/dynamo/conversion/impl/unary/base module contains base functions for unary operators.
- Normalization operators like softmax, layer_norm, and batch_norm will be present in py/torch_tensorrt/dynamo/conversion/impl/normalization. Since there are no base operations common to all of them, there is no base file; one can choose to implement one if common functions emerge across the normalization operations.
- Individual operators like slice, select, where, and embedding will be present in py/torch_tensorrt/dynamo/conversion/impl/*.py. They have individual operator implementations with the same API structure as above but with different individual arguments.

Please note that the above operators may have common helper functions, which should be placed in py/torch_tensorrt/dynamo/conversion/impl/converter_utils.py. The sketch after this list illustrates the elementwise pattern.
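A hedged sketch of an elementwise implementation for fmod, assuming a convert_binary_elementwise helper in the elementwise base module and a trunc_div helper alongside it (the helper names and argument order are assumptions):

```python
import tensorrt as trt

def fmod(
    network: TRTNetwork,
    target: Target,
    source_ir: Optional[SourceIR],
    name: str,
    input: TRTTensor,
    other: TRTTensor,
) -> TRTTensor:
    # fmod(a, b) = a - trunc_div(a, b) * b, composed from elementwise layers
    trunc_div_output = trunc_div(
        network, target, source_ir, f"{name}_trunc_div", input, other
    )
    prod = convert_binary_elementwise(
        network, target, source_ir, f"{name}_prod",
        trt.ElementWiseOperation.PROD, trunc_div_output, other,
    )
    return convert_binary_elementwise(
        network, target, source_ir, f"{name}_sub",
        trt.ElementWiseOperation.SUB, input, prod,
    )
```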
Lowering type
There are some converters which can be decomposed into sub-operations and need not have a separate converter registration. Such converters can be implemented via lowering passes.

Example: addmm
The decompositions are registered via register_decomposition in py/torch_tensorrt/dynamo/backend/lowering/_decompositions.py. We define addmm_replacement and replace it with the torch ops, which will have their corresponding converters called.

```python
@register_decomposition(torch.ops.aten.addmm, registry=DECOMPOSITIONS)
def addmm_replacement(
    input_: torch.Tensor, mat1: torch.Tensor, mat2: torch.Tensor, *, beta=1, alpha=1
) -> torch.Tensor:
    return torch.add(
        torch.mul(input_, beta), torch.mul(torch.matmul(mat1, mat2), alpha)
    )
```

Note that some dynamo decompositions already exist in the torch directory, in which case they should be used. To do so, enable the decompositions in py/torch_tensorrt/dynamo/lowering/_decomposition_groups.py in torch_enabled_decompositions. Similarly, you can choose to disable any in torch_disabled_decompositions. Please note that the ones already defined in the lowering will take precedence over torch lowering ops.
Tests¶
Dynamo testing:
Dynamo tests are present for the lowering ops in tests/py/dynamo/lowering/test_decompositions.py. The above converters will soon be ported to dynamo tests. The tests:

- Compare the results for fx.symbolic_trace and torch_tensorrt.dynamo.compile.
- Test for the expected_op and the unexpected_op.
  - expected_op: The operations the original operation is lowered to. eg: mul and add for addmm.
  - unexpected_op: The original operation. eg: addmm for addmm.

Note: specify disable_passes=True for cases where you do not want lowering passes (which should be the default when testing converters).

The tests should fail if either of the above two conditions fails.
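A hedged sketch of such a lowering test; lower_and_collect_ops is a hypothetical helper standing in for the repository's actual test utilities, which trace the module, run the Torch-TensorRT lowering passes, and return the set of ATen op targets remaining in the graph:

```python
import torch

class AddMM(torch.nn.Module):
    def forward(self, input_, mat1, mat2):
        return torch.ops.aten.addmm.default(input_, mat1, mat2)

def test_addmm_lowering():
    inputs = (torch.rand(2, 2), torch.rand(2, 3), torch.rand(3, 2))
    expected_ops = {torch.ops.aten.add.Tensor, torch.ops.aten.mul.Tensor}
    unexpected_ops = {torch.ops.aten.addmm.default}

    # Hypothetical helper: traces AddMM(), applies the lowering passes,
    # and returns the set of op targets in the lowered graph.
    seen_ops = lower_and_collect_ops(AddMM(), inputs)

    assert expected_ops.issubset(seen_ops), "addmm was not lowered to add/mul"
    assert not (unexpected_ops & seen_ops), "addmm survived lowering"
```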