Dynamo Converters¶
The dynamo converter library in Torch-TensorRT is located in TensorRT/py/torch_tensorrt/dynamo/conversion.
Steps¶
Operation Set¶
The converters in dynamo are produced by aten_trace and fall under aten_ops_converters (FX earlier had acc_ops_converters, aten_ops_converters, or nn_ops_converters, depending on the trace through which it was produced). The converters are registered for dynamo using the dynamo_tensorrt_converter decorator. The decorated function has the arguments network, target, args, kwargs, name, which are common across all operator schemas. These functions are mapped in the aten converter registry dictionary (at present a compilation of FX and dynamo converters; FX will be deprecated soon), with the function target name as the key.
aten_trace is produced by torch_tensorrt.dynamo.trace(..) for the export path and by torch_tensorrt.compile(ir="dynamo") for the compile path. The export path makes use of aten_tracer, whereas the alternate trace in compile is produced by the AOT Autograd library. Both simplify the torch operators to a reduced set of ATen operations.
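For orientation, here is a minimal sketch of the two entry points. The exact keyword layout of torch_tensorrt.dynamo.trace is an assumption here; check the current API signature:

```python
import torch
import torch_tensorrt

class TinyModel(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.leaky_relu(x, 0.1)

model = TinyModel().eval().cuda()
inputs = [torch.randn(1, 16).cuda()]

# Export path: produce the ATen-level trace directly
exported = torch_tensorrt.dynamo.trace(model, inputs)

# Compile path: the trace is produced internally via AOT Autograd
trt_model = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
```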
As mentioned above, if you would like to add a new converter, its implementation will be included in TensorRT/py/torch_tensorrt/dynamo/conversion/impl. Although there is a corresponding implementation of the converters in the common implementation library at TensorRT/py/torch_tensorrt/fx/impl for FX converters, this documentation focuses on the implementation of the aten_ops converters in dynamo.
Converter implementation¶
In this section, we illustrate the steps required to write a converter. We divide them according to activation, operator, lowering pass, or evaluator implementation. Each is detailed with the help of an example.
Registration
The converter needs to be registered with the appropriate op code in dynamo_tensorrt_converter.
Activation type
Example: leaky_relu
aten_ops_converters: Dynamo_converters
Define in py/torch_tensorrt/dynamo/conversion/aten_ops_converters. One needs to register the opcode generated in the trace with the dynamo_tensorrt_converter decorator. The op code to be used for the registration, i.e., the converter registry key, in this case is torch.ops.aten.leaky_relu.default.

```python
@dynamo_tensorrt_converter(torch.ops.aten.leaky_relu.default)
def aten_ops_leaky_relu(
    network: TRTNetwork,
    target: Target,
    args: Tuple[Argument, ...],
    kwargs: Dict[str, Argument],
    name: str,
) -> Union[TRTTensor, Sequence[TRTTensor]]:
    return activation.leaky_relu(
        network, target, SourceIR.ATEN, name, args[0], args[1]
    )
```
The tensorrt_converter (used for FX registration) and dynamo_tensorrt_converter are similar decorator functions with some differences.

- Both register the converters in registries (Python dictionaries): CONVERTERS and DYNAMO_CONVERTERS respectively. These two dictionaries are concatenated to form the overall converter registry.
- The dictionary is keyed on the OpOverload, which is discussed in more detail below with examples.
- Both return the decorated converter implementation.
- The CONVERTERS registry directly registers the decorated converter_implementation function, while DYNAMO_CONVERTERS takes additional arguments and registers a ConverterSupport object. The additional arguments, listed below, are shown together in the sketch after this list:
  - key: Node target for which the converter is implemented (for example, torch.ops.aten.leaky_relu.default)
  - enabled: Whether the converter should be enabled/cached or not
  - capability_validator: Function which evaluates whether a node is valid for conversion by the decorated converter. It defaults to None, implying the capability check always passes. This means all nodes of the "key" kind are supported by this converter by default. See the embedding example for more details.
  - priority: The converter's level of priority relative to other converters with the same target
- The ConverterSupport object is a compilation of converter_implementation and capability_validator.
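Putting those arguments together, here is a hedged sketch of a registration that passes them explicitly. ConverterPriority as the name of the priority enum is an assumption, and the defaults shown may differ in the source:

```python
@dynamo_tensorrt_converter(
    torch.ops.aten.leaky_relu.default,    # key: the OpOverload this converter handles
    enabled=True,                         # enabled: include the converter in the registry
    capability_validator=None,            # capability_validator: None means always valid
    priority=ConverterPriority.STANDARD,  # priority: relative to converters with the same key
)
def aten_ops_leaky_relu(network, target, args, kwargs, name):
    return activation.leaky_relu(
        network, target, SourceIR.ATEN, name, args[0], args[1]
    )
```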
The function decorated by tensorrt_converter and dynamo_tensorrt_converter has the following arguments, which are automatically generated by the trace functions mentioned above:
- network: The TensorRT network being built (the TRTNetwork in the signature above), to which the converter adds layers. The node itself arrives as a call_module or call_function with the target as its key.
- target: Target key in the call_module or call_function above. eg: torch.ops.aten.leaky_relu.default. Note that torch.ops.aten.leaky_relu is the OpOverloadPacket while torch.ops.aten.leaky_relu.default is the OpOverload.
- args: The arguments passed in the call_module or call_function above
- kwargs: The kwargs passed in the call_module or call_function above
- name: String containing the name of the target
As a user writing new converters, one just needs to take care that the appropriate arguments are extracted from the generated trace and passed to the implementation function in the implementation library, here activation.leaky_relu (which we discuss below in detail).

Operation type
Example: fmod
It follows the same steps as the above converter. In this case the opcode is torch.ops.aten.fmod.Scalar or torch.ops.aten.fmod.Tensor, so both opcodes are registered in py/torch_tensorrt/dynamo/conversion/aten_ops_converters. Note that torch.ops.aten.fmod is the OpOverloadPacket while the registry is keyed on torch.ops.aten.fmod.Scalar or torch.ops.aten.fmod.Tensor, each of which is an OpOverload.
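A hedged sketch of registering one converter for both overloads by stacking the decorator; the impl.elementwise.fmod call shown is illustrative of the pattern rather than a guaranteed signature:

```python
@dynamo_tensorrt_converter(torch.ops.aten.fmod.Scalar)
@dynamo_tensorrt_converter(torch.ops.aten.fmod.Tensor)
def aten_ops_fmod(
    network: TRTNetwork,
    target: Target,
    args: Tuple[Argument, ...],
    kwargs: Dict[str, Argument],
    name: str,
) -> Union[TRTTensor, Sequence[TRTTensor]]:
    # Both overloads share a single implementation in the impl library
    return impl.elementwise.fmod(
        network, target, SourceIR.ATEN, name, args[0], args[1]
    )
```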
Example: embedding
It follows the same steps as the above converter. In this case the opcode is torch.ops.aten.embedding.default. Some converters have special cases to be accounted for; in those cases, one should use capability_validators to register the converter with @dynamo_tensorrt_converter. We illustrate this through torch.ops.aten.embedding.default. It has the parameters scale_grad_by_freq and sparse, which are not currently supported by the implementation. In such cases we can write a validator embedding_param_validator, which reports that the converter is unsupported when those parameters are set, and register the converter with it, as sketched below.
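A hedged sketch of this validator-based registration. The argument positions follow the aten.embedding schema (weight, indices, padding_idx, scale_grad_by_freq, sparse); the impl call and exact helper layout are illustrative:

```python
from torch.fx.node import Node

def embedding_param_validator(embedding_node: Node) -> bool:
    # Reject nodes that request the unsupported options
    # (kwargs handling omitted for brevity).
    scale_grad_by_freq = (
        embedding_node.args[3] if len(embedding_node.args) > 3 else False
    )
    sparse = embedding_node.args[4] if len(embedding_node.args) > 4 else False
    return not (scale_grad_by_freq or sparse)

@dynamo_tensorrt_converter(
    torch.ops.aten.embedding.default,
    capability_validator=embedding_param_validator,
)
def aten_ops_embedding(
    network: TRTNetwork,
    target: Target,
    args: Tuple[Argument, ...],
    kwargs: Dict[str, Argument],
    name: str,
) -> Union[TRTTensor, Sequence[TRTTensor]]:
    # For aten.embedding, args[0] is the weight tensor and args[1] the indices
    return impl.embedding.embedding(
        network, target, SourceIR.ATEN, name, args[1], args[0]
    )
```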
So if there is a new converter for which certain special cases are not to be supported, they can be specified in the capability_validator.

Evaluator type
Example: operator.getitem
Evaluators are categorized as such because they do not make any modification to the graph. They are implemented in py/torch_tensorrt/dynamo/conversion/op_evaluators.py, with the corresponding capability_validator. The opcode is operator.getitem.
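A hedged sketch of what such an evaluator looks like: rather than adding TensorRT layers, it simply executes the Python operation on the already-converted arguments (the capability_validator is omitted here for brevity):

```python
import operator

@dynamo_tensorrt_converter(operator.getitem)
def generic_evaluator(
    network: TRTNetwork,
    target: Target,
    args: Tuple[Argument, ...],
    kwargs: Dict[str, Argument],
    name: str,
) -> Union[TRTTensor, Sequence[TRTTensor]]:
    # Evaluators run the target directly instead of building network layers
    return target(*args, **kwargs)
```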
Implementation Library
The dynamo converters are located in py/torch_tensorrt/dynamo/conversion/impl.
Activation
Example: leaky_relu
The implementation is to be placed in py/torch_tensorrt/dynamo/conversion/impl/activation.py. This is where all the activation functions are defined and implemented.

```python
def leaky_relu(
    network: TRTNetwork,
    target: Target,
    source_ir: Optional[SourceIR],
    name: str,
    input_val: TRTTensor,
    alpha: Optional[Any],
):
    # implementation
```

The implementation function has the following arguments.
- network: network passed from the decorated function registration
- target: target passed from the decorated function registration
- source_ir: Enum attribute. The SourceIR enum is defined in py/torch_tensorrt/dynamo/conversion/impl/converter_utils
- name: name passed from the decorated function registration
- input_val: Appropriate argument extracted from the decorated function's args or kwargs
- alpha: Appropriate argument extracted from the decorated function's args or kwargs. If not None, it will set the alpha attribute of the created TensorRT activation layer. eg: used in leaky_relu, elu, hardtanh
- beta: Appropriate argument extracted from the decorated function's args or kwargs. If not None, it will set the beta attribute of the created TensorRT activation layer. eg: used in hardtanh
- dyn_range_fn: An optional function which takes the dynamic range of a TensorRT Tensor and returns the output dynamic range
The implementation functions call the convert_activation function in py/torch_tensorrt/dynamo/conversion/impl/activation.py. This function will add the appropriate activation layer via network.add_activation, as the sketch below illustrates.
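A hedged sketch of such an implementation, assuming convert_activation accepts the operation type plus the optional alpha and dyn_range_fn arguments described above (exact signatures may differ in the source):

```python
import tensorrt as trt

def leaky_relu(
    network: TRTNetwork,
    target: Target,
    source_ir: Optional[SourceIR],
    name: str,
    input_val: TRTTensor,
    alpha: Optional[Any],
):
    # Mirror the activation's math so quantization dynamic ranges
    # can be propagated through the layer.
    def leaky_relu_dyn_range_fn(dyn_range):
        return (
            max(0, dyn_range[0]) + alpha * min(0, dyn_range[0]),
            max(0, dyn_range[1]) + alpha * min(0, dyn_range[1]),
        )

    return convert_activation(
        network,
        target,
        source_ir,
        name,
        trt.ActivationType.LEAKY_RELU,  # maps to network.add_activation
        input_val,
        alpha=alpha,
        dyn_range_fn=leaky_relu_dyn_range_fn,
    )
```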
Operator
For dynamo, the implementation is to be placed in py/torch_tensorrt/dynamo/conversion/impl/elementwise/ops.py. This is where all the elementwise functions are defined and implemented. For a new operator, one should identify the category to which it belongs. Following are some examples:
- Elementwise operators like fmod are present in py/torch_tensorrt/dynamo/conversion/impl/elementwise. The py/torch_tensorrt/dynamo/conversion/impl/elementwise/base module contains base functions for elementwise operators.
- Unary operators like sqrt will be present in py/torch_tensorrt/dynamo/conversion/impl/unary. The py/torch_tensorrt/dynamo/conversion/impl/unary/base module contains base functions for unary operators.
- Normalization operators like softmax, layer_norm, and batch_norm will be present in py/torch_tensorrt/dynamo/conversion/impl/normalization. Since there are no base operations common to all of them, there is no base file; one can choose to implement one if common functions emerge across the normalization operations.
- Individual operators like slice, select, where, and embedding will be present in py/torch_tensorrt/dynamo/conversion/impl/*.py. They have individual operator implementations with the same API structure as above but with different individual arguments.

Please note that the above operators may have common helper functions, which should be placed in py/torch_tensorrt/dynamo/conversion/impl/converter_utils.py. The sketch after this list illustrates the elementwise pattern.
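A hedged sketch of an elementwise implementation for fmod, assuming a convert_binary_elementwise helper in the elementwise base module and a trunc_div helper alongside it (the helper names and argument order are assumptions):

```python
import tensorrt as trt

def fmod(
    network: TRTNetwork,
    target: Target,
    source_ir: Optional[SourceIR],
    name: str,
    input: TRTTensor,
    other: TRTTensor,
) -> TRTTensor:
    # fmod(a, b) = a - trunc_div(a, b) * b, composed from elementwise layers
    trunc_div_output = trunc_div(
        network, target, source_ir, f"{name}_trunc_div", input, other
    )
    prod = convert_binary_elementwise(
        network, target, source_ir, f"{name}_prod",
        trt.ElementWiseOperation.PROD, trunc_div_output, other,
    )
    return convert_binary_elementwise(
        network, target, source_ir, f"{name}_sub",
        trt.ElementWiseOperation.SUB, input, prod,
    )
```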
Lowering type
There are some converters which can be decomposed into sub-operations and need not have a separate converter registration. Such converters can be implemented via lowering passes.

Example: addmm
The decompositions are registered via register_decomposition in py/torch_tensorrt/dynamo/backend/lowering/_decompositions.py. We define addmm_replacement and replace it with the torch ops, which will have their corresponding converters called.

```python
@register_decomposition(torch.ops.aten.addmm, registry=DECOMPOSITIONS)
def addmm_replacement(
    input_: torch.Tensor, mat1: torch.Tensor, mat2: torch.Tensor, *, beta=1, alpha=1
) -> torch.Tensor:
    return torch.add(
        torch.mul(input_, beta), torch.mul(torch.matmul(mat1, mat2), alpha)
    )
```

Note that some dynamo decompositions already exist in the torch directory, in which case they should be used. To do so, enable the decompositions in py/torch_tensorrt/dynamo/lowering/_decomposition_groups.py in torch_enabled_decompositions. Similarly, you can choose to disable any in torch_disabled_decompositions. Please note that the ones already defined in the lowering will take precedence over torch lowering ops.
Tests¶
Dynamo testing:
Dynamo tests are present for the lowering ops in tests/py/dynamo/lowering/test_decompositions.py. The above converters will soon be ported to dynamo tests. The tests:

- Compare the results for fx.symbolic_trace and torch_tensorrt.dynamo.compile.
- Test for the expected_op and the unexpected_op.
  - expected_op: The operations the original operation is lowered to. eg: mul and add for addmm.
  - unexpected_op: The original operation. eg: addmm for addmm.

Note: specify disable_passes=True for cases where you do not want lowering passes (which should be the default when testing converters).

The tests should fail if either of the above two conditions fails.
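A hedged sketch of such a lowering test; lower_and_collect_ops is a hypothetical helper standing in for the repository's actual test utilities, which trace the module, run the Torch-TensorRT lowering passes, and return the set of ATen op targets remaining in the graph:

```python
import torch

class AddMM(torch.nn.Module):
    def forward(self, input_, mat1, mat2):
        return torch.ops.aten.addmm.default(input_, mat1, mat2)

def test_addmm_lowering():
    inputs = (torch.rand(2, 2), torch.rand(2, 3), torch.rand(3, 2))
    expected_ops = {torch.ops.aten.add.Tensor, torch.ops.aten.mul.Tensor}
    unexpected_ops = {torch.ops.aten.addmm.default}

    # Hypothetical helper: traces AddMM(), applies the lowering passes,
    # and returns the set of op targets in the lowered graph.
    seen_ops = lower_and_collect_ops(AddMM(), inputs)

    assert expected_ops.issubset(seen_ops), "addmm was not lowered to add/mul"
    assert not (unexpected_ops & seen_ops), "addmm survived lowering"
```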