
Extending the ONNX Exporter Operator Support

Created On: Oct 06, 2023 | Last Updated: Mar 05, 2025 | Last Verified: Nov 05, 2024

Authors: Ti-Tai Wang, Justin Chu

Overview

This tutorial describes how to create an ONNX implementation for an unsupported PyTorch operator, or replace an existing implementation with your own.

We will cover three scenarios that require extending the ONNX exporter’s operator support:

Overriding the implementation of an existing PyTorch operator
Using custom ONNX operators
Supporting a custom PyTorch operator

What you will learn:

How to override or add support for PyTorch operators in ONNX.
How to integrate custom ONNX operators for specialized runtimes.
How to implement and translate custom PyTorch operators to ONNX.

Prerequisites

Before starting this tutorial, make sure you have the following:

torch >= 2.6
The target PyTorch operator
Familiarity with ONNX Script (complete the ONNX Script tutorial before proceeding)
An implementation of the operator written in ONNX Script

Overriding the implementation of an existing PyTorch operator

Although the ONNX exporter team does its best to support all PyTorch operators, some might not be supported yet. In this section, we demonstrate how to add an unsupported PyTorch operator to the ONNX Registry.

Note

The steps to implement an unsupported PyTorch operator are the same as those for replacing the implementation of an existing PyTorch operator with a custom one. Because we don’t have an unsupported PyTorch operator to use in this tutorial, we take advantage of this and replace the implementation of torch.ops.aten.add.Tensor with a custom one, exactly as we would if the ONNX exporter did not implement the operator.
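
Why overriding and adding follow the same steps can be pictured with a toy lookup table. The sketch below is only an analogy in plain Python, not the exporter's internals: the string keys, `BUILTIN_TABLE`, `resolve`, and the three translation functions are all hypothetical names made up for illustration.

```python
from typing import Callable, Dict


# Hypothetical stand-in for a built-in translation. The real table is keyed by
# operator objects such as torch.ops.aten.add.Tensor, not by strings.
def builtin_add(a, b):
    return ("Add", a, b)


def custom_add(a, b):
    # Swapped inputs make the custom version easy to recognise.
    return ("Add", b, a)


def custom_new_op(a):
    return ("Round", a)


BUILTIN_TABLE: Dict[str, Callable] = {"aten::add.Tensor": builtin_add}


def resolve(op_name: str, custom_table: Dict[str, Callable]) -> Callable:
    """Return the custom translation if one is given, else the built-in one."""
    merged = {**BUILTIN_TABLE, **custom_table}  # custom entries take precedence
    if op_name not in merged:
        raise KeyError(f"No translation registered for {op_name}")
    return merged[op_name]


custom = {"aten::add.Tensor": custom_add, "mylibrary::new_op": custom_new_op}
print(resolve("aten::add.Tensor", custom)("x", "y"))
print(resolve("mylibrary::new_op", custom)("x"))
```

In this toy model, an entry for an existing key replaces the built-in translation and an entry for a new key adds one, using the same dictionary and the same call.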

When a model cannot be exported to ONNX due to an unsupported operator, the ONNX exporter will show an error message similar to:

No decompositions registered for [...]

The error message names the unsupported PyTorch operator; for this tutorial we take it to be torch.ops.aten.add.Tensor. The operator is of type <class 'torch._ops.OpOverload'>, and it is the target under which we register our custom implementation.

import torch
import onnxscript

# Opset 18 is the standard supported version as of PyTorch 2.6
from onnxscript import opset18 as op


# Create a model that uses the operator torch.ops.aten.add.Tensor
class Model(torch.nn.Module):
    def forward(self, input_x, input_y):
        return torch.ops.aten.add.Tensor(input_x, input_y)


# NOTE: The function signature (including parameter names) must match the signature of the unsupported PyTorch operator.
# https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/native_functions.yaml
# All attributes must be annotated with type hints.
def custom_aten_add(self, other, alpha: float = 1.0):
    if alpha != 1.0:
        alpha = op.CastLike(alpha, other)
        other = op.Mul(other, alpha)
    # To distinguish the custom implementation from the builtin one, we switch the order of the inputs
    return op.Add(other, self)


x = torch.tensor([1.0])
y = torch.tensor([2.0])

# Then we provide the custom implementation to the ONNX exporter as a ``custom_translation_table``.
onnx_program = torch.onnx.export(
    Model().eval(),
    (x, y),
    dynamo=True,
    custom_translation_table={
        torch.ops.aten.add.Tensor: custom_aten_add,
    },
)
# Optimize the ONNX graph to remove redundant nodes
onnx_program.optimize()

/usr/local/lib/python3.10/dist-packages/onnxscript/converter.py:823: FutureWarning:

'onnxscript.values.Op.param_schemas' is deprecated in version 0.1 and will be removed in the future. Please use '.op_signature' instead.

/usr/local/lib/python3.10/dist-packages/onnxscript/converter.py:823: FutureWarning:

'onnxscript.values.OnnxFunction.param_schemas' is deprecated in version 0.1 and will be removed in the future. Please use '.op_signature' instead.

[torch.onnx] Obtain model graph for `Model()` with `torch.export.export(..., strict=False)`...
[torch.onnx] Obtain model graph for `Model()` with `torch.export.export(..., strict=False)`... ✅
[torch.onnx] Run decomposition...
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
[torch.onnx] Translate the graph into ONNX... ✅
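
The NOTE in the code above asks for parameter names that match the operator schema and for type hints on every attribute. A small standard-library check can catch mistakes before exporting. This is a sketch only: `check_signature`, `EXPECTED_PARAMS`, `ATTRIBUTE_PARAMS`, and `mismatched` are hypothetical names, and the expected names are copied by hand from the `add.Tensor(Tensor self, Tensor other, *, Scalar alpha=1)` entry in native_functions.yaml.

```python
import inspect


def custom_aten_add(self, other, alpha: float = 1.0):
    """Signature-only stand-in for the translation function shown earlier."""


def mismatched(x, other, alpha=1.0):
    """Deliberately wrong: first parameter renamed, attribute left unannotated."""


# Assumption: names copied by hand from the add.Tensor schema.
EXPECTED_PARAMS = ("self", "other", "alpha")
ATTRIBUTE_PARAMS = ("alpha",)


def check_signature(func, expected_params, attribute_params):
    """Return a list of problems; an empty list means the signature looks fine."""
    params = inspect.signature(func).parameters
    problems = []
    if tuple(params) != tuple(expected_params):
        problems.append(
            f"parameter names {tuple(params)} != {tuple(expected_params)}"
        )
    for name in attribute_params:
        param = params.get(name)
        if param is not None and param.annotation is inspect.Parameter.empty:
            problems.append(f"attribute '{name}' has no type hint")
    return problems


print(check_signature(custom_aten_add, EXPECTED_PARAMS, ATTRIBUTE_PARAMS))
print(check_signature(mismatched, EXPECTED_PARAMS, ATTRIBUTE_PARAMS))
```

The check only compares names and annotations. It does not verify that the body of the function is a valid ONNX Script translation.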

Now let’s inspect the model and verify that it uses the custom implementation.

print(onnx_program.model)

<
    ir_version=10,
    opset_imports={'pkg.onnxscript.torch_lib.common': 1, '': 18},
    producer_name='pytorch',
    producer_version='2.6.0+cu124',
    domain=None,
    model_version=None,
>
graph(
    name=main_graph,
    inputs=(
        %"input_x"<FLOAT,[1]>,
        %"input_y"<FLOAT,[1]>
    ),
    outputs=(
        %"add"<FLOAT,[1]>
    ),
) {
    0 |  # node_Add_0
         %"add"<FLOAT,[1]> ⬅️ ::Add(%"input_y", %"input_x")
    return %"add"<FLOAT,[1]>
}

The translation uses our custom implementation: in node node_Add_0, input_y now comes first and input_x comes second.

Calling the torch.onnx.ONNXProgram directly on the input tensors runs the model with ONNX Runtime, so we can use it to verify the results.

result = onnx_program(x, y)[0]
torch.testing.assert_close(result, torch.tensor([3.0]))
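
Swapping the inputs is safe here because addition is commutative, so the custom translation still computes `self + alpha * other`. The plain-Python sketch below mirrors that reasoning on lists of floats. `reference_add` and `swapped_add` are hypothetical helpers written for this illustration and do not call PyTorch or ONNX.

```python
def reference_add(self_vals, other_vals, alpha=1.0):
    """Element-wise self + alpha * other on plain Python lists."""
    return [s + alpha * o for s, o in zip(self_vals, other_vals)]


def swapped_add(self_vals, other_vals, alpha=1.0):
    """Mirror of custom_aten_add: scale `other` if needed, then add other + self."""
    if alpha != 1.0:
        other_vals = [o * alpha for o in other_vals]
    return [o + s for s, o in zip(self_vals, other_vals)]


x_vals = [1.0]
y_vals = [2.0]

# Same inputs as the tutorial: both orderings give the same sum.
print(reference_add(x_vals, y_vals), swapped_add(x_vals, y_vals))

# A non-default alpha exercises the scaling branch as well.
print(reference_add(x_vals, y_vals, alpha=0.5), swapped_add(x_vals, y_vals, alpha=0.5))
```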

Using custom ONNX operators

In this case, the model uses standard PyTorch operators, but the runtime (such as Microsoft’s ONNX Runtime) provides its own kernel for one of them. Translating the PyTorch operator to the runtime’s custom ONNX operator effectively replaces the existing implementation.

In the following example, we use the com.microsoft.Gelu operator provided by ONNX Runtime, which is not the same as the Gelu operator in the ONNX specification.

class GeluModel(torch.nn.Module):
    def forward(self, input_x):
        return torch.ops.aten.gelu(input_x)


# Create a namespace for the custom operator using ONNX Script
# ``com.microsoft`` is an official ONNX Runtime namespace
microsoft_op = onnxscript.values.Opset(domain="com.microsoft", version=1)

# NOTE: The function signature (including parameter names) must match the signature of the PyTorch operator being translated.
# https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/native_functions.yaml
# NOTE: All attributes must be annotated with type hints.
# The function must be scripted using the ``@onnxscript.script()`` decorator when
# using operators from custom domains. This may be improved in future versions.
from onnxscript import FLOAT


@onnxscript.script(microsoft_op)
def custom_aten_gelu(self: FLOAT, approximate: str = "none") -> FLOAT:
    return microsoft_op.Gelu(self)


onnx_program = torch.onnx.export(
    GeluModel().eval(),
    (x,),
    dynamo=True,
    custom_translation_table={
        torch.ops.aten.gelu.default: custom_aten_gelu,
    },
)

# Optimize the ONNX graph to remove redundant nodes
onnx_program.optimize()

'Gelu' is not a known op in 'com.microsoft'
[torch.onnx] Obtain model graph for `GeluModel()` with `torch.export.export(..., strict=False)`...
[torch.onnx] Obtain model graph for `GeluModel()` with `torch.export.export(..., strict=False)`... ✅
[torch.onnx] Run decomposition...
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
[torch.onnx] Translate the graph into ONNX... ✅

Let’s inspect the model and verify the model uses op_type Gelu from namespace com.microsoft.

print(onnx_program.model)

<
    ir_version=10,
    opset_imports={'pkg.onnxscript.torch_lib.common': 1, 'com.microsoft': 1, '': 18},
    producer_name='pytorch',
    producer_version='2.6.0+cu124',
    domain=None,
    model_version=None,
>
graph(
    name=main_graph,
    inputs=(
        %"input_x"<FLOAT,[1]>
    ),
    outputs=(
        %"gelu"<FLOAT,[1]>
    ),
) {
    0 |  # n0
         %"gelu"<FLOAT,[1]> ⬅️ com.microsoft::Gelu(%"input_x")
    return %"gelu"<FLOAT,[1]>
}

Similar to the previous example, we can use ONNX Runtime to run the model and verify the results.

result = onnx_program(x)[0]
torch.testing.assert_close(result, torch.ops.aten.gelu(x))
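
The check above compares against `torch.ops.aten.gelu(x)` with its default `approximate="none"`, which is the erf-based form of GELU. The sketch below writes out that form and the tanh approximation in plain Python to show that the two are close but not identical. `gelu_exact` and `gelu_tanh` are hypothetical helper names, and the sketch says nothing about how the ONNX Runtime kernel is implemented.

```python
import math


def gelu_exact(x: float) -> float:
    """GELU as x * Phi(x), written with the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """The tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


for value in (-1.0, 0.0, 1.0):
    exact = gelu_exact(value)
    approx = gelu_tanh(value)
    print(f"x={value:+.1f} exact={exact:.6f} tanh={approx:.6f} diff={abs(exact - approx):.2e}")
```

Note that `custom_aten_gelu` accepts `approximate` but does not pass it on to `microsoft_op.Gelu`, so a model that requests the tanh variant would be translated to the same node. Whether that difference is acceptable depends on the tolerance your application needs.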

Supporting a custom PyTorch operator

In this case, the operator is implemented by the user and registered with PyTorch.

In the following example, we would like to use a custom operator that takes one tensor input, and returns one output. The operator adds the input to itself, and returns the rounded result.

Firstly, we assume the custom operator is implemented and registered with torch.library.custom_op(). You can refer to Creating new custom ops in Python for a detailed guide on how to create custom operators.

# Define and use the operator in PyTorch
@torch.library.custom_op("mylibrary::add_and_round_op", mutates_args=())
def add_and_round_op(input: torch.Tensor) -> torch.Tensor:
    return torch.round(input + input)


@add_and_round_op.register_fake
def _add_and_round_op_fake(tensor_x):
    return torch.empty_like(tensor_x)


class AddAndRoundModel(torch.nn.Module):
    def forward(self, input):
        return add_and_round_op(input)


# Implement the custom operator in ONNX using ONNX Script
def onnx_add_and_round(input):
    return op.Round(op.Add(input, input))


onnx_program = torch.onnx.export(
    AddAndRoundModel().eval(),
    (x,),
    dynamo=True,
    custom_translation_table={
        torch.ops.mylibrary.add_and_round_op.default: onnx_add_and_round,
    },
)

# Optimize the ONNX graph to remove redundant nodes
onnx_program.optimize()
print(onnx_program)

[torch.onnx] Obtain model graph for `AddAndRoundModel()` with `torch.export.export(..., strict=False)`...
[torch.onnx] Obtain model graph for `AddAndRoundModel()` with `torch.export.export(..., strict=False)`... ✅
[torch.onnx] Run decomposition...
[torch.onnx] Run decomposition... ✅
[torch.onnx] Translate the graph into ONNX...
[torch.onnx] Translate the graph into ONNX... ✅
ONNXProgram(
    model=
        <
            ir_version=10,
            opset_imports={'pkg.onnxscript.torch_lib.common': 1, '': 18},
            producer_name='pytorch',
            producer_version='2.6.0+cu124',
            domain=None,
            model_version=None,
        >
        graph(
            name=main_graph,
            inputs=(
                %"input"<FLOAT,[1]>
            ),
            outputs=(
                %"add_and_round_op"<FLOAT,[1]>
            ),
        ) {
            0 |  # node_Add_0
                 %"val_0"<FLOAT,[1]> ⬅️ ::Add(%"input", %"input")
            1 |  # node_Round_1
                 %"add_and_round_op"<FLOAT,[1]> ⬅️ ::Round(%"val_0")
            return %"add_and_round_op"<FLOAT,[1]>
        }


    ,
    exported_program=
        ExportedProgram:
            class GraphModule(torch.nn.Module):
                def forward(self, input: "f32[1]"):
                    input_1 = input

                     # File: /var/lib/workspace/beginner_source/onnx/onnx_registry_tutorial.py:215 in forward, code: return add_and_round_op(input)
                    add_and_round_op: "f32[1]" = torch.ops.mylibrary.add_and_round_op.default(input_1);  input_1 = None
                    return (add_and_round_op,)

        Graph signature: ExportGraphSignature(input_specs=[InputSpec(kind=<InputKind.USER_INPUT: 1>, arg=TensorArgument(name='input'), target=None, persistent=None)], output_specs=[OutputSpec(kind=<OutputKind.USER_OUTPUT: 1>, arg=TensorArgument(name='add_and_round_op'), target=None)])
        Range constraints: {}

)

The translation uses our custom implementation to translate the torch.ops.mylibrary.add_and_round_op.default operator in the torch.export.ExportedProgram to the ONNX operators Add and Round.
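
The key used in `custom_translation_table` follows from the name given to `torch.library.custom_op()`: the part before `::` becomes the namespace under `torch.ops`, the part after it becomes the operator name, and an operator without named overloads is reached through `.default`. The string helper below spells out that mapping. `translation_table_key` is a hypothetical function for illustration; it only builds the dotted path as text and does not import PyTorch.

```python
def translation_table_key(qualified_name: str, overload: str = "default") -> str:
    """Map 'namespace::op_name' to the dotted path of the matching OpOverload."""
    namespace, separator, op_name = qualified_name.partition("::")
    if not separator or not namespace or not op_name:
        raise ValueError(f"expected 'namespace::op_name', got {qualified_name!r}")
    return f"torch.ops.{namespace}.{op_name}.{overload}"


print(translation_table_key("mylibrary::add_and_round_op"))
print(translation_table_key("aten::add", overload="Tensor"))
```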

Finally we verify the results.

result = onnx_program(x)[0]
torch.testing.assert_close(result, add_and_round_op(x))
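
With `x = torch.tensor([1.0])` the check above never hits a tie, so it does not exercise the rounding mode. Both `torch.round` and the ONNX `Round` operator round halves to the nearest even integer, and Python's built-in `round` does the same for floats, so a plain-Python reference can suggest extra test inputs. `reference_add_and_round` is a hypothetical helper, and the sketch assumes finite inputs.

```python
def reference_add_and_round(values):
    """Add each value to itself, then round halves to the nearest even integer."""
    return [float(round(v + v)) for v in values]


# 0.25 + 0.25 = 0.5, 0.75 + 0.75 = 1.5 and 1.25 + 1.25 = 2.5 are exact ties.
tie_inputs = [0.25, 0.75, 1.25, 1.0]
print(reference_add_and_round(tie_inputs))
```

Feeding inputs like these through both `add_and_round_op` and the exported model is one way to confirm that the two implementations agree on ties.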

Conclusion

Congratulations! In this tutorial, we explored the custom_translation_table option and discovered how to create custom implementations for unsupported or existing PyTorch operators using ONNX Script.

Finally, we leveraged ONNX Runtime to execute the model and compare the results with PyTorch, providing us with a comprehensive understanding of handling unsupported operators in the ONNX ecosystem.
