.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "beginner/onnx/onnx_registry_tutorial.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_beginner_onnx_onnx_registry_tutorial.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_beginner_onnx_onnx_registry_tutorial.py:

`Introduction to ONNX `_ ||
`Exporting a PyTorch model to ONNX `_ ||
**Extending the ONNX Registry**

Extending the ONNX Registry
===========================

**Author:** Ti-Tai Wang (titaiwang@microsoft.com)

.. GENERATED FROM PYTHON SOURCE LINES 16-78

Overview
--------

This tutorial is an introduction to the ONNX registry, which empowers users to implement new ONNX operators
or even replace existing operators with a new implementation.

During the model export to ONNX, the PyTorch model is lowered to an intermediate
representation composed of `ATen operators `_.
While ATen operators are maintained by the PyTorch core team, it is the responsibility of the ONNX exporter team
to independently implement each of these operators for ONNX through
`ONNX Script `_.
Users can also replace the behavior implemented by the ONNX exporter team with their own implementation
to fix bugs or improve performance for a specific ONNX runtime.

The ONNX Registry manages the mapping between PyTorch operators and their ONNX counterparts
and provides APIs to extend the registry.

In this tutorial, we will cover three scenarios that require extending the ONNX registry with custom operators:

* Unsupported ATen operators
* Custom operators with existing ONNX Runtime support
* Custom operators without ONNX Runtime support

Unsupported ATen operators
--------------------------

Although the ONNX exporter team makes its best effort to support all ATen operators, some of them
might not be supported yet. In this section, we will demonstrate how you can add
unsupported ATen operators to the ONNX Registry.

.. note::
    The steps to implement unsupported ATen operators are the same as those needed to replace the implementation
    of an existing ATen operator with a custom implementation.
    Because we don't actually have an unsupported ATen operator to use in this tutorial, we are going to leverage
    this and replace the implementation of ``aten::add.Tensor`` with a custom implementation the same way we would
    if the operator were not present in the ONNX Registry.

When a model cannot be exported to ONNX due to an unsupported operator, the ONNX exporter shows an error message
similar to:

.. code-block:: python

    RuntimeErrorWithDiagnostic: Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}.

The error message indicates that the fully qualified name of the unsupported ATen
operator is ``aten::add.Tensor``.
The fully qualified name of an operator is composed of the namespace, operator name, and overload,
following the format ``namespace::operator_name.overload``.

To add support for an unsupported ATen operator or to replace the implementation for an existing one, we need:

* The fully qualified name of the ATen operator (e.g. ``aten::add.Tensor``).
  This information is always present in the error message as shown above.
* The implementation of the operator using `ONNX Script `__.
  ONNX Script is a prerequisite for this tutorial. Please make sure you have read the
  `ONNX Script tutorial `_ before proceeding.
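
Because we will override ``aten::add.Tensor``, it is worth confirming up front that the operator
is indeed registered. The snippet below is a small sketch using the same
:meth:`is_registered_op` query that appears later in this tutorial:

.. code-block:: python

    import torch

    onnx_registry = torch.onnx.OnnxRegistry()
    # Returns True because the exporter ships an implementation
    # for ``aten::add.Tensor`` out of the box.
    print(onnx_registry.is_registered_op(
        namespace="aten", op_name="add", overload="Tensor"))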

Because ``aten::add.Tensor`` is already supported by the ONNX Registry, we will demonstrate how to replace it with a
custom implementation, but keep in mind that the same steps apply to supporting new unsupported ATen operators.
This is possible because the :class:`OnnxRegistry` allows users to override an operator registration.
We will override the registration of ``aten::add.Tensor`` with our custom implementation and verify it exists.

.. GENERATED FROM PYTHON SOURCE LINES 78-121

.. code-block:: default


    import torch
    import onnxruntime
    import onnxscript
    from onnxscript import opset18  # opset 18 is the latest (and only) supported version for now


    class Model(torch.nn.Module):
        def forward(self, input_x, input_y):
            return torch.ops.aten.add(input_x, input_y)  # generates an aten::add.Tensor node

    input_add_x = torch.randn(3, 4)
    input_add_y = torch.randn(3, 4)
    aten_add_model = Model()


    # Now we create an ONNX Script function that implements ``aten::add.Tensor``.
    # The function name (e.g. ``custom_aten_add``) is displayed in the ONNX graph, so we recommend using intuitive names.
    custom_aten = onnxscript.values.Opset(domain="custom.aten", version=1)

    # NOTE: The function signature must match the signature of the unsupported ATen operator.
    # https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/native_functions.yaml
    # NOTE: All attributes must be annotated with type hints.
    @onnxscript.script(custom_aten)
    def custom_aten_add(input_x, input_y, alpha: float = 1.0):
        alpha = opset18.CastLike(alpha, input_y)
        input_y = opset18.Mul(input_y, alpha)
        return opset18.Add(input_x, input_y)


    # Now we have everything we need to support unsupported ATen operators.
    # Let's register the ``custom_aten_add`` function with the ONNX registry, and export the model to ONNX again.
    onnx_registry = torch.onnx.OnnxRegistry()
    onnx_registry.register_op(
        namespace="aten", op_name="add", overload="Tensor", function=custom_aten_add
    )
    print(f"aten::add.Tensor is supported by ONNX registry: \
        {onnx_registry.is_registered_op(namespace='aten', op_name='add', overload='Tensor')}"
    )
    export_options = torch.onnx.ExportOptions(onnx_registry=onnx_registry)
    onnx_program = torch.onnx.dynamo_export(
        aten_add_model, input_add_x, input_add_y, export_options=export_options
    )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/onnx/_internal/exporter.py:136: UserWarning: torch.onnx.dynamo_export only implements opset version 18 for now. If you need to use a different opset version, please register them with register_custom_op.
    aten::add.Tensor is supported by ONNX registry: True
    /opt/conda/envs/py_3.10/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
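
Before asserting on specific nodes, you may find it helpful to dump what the exporter generated.
The snippet below is an optional inspection aid, not part of the original example:

.. code-block:: python

    # Print the single graph node (a call to the custom function) and the
    # operator types inside each generated function body.
    print(onnx_program.model_proto.graph.node[0].op_type)
    for function in onnx_program.model_proto.functions:
        print(function.domain, [node.op_type for node in function.node])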

.. GENERATED FROM PYTHON SOURCE LINES 122-126

Now let's inspect the model and verify the model has a ``custom_aten_add`` instead of ``aten::add.Tensor``.
The graph has one graph node for ``custom_aten_add``, and inside it there are four function nodes:
one for each of the three operators, and one for the constant attribute.

.. GENERATED FROM PYTHON SOURCE LINES 126-138

.. code-block:: default


    # graph node domain is the custom domain we registered
    assert onnx_program.model_proto.graph.node[0].domain == "custom.aten"
    assert len(onnx_program.model_proto.graph.node) == 1
    # graph node name is the function name
    assert onnx_program.model_proto.graph.node[0].op_type == "custom_aten_add"
    # function node domain is empty because we use standard ONNX operators
    assert onnx_program.model_proto.functions[0].node[3].domain == ""
    # function node name is the standard ONNX operator name
    assert onnx_program.model_proto.functions[0].node[3].op_type == "Add"

.. GENERATED FROM PYTHON SOURCE LINES 139-155

This is how ``custom_aten_add_model`` looks in the ONNX graph using Netron:

.. image:: /_static/img/onnx/custom_aten_add_model.png
    :width: 70%
    :align: center

Inside the ``custom_aten_add`` function, we can see the three ONNX nodes we used
in the function (``CastLike``, ``Add``, and ``Mul``), and one ``Constant`` attribute:

.. image:: /_static/img/onnx/custom_aten_add_function.png
    :width: 70%
    :align: center

This was all that we needed to register the new ATen operator into the ONNX Registry.
As an additional step, we can use ONNX Runtime to run the model, and compare the results with PyTorch.

.. GENERATED FROM PYTHON SOURCE LINES 155-178

.. code-block:: default


    # Use ONNX Runtime to run the model, and compare the results with PyTorch
    onnx_program.save("./custom_add_model.onnx")
    ort_session = onnxruntime.InferenceSession(
        "./custom_add_model.onnx", providers=['CPUExecutionProvider']
    )

    def to_numpy(tensor):
        return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

    onnx_input = onnx_program.adapt_torch_inputs_to_onnx(input_add_x, input_add_y)
    onnxruntime_input = {k.name: to_numpy(v) for k, v in zip(ort_session.get_inputs(), onnx_input)}
    onnxruntime_outputs = ort_session.run(None, onnxruntime_input)

    torch_outputs = aten_add_model(input_add_x, input_add_y)
    torch_outputs = onnx_program.adapt_torch_outputs_to_onnx(torch_outputs)

    assert len(torch_outputs) == len(onnxruntime_outputs)
    for torch_output, onnxruntime_output in zip(torch_outputs, onnxruntime_outputs):
        torch.testing.assert_close(torch_output, torch.tensor(onnxruntime_output))
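
As an optional extra check (not part of the original example), the saved file can be validated
with the ONNX checker before handing it to a runtime:

.. code-block:: python

    import onnx

    # Raises onnx.checker.ValidationError if the model is malformed.
    # Nodes in custom domains such as ``custom.aten`` are not deeply
    # checked, but the overall graph structure is.
    onnx.checker.check_model("./custom_add_model.onnx")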

.. GENERATED FROM PYTHON SOURCE LINES 179-193

Custom operators with existing ONNX Runtime support
---------------------------------------------------

In this case, the user creates a model with standard PyTorch operators, but the ONNX runtime
(e.g. Microsoft's ONNX Runtime) can provide a custom implementation for that kernel, effectively replacing the
existing implementation in the ONNX Registry. Another use case is when the user wants to use a custom implementation
of an existing ONNX operator to fix a bug or improve performance of a specific operator.
To achieve this, we only need to register the new implementation with the existing ATen fully qualified name.

In the following example, we use ``com.microsoft.Gelu`` from ONNX Runtime,
which is not the same as the ``Gelu`` from the ONNX spec. Thus, we register the Gelu with
the namespace ``com.microsoft`` and operator name ``Gelu``.

Before we begin, let's check whether ``aten::gelu.default`` is really supported by the ONNX registry.

.. GENERATED FROM PYTHON SOURCE LINES 193-199

.. code-block:: default


    onnx_registry = torch.onnx.OnnxRegistry()
    print(f"aten::gelu.default is supported by ONNX registry: \
        {onnx_registry.is_registered_op(namespace='aten', op_name='gelu', overload='default')}")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/onnx/_internal/exporter.py:136: UserWarning: torch.onnx.dynamo_export only implements opset version 18 for now. If you need to use a different opset version, please register them with register_custom_op.
    aten::gelu.default is supported by ONNX registry: True

.. GENERATED FROM PYTHON SOURCE LINES 200-202

In our example, the ``aten::gelu.default`` operator is supported by the ONNX registry,
so :meth:`onnx_registry.is_registered_op` returns ``True``.

.. GENERATED FROM PYTHON SOURCE LINES 202-233

.. code-block:: default


    class CustomGelu(torch.nn.Module):
        def forward(self, input_x):
            return torch.ops.aten.gelu(input_x)

    # com.microsoft is an official ONNX Runtime namespace
    custom_ort = onnxscript.values.Opset(domain="com.microsoft", version=1)

    # NOTE: The function signature must match the signature of the unsupported ATen operator.
    # https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/native_functions.yaml
    # NOTE: All attributes must be annotated with type hints.
    @onnxscript.script(custom_ort)
    def custom_aten_gelu(input_x, approximate: str = "none"):
        # We know com.microsoft::Gelu is supported by ONNX Runtime
        # It's just not part of the standard ONNX spec
        return custom_ort.Gelu(input_x)


    onnx_registry = torch.onnx.OnnxRegistry()
    onnx_registry.register_op(
        namespace="aten", op_name="gelu", overload="default", function=custom_aten_gelu)
    export_options = torch.onnx.ExportOptions(onnx_registry=onnx_registry)

    aten_gelu_model = CustomGelu()
    input_gelu_x = torch.randn(3, 3)

    onnx_program = torch.onnx.dynamo_export(
        aten_gelu_model, input_gelu_x, export_options=export_options
    )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/onnx/_internal/exporter.py:136: UserWarning: torch.onnx.dynamo_export only implements opset version 18 for now. If you need to use a different opset version, please register them with register_custom_op.

.. GENERATED FROM PYTHON SOURCE LINES 234-239

Let's inspect the model and verify the model uses :func:`custom_aten_gelu` instead of
``aten::gelu``. Note the graph has one graph node for
``custom_aten_gelu``, and inside ``custom_aten_gelu``, there is a function
node for ``Gelu`` with the namespace ``com.microsoft``.

.. GENERATED FROM PYTHON SOURCE LINES 239-250

.. code-block:: default


    # graph node domain is the custom domain we registered
    assert onnx_program.model_proto.graph.node[0].domain == "com.microsoft"
    # graph node name is the function name
    assert onnx_program.model_proto.graph.node[0].op_type == "custom_aten_gelu"
    # function node domain is the custom domain we registered
    assert onnx_program.model_proto.functions[0].node[0].domain == "com.microsoft"
    # function node name is the node name used in the function
    assert onnx_program.model_proto.functions[0].node[0].op_type == "Gelu"

.. GENERATED FROM PYTHON SOURCE LINES 251-265

The following diagram shows the ``custom_aten_gelu_model`` ONNX graph using Netron:

.. image:: /_static/img/onnx/custom_aten_gelu_model.png
    :width: 70%
    :align: center

Inside the ``custom_aten_gelu`` function, we can see the ``Gelu`` node from module
``com.microsoft`` used in the function:

.. image:: /_static/img/onnx/custom_aten_gelu_function.png
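
Since ``custom_aten_gelu`` simply forwards to ``com.microsoft::Gelu`` (and ignores the
``approximate`` attribute), it can be reassuring to first confirm the PyTorch side still behaves
like a standard GELU in eager mode. This is a small sanity check, not part of the original tutorial:

.. code-block:: python

    # Eager-mode sanity check: ``aten::gelu.default`` should match the
    # functional API for the default (non-approximate) variant.
    expected = torch.nn.functional.gelu(input_gelu_x)
    torch.testing.assert_close(aten_gelu_model(input_gelu_x), expected)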

That is all we need to do. As an additional step, we can use ONNX Runtime to run the model,
and compare the results with PyTorch.

.. GENERATED FROM PYTHON SOURCE LINES 265-285

.. code-block:: default


    onnx_program.save("./custom_gelu_model.onnx")
    ort_session = onnxruntime.InferenceSession(
        "./custom_gelu_model.onnx", providers=['CPUExecutionProvider']
    )

    def to_numpy(tensor):
        return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

    onnx_input = onnx_program.adapt_torch_inputs_to_onnx(input_gelu_x)
    onnxruntime_input = {k.name: to_numpy(v) for k, v in zip(ort_session.get_inputs(), onnx_input)}
    onnxruntime_outputs = ort_session.run(None, onnxruntime_input)

    torch_outputs = aten_gelu_model(input_gelu_x)
    torch_outputs = onnx_program.adapt_torch_outputs_to_onnx(torch_outputs)

    assert len(torch_outputs) == len(onnxruntime_outputs)
    for torch_output, onnxruntime_output in zip(torch_outputs, onnxruntime_outputs):
        torch.testing.assert_close(torch_output, torch.tensor(onnxruntime_output))

.. GENERATED FROM PYTHON SOURCE LINES 286-308

Custom operators without ONNX Runtime support
---------------------------------------------

In this case, the operator is not supported by any ONNX runtime, but we
would like to use it as a custom operator in the ONNX graph. Therefore, we need to implement the operator
in three places:

1. PyTorch FX graph
2. ONNX Registry
3. ONNX Runtime

In the following example, we would like to use a custom operator
that takes one tensor input, and returns one output. The operator adds
the input to itself, and returns the rounded result.

Custom Ops Registration in PyTorch FX Graph (Beta)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

First, we need to implement the operator in the PyTorch FX graph.
This can be done by using ``torch._custom_op``.

.. GENERATED FROM PYTHON SOURCE LINES 308-334

.. code-block:: default


    # NOTE: This is a beta feature in PyTorch, and is subject to change.
    from torch._custom_op import impl as custom_op


    @custom_op.custom_op("mylibrary::addandround_op")
    def addandround_op(tensor_x: torch.Tensor) -> torch.Tensor:
        ...


    @addandround_op.impl_abstract()
    def addandround_op_impl_abstract(tensor_x):
        return torch.empty_like(tensor_x)


    @addandround_op.impl("cpu")
    def addandround_op_impl(tensor_x):
        return torch.round(tensor_x + tensor_x)  # add x to itself, and round the result


    torch._dynamo.allow_in_graph(addandround_op)


    class CustomFoo(torch.nn.Module):
        def forward(self, tensor_x):
            return addandround_op(tensor_x)

    input_addandround_x = torch.randn(3)
    custom_addandround_model = CustomFoo()
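
Before wiring the operator into the ONNX registry, it is worth confirming that the eager CPU
implementation behaves as intended. This is a small sanity check, not part of the original tutorial:

.. code-block:: python

    # The CPU implementation should equal round(x + x) when called eagerly.
    expected = torch.round(input_addandround_x + input_addandround_x)
    torch.testing.assert_close(addandround_op(input_addandround_x), expected)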

.. GENERATED FROM PYTHON SOURCE LINES 335-344

Custom Ops Registration in ONNX Registry
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For steps 2 and 3, we need to implement the operator in the ONNX registry.
In this example, we will implement the operator in the ONNX registry
with the namespace ``test.customop`` and the operator names ``CustomOpOne``
and ``CustomOpTwo``. These two ops are registered and built in
`cpu_ops.cc `__.

.. GENERATED FROM PYTHON SOURCE LINES 345-374

.. code-block:: default


    custom_opset = onnxscript.values.Opset(domain="test.customop", version=1)

    # NOTE: The function signature must match the signature of the unsupported ATen operator.
    # https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/native_functions.yaml
    # NOTE: All attributes must be annotated with type hints.
    @onnxscript.script(custom_opset)
    def custom_addandround(input_x):
        # The same as opset18.Add(x, x)
        add_x = custom_opset.CustomOpOne(input_x, input_x)
        # The same as opset18.Round(x); CustomOpTwo takes a single input
        round_x = custom_opset.CustomOpTwo(add_x)
        # Cast to FLOAT to match the ONNX type
        return opset18.Cast(round_x, to=1)


    onnx_registry = torch.onnx.OnnxRegistry()
    onnx_registry.register_op(
        namespace="mylibrary", op_name="addandround_op", overload="default", function=custom_addandround
    )

    export_options = torch.onnx.ExportOptions(onnx_registry=onnx_registry)
    onnx_program = torch.onnx.dynamo_export(
        custom_addandround_model, input_addandround_x, export_options=export_options
    )
    onnx_program.save("./custom_addandround_model.onnx")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/onnx/_internal/exporter.py:136: UserWarning: torch.onnx.dynamo_export only implements opset version 18 for now. If you need to use a different opset version, please register them with register_custom_op.

.. GENERATED FROM PYTHON SOURCE LINES 375-379

The ``onnx_program`` exposes the exported model as protobuf through ``onnx_program.model_proto``.
The graph has one graph node for ``custom_addandround``, and inside ``custom_addandround``,
there are two function nodes, one for each operator.

.. GENERATED FROM PYTHON SOURCE LINES 379-388

.. code-block:: default


    assert onnx_program.model_proto.graph.node[0].domain == "test.customop"
    assert onnx_program.model_proto.graph.node[0].op_type == "custom_addandround"
    assert onnx_program.model_proto.functions[0].node[0].domain == "test.customop"
    assert onnx_program.model_proto.functions[0].node[0].op_type == "CustomOpOne"
    assert onnx_program.model_proto.functions[0].node[1].domain == "test.customop"
    assert onnx_program.model_proto.functions[0].node[1].op_type == "CustomOpTwo"

.. GENERATED FROM PYTHON SOURCE LINES 389-469

This is how the ``custom_addandround_model`` ONNX graph looks using Netron:

.. image:: /_static/img/onnx/custom_addandround_model.png
    :width: 70%
    :align: center

Inside the ``custom_addandround`` function, we can see the two custom operators we
used in the function (``CustomOpOne``, and ``CustomOpTwo``), and they are from module
``test.customop``:

.. image:: /_static/img/onnx/custom_addandround_function.png
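
A note on the ``Cast`` at the end of ``custom_addandround``: the attribute ``to=1`` is the
numeric value of the ONNX ``FLOAT`` element type. If you prefer to avoid the magic number, the
enum from the ``onnx`` package can be used instead (a small readability tweak, assuming the
``onnx`` package is installed):

.. code-block:: python

    from onnx import TensorProto

    # ``TensorProto.FLOAT`` is the enum behind ``to=1`` in the Cast above.
    assert TensorProto.FLOAT == 1
    # Equivalent, more readable form:
    #     opset18.Cast(round_x, to=TensorProto.FLOAT)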

Custom Ops Registration in ONNX Runtime
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To link your custom op library to ONNX Runtime, you need to
compile your C++ code into a shared library and link it to ONNX Runtime.
Follow the instructions below:

1. Implement your custom op in C++ by following
   `ONNX Runtime instructions <https://github.com/microsoft/onnxruntime/blob/gh-pages/docs/reference/operators/add-custom-op.md>`__.
2. Download the ONNX Runtime source distribution from
   `ONNX Runtime releases `__.
3. Compile and link your custom op library to ONNX Runtime, for example:

.. code-block:: bash

    $ gcc -shared -o libcustom_op_library.so custom_op_library.cc -L /path/to/downloaded/ort/lib/ -lonnxruntime -fPIC

4. Run the model with ONNX Runtime Python API and compare the results with PyTorch.

.. code-block:: python

    ort_session_options = onnxruntime.SessionOptions()

    # NOTE: Link the custom op library to ONNX Runtime and replace the path
    # with the path to your custom op library
    ort_session_options.register_custom_ops_library(
        "/path/to/libcustom_op_library.so"
    )
    ort_session = onnxruntime.InferenceSession(
        "./custom_addandround_model.onnx", providers=['CPUExecutionProvider'],
        sess_options=ort_session_options)

    def to_numpy(tensor):
        return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

    onnx_input = onnx_program.adapt_torch_inputs_to_onnx(input_addandround_x)
    onnxruntime_input = {k.name: to_numpy(v) for k, v in zip(ort_session.get_inputs(), onnx_input)}
    onnxruntime_outputs = ort_session.run(None, onnxruntime_input)

    torch_outputs = custom_addandround_model(input_addandround_x)
    torch_outputs = onnx_program.adapt_torch_outputs_to_onnx(torch_outputs)

    assert len(torch_outputs) == len(onnxruntime_outputs)
    for torch_output, onnxruntime_output in zip(torch_outputs, onnxruntime_outputs):
        torch.testing.assert_close(torch_output, torch.tensor(onnxruntime_output))

Conclusion
----------

Congratulations! In this tutorial, we explored the :class:`OnnxRegistry`
API and discovered how to create custom implementations for unsupported or existing
ATen operators using ONNX Script.
Finally, we leveraged ONNX Runtime to execute the model and compare the results with PyTorch,
providing us with a comprehensive understanding of handling unsupported
operators in the ONNX ecosystem.

Further reading
---------------

The list below refers to tutorials that range from basic examples to advanced scenarios,
not necessarily in the order they are listed.
Feel free to jump directly to specific topics of your interest or
sit tight and have fun going through all of them to learn all there is about the ONNX exporter.

.. include:: /beginner_source/onnx/onnx_toc.txt

.. toctree::
   :hidden:


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  1.055 seconds)


.. _sphx_glr_download_beginner_onnx_onnx_registry_tutorial.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: onnx_registry_tutorial.py <onnx_registry_tutorial.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: onnx_registry_tutorial.ipynb <onnx_registry_tutorial.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_