.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "prototype/torchscript_freezing.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_prototype_torchscript_freezing.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_prototype_torchscript_freezing.py:


Model Freezing in TorchScript
=============================

In this tutorial, we introduce the syntax for *model freezing* in TorchScript.
Freezing is the process of inlining PyTorch module parameter and attribute
values into the TorchScript internal representation. Parameter and attribute
values are treated as final values and cannot be modified in the resulting
frozen module.
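
As a minimal sketch of what inlining means in practice, the snippet below
freezes a small scripted module and compares the parameter count before and
after. The ``Affine`` module is a hypothetical stand-in, not part of the
original example. The module is put in eval mode first, because
``torch.jit.freeze`` only accepts modules in eval mode.

.. code-block:: python

    import torch
    import torch.nn as nn


    class Affine(nn.Module):  # hypothetical module, for illustration only
        def __init__(self):
            super().__init__()
            self.weight = nn.Parameter(torch.ones(3))
            self.bias = nn.Parameter(torch.zeros(3))

        def forward(self, x):
            return x * self.weight + self.bias


    # Freezing requires a ScriptModule in eval mode.
    scripted = torch.jit.script(Affine().eval())
    frozen = torch.jit.freeze(scripted)

    # Count the parameters still exposed by each module.
    n_scripted = len(list(scripted.named_parameters()))
    n_frozen = len(list(frozen.named_parameters()))
    print(n_scripted, n_frozen)

    # Inspect the compiled forward method of the frozen module.
    print(frozen.code)

After freezing, the parameters should no longer be listed by
``named_parameters()``, because their values are stored as constants in the
graph.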

Basic Syntax
------------
Model freezing can be invoked using the API below:

 ``torch.jit.freeze(mod : ScriptModule, names : str[]) -> ScriptModule``

Note that the input module can be the result of either scripting or tracing.
See https://pytorch.org/tutorials/beginner/Intro_to_TorchScript_tutorial.html
for an introduction to both.

Next, we demonstrate how freezing works using an example:
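
A minimal sketch of the call, using a hypothetical ``TinyNet`` module with a
hypothetical ``version`` attribute. It freezes both a scripted and a traced
module, and passes a list of attribute names (the ``preserved_attrs`` keyword
in current PyTorch releases) so that ``version`` stays accessible on the
frozen module.

.. code-block:: python

    import torch
    import torch.nn as nn


    class TinyNet(nn.Module):  # hypothetical model, for illustration only
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(4, 2)
            self.version = 1  # plain attribute, not used in forward

        def forward(self, x):
            return torch.relu(self.fc(x))


    model = TinyNet().eval()  # freezing requires eval mode
    example = torch.randn(3, 4)

    # The input to freeze() can come from scripting or from tracing.
    scripted = torch.jit.script(model)
    traced = torch.jit.trace(model, example)

    frozen_scripted = torch.jit.freeze(scripted)
    frozen_traced = torch.jit.freeze(traced)

    # Name the attributes that should survive freezing.
    frozen_with_version = torch.jit.freeze(scripted, preserved_attrs=["version"])
    print(frozen_with_version.version)

    # The frozen modules should compute the same result as the original model.
    print(torch.allclose(frozen_scripted(example), model(example)))
    print(torch.allclose(frozen_traced(example), model(example)))

Attributes that are not listed are inlined, so they are no longer available on
the frozen module.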

.. GENERATED FROM PYTHON SOURCE LINES 111-117

On my machine, I measured the time:

* Scripted - Warm up time:  0.0107
* Frozen   - Warm up time:  0.0048
* Scripted - Inference time:  1.35
* Frozen   - Inference time:  1.17

.. GENERATED FROM PYTHON SOURCE LINES 119-129

In our example, warm up time measures the first two runs. The frozen model's
warm up takes about half as long as the scripted model's (0.0048 vs. 0.0107).
On some more complex models, we observed an even larger warm up speedup.
Freezing achieves this speedup because it does, ahead of time, some of the work
that TorchScript otherwise has to do during the first couple of runs.

Inference time measures inference execution time after the model is warmed up.
Although we observed significant variation in execution time, the
frozen model is often about 15% faster than the scripted model. When the input is larger,
we observe a smaller speedup because execution is dominated by tensor operations.
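
The two measurements described above can be sketched as follows. This is an
illustrative harness under stated assumptions, not the script that produced the
reported numbers: ``SmallConvNet``, the ``measure`` helper, the input shape, and
the number of timed runs are all hypothetical. Only the two warm up runs follow
the text. Absolute timings will differ from machine to machine.

.. code-block:: python

    import time

    import torch
    import torch.nn as nn


    class SmallConvNet(nn.Module):  # hypothetical model, for illustration only
        def __init__(self):
            super().__init__()
            self.conv = nn.Conv2d(1, 8, 3)
            self.fc = nn.Linear(8 * 26 * 26, 10)

        def forward(self, x):
            x = torch.relu(self.conv(x))
            return self.fc(torch.flatten(x, 1))


    def measure(module, x, warmup_runs=2, timed_runs=100):
        """Return (warm up seconds, inference seconds) for ``module`` on ``x``."""
        with torch.no_grad():
            # Warm up: the first runs include TorchScript's one-time work.
            start = time.perf_counter()
            for _ in range(warmup_runs):
                module(x)
            warmup = time.perf_counter() - start

            # Inference: timed only after the module has been warmed up.
            start = time.perf_counter()
            for _ in range(timed_runs):
                module(x)
            inference = time.perf_counter() - start
        return warmup, inference


    x = torch.rand(1, 1, 28, 28)
    scripted = torch.jit.script(SmallConvNet().eval())
    frozen = torch.jit.freeze(scripted)

    timings = {}
    for name, module in [("Scripted", scripted), ("Frozen", frozen)]:
        timings[name] = measure(module, x)
        print(f"{name} - Warm up time: {timings[name][0]:.4f}")
        print(f"{name} - Inference time: {timings[name][1]:.4f}")

Because run-to-run variation can be large, repeating the measurement several
times gives a more reliable comparison than a single pass.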

.. GENERATED FROM PYTHON SOURCE LINES 131-135

Conclusion
----------
In this tutorial, we learned about model freezing. Freezing is a useful technique for
optimizing models for inference, and it can also significantly reduce TorchScript warm up time.

.. GENERATED FROM PYTHON SOURCE LINES 135-136

.. code-block:: default


    # %%%%%%RUNNABLE_CODE_REMOVED%%%%%%

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.000 seconds)


.. _sphx_glr_download_prototype_torchscript_freezing.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: torchscript_freezing.py <torchscript_freezing.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: torchscript_freezing.ipynb <torchscript_freezing.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_