.. note::
    :class: sphx-glr-download-link-note

    Click :ref:`here <sphx_glr_download_beginner_former_torchies_autograd_tutorial.py>` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_beginner_former_torchies_autograd_tutorial.py:


Autograd
========

Autograd is now a core torch package for automatic differentiation. It uses a
tape-based system: in the forward phase, the autograd tape remembers all the
operations it executed, and in the backward phase, it replays them.

Tensors that track history
--------------------------

In autograd, if any input ``Tensor`` of an operation has ``requires_grad=True``,
the computation will be tracked. After computing the backward pass, a gradient
w.r.t. this tensor is accumulated into the ``.grad`` attribute.

There's one more class which is very important for the autograd
implementation - a ``Function``. ``Tensor`` and ``Function`` are interconnected
and build up an acyclic graph that encodes a complete history of computation.
Each tensor has a ``.grad_fn`` attribute that references the ``Function`` that
created it (except for Tensors created by the user - these have ``None`` as
``.grad_fn``).

If you want to compute the derivatives, you can call ``.backward()`` on a
``Tensor``. If the ``Tensor`` is a scalar (i.e. it holds a single element),
you don't need to specify any arguments to ``backward()``; however, if it has
more elements, you need to specify a ``gradient`` argument that is a tensor of
matching shape.

.. code-block:: python

    import torch

Create a tensor and set ``requires_grad=True`` to track computation with it

.. code-block:: python

    x = torch.ones(2, 2, requires_grad=True)
    print(x)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    tensor([[1., 1.],
            [1., 1.]], requires_grad=True)

.. code-block:: python

    print(x.data)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    tensor([[1., 1.],
            [1., 1.]])

.. code-block:: python

    print(x.grad)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    None

.. code-block:: python

    print(x.grad_fn)  # we've created x ourselves

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    None

Do an operation on x:

.. code-block:: python

    y = x + 2
    print(y)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    tensor([[3., 3.],
            [3., 3.]], grad_fn=<AddBackward0>)

y was created as a result of an operation, so it has a grad_fn

.. code-block:: python

    print(y.grad_fn)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    <AddBackward0 object at 0x...>

More operations on y:

.. code-block:: python

    z = y * y * 3
    out = z.mean()

    print(z, out)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    tensor([[27., 27.],
            [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward0>)

``.requires_grad_( ... )`` changes an existing Tensor's ``requires_grad``
flag in-place. The input flag defaults to ``True`` if not given.

.. code-block:: python

    a = torch.randn(2, 2)
    a = ((a * 3) / (a - 1))
    print(a.requires_grad)
    a.requires_grad_(True)
    print(a.requires_grad)
    b = (a * a).sum()
    print(b.grad_fn)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    False
    True
    <SumBackward0 object at 0x...>

Gradients
---------

Let's backprop now and print gradients d(out)/dx

.. code-block:: python

    out.backward()
    print(x.grad)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    tensor([[4.5000, 4.5000],
            [4.5000, 4.5000]])

Each entry is ``4.5`` because ``out = (1/4) * sum(3 * (x + 2) ** 2)``, so
``d(out)/dx_i = (3/2) * (x_i + 2)``, which is ``4.5`` at ``x_i = 1``.

By default, gradient computation flushes all the internal buffers contained in
the graph, so if you want to do the backward over some part of the graph twice,
you need to pass in ``retain_graph=True`` during the first pass.
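As a minimal sketch of why this matters (the tensors ``a`` and ``b`` below are
only for illustration and are not part of the running script), calling
``backward()`` a second time after the internal buffers have been freed raises
a ``RuntimeError``:

.. code-block:: python

    a = torch.ones(2, 2, requires_grad=True)
    b = (a * a).sum()   # multiplication saves its inputs for the backward pass
    b.backward()        # the first backward pass frees those saved buffers

    try:
        b.backward()    # the second pass needs the freed buffers, so it fails
    except RuntimeError as err:
        print(err)

Passing ``retain_graph=True`` on the first call keeps those buffers around, so
the graph can be backpropagated through again: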
.. code-block:: python

    x = torch.ones(2, 2, requires_grad=True)
    y = x + 2
    y.backward(torch.ones(2, 2), retain_graph=True)
    # the retain_graph flag will prevent the internal buffers from being freed
    print(x.grad)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    tensor([[1., 1.],
            [1., 1.]])

.. code-block:: python

    z = y * y
    print(z)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    tensor([[9., 9.],
            [9., 9.]], grad_fn=<MulBackward0>)

Just backprop random gradients:

.. code-block:: python

    gradient = torch.randn(2, 2)

    # this would fail if we didn't specify
    # that we want to retain the graph
    y.backward(gradient)

    print(x.grad)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    tensor([[-0.3270,  1.4292],
            [-0.6227, -0.1784]])

You can also stop autograd from tracking history on Tensors with
``requires_grad=True`` by wrapping the code block in ``with torch.no_grad():``

.. code-block:: python

    print(x.requires_grad)
    print((x ** 2).requires_grad)

    with torch.no_grad():
        print((x ** 2).requires_grad)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    True
    True
    False

**Total running time of the script:** ( 0 minutes 0.009 seconds)


.. _sphx_glr_download_beginner_former_torchies_autograd_tutorial.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download

     :download:`Download Python source code: autograd_tutorial.py <autograd_tutorial.py>`


  .. container:: sphx-glr-download

     :download:`Download Jupyter notebook: autograd_tutorial.ipynb <autograd_tutorial.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    Gallery generated by Sphinx-Gallery