.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "beginner/nlp/pytorch_tutorial.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_beginner_nlp_pytorch_tutorial.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_beginner_nlp_pytorch_tutorial.py:


Introduction to PyTorch
***********************

Introduction to Torch's tensor library
======================================

All of deep learning is computations on tensors, which are
generalizations of a matrix that can be indexed in more than 2
dimensions. We will see exactly what this means in-depth later. First,
let's look what we can do with tensors.

.. GENERATED FROM PYTHON SOURCE LINES 14-21

.. code-block:: default

    # Author: Robert Guthrie

    import torch

    torch.manual_seed(1)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    <torch._C.Generator object at 0x7f2fb5abf510>


.. GENERATED FROM PYTHON SOURCE LINES 22-28

Creating Tensors
~~~~~~~~~~~~~~~~

Tensors can be created from Python lists with the torch.tensor()
function.


.. GENERATED FROM PYTHON SOURCE LINES 28-46

.. code-block:: default


    # torch.tensor(data) creates a torch.Tensor object with the given data.
    V_data = [1., 2., 3.]
    V = torch.tensor(V_data)
    print(V)

    # Creates a matrix
    M_data = [[1., 2., 3.], [4., 5., 6]]
    M = torch.tensor(M_data)
    print(M)

    # Create a 3D tensor of size 2x2x2.
    T_data = [[[1., 2.], [3., 4.]],
              [[5., 6.], [7., 8.]]]
    T = torch.tensor(T_data)
    print(T)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    tensor([1., 2., 3.])
    tensor([[1., 2., 3.],
            [4., 5., 6.]])
    tensor([[[1., 2.],
             [3., 4.]],

            [[5., 6.],
             [7., 8.]]])


.. GENERATED FROM PYTHON SOURCE LINES 47-58

What is a 3D tensor anyway? Think about it like this. If you have a
vector, indexing into the vector gives you a scalar. If you have a
matrix, indexing into the matrix gives you a vector. If you have a 3D
tensor, then indexing into the tensor gives you a matrix!

A note on terminology:
when I say "tensor" in this tutorial, it refers
to any torch.Tensor object. Matrices and vectors are special cases of
torch.Tensors, where their dimension is 2 and 1 respectively. When I am
talking about 3D tensors, I will explicitly use the term "3D tensor".


.. GENERATED FROM PYTHON SOURCE LINES 58-71

.. code-block:: default


    # Index into V and get a scalar (0 dimensional tensor)
    print(V[0])
    # Get a Python number from it
    print(V[0].item())

    # Index into M and get a vector
    print(M[0])

    # Index into T and get a matrix
    print(T[0])


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    tensor(1.)
    1.0
    tensor([1., 2., 3.])
    tensor([[1., 2.],
            [3., 4.]])


.. GENERATED FROM PYTHON SOURCE LINES 72-78

You can also create tensors of other data types. To create a tensor of integer types, try
torch.tensor([[1, 2], [3, 4]]) (where all elements in the list are integers).
You can also specify a data type by passing in ``dtype=torch.data_type``.
Check the documentation for more data types, but
Float and Long will be the most common.


.. GENERATED FROM PYTHON SOURCE LINES 81-84

You can create a tensor with random data and the supplied dimensionality
with torch.randn()


.. GENERATED FROM PYTHON SOURCE LINES 84-89

.. code-block:: default


    x = torch.randn((3, 4, 5))
    print(x)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    tensor([[[-1.5256, -0.7502, -0.6540, -1.6095, -0.1002],
             [-0.6092, -0.9798, -1.6091, -0.7121,  0.3037],
             [-0.7773, -0.2515, -0.2223,  1.6871,  0.2284],
             [ 0.4676, -0.6970, -1.1608,  0.6995,  0.1991]],

            [[ 0.8657,  0.2444, -0.6629,  0.8073,  1.1017],
             [-0.1759, -2.2456, -1.4465,  0.0612, -0.6177],
             [-0.7981, -0.1316,  1.8793, -0.0721,  0.1578],
             [-0.7735,  0.1991,  0.0457,  0.1530, -0.4757]],

            [[-0.1110,  0.2927, -0.1578, -0.0288,  0.4533],
             [ 1.1422,  0.2486, -1.7754, -0.0255, -1.0233],
             [-0.5962, -1.0055,  0.4285,  1.4761, -1.7869],
             [ 1.6103, -0.7040, -0.1853, -0.9962, -0.8313]]])


.. GENERATED FROM PYTHON SOURCE LINES 90-94

Operations with Tensors
~~~~~~~~~~~~~~~~~~~~~~~

You can operate on tensors in the ways you would expect.

.. GENERATED FROM PYTHON SOURCE LINES 94-101

.. code-block:: default


    x = torch.tensor([1., 2., 3.])
    y = torch.tensor([4., 5., 6.])
    z = x + y
    print(z)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    tensor([5., 7., 9.])


.. GENERATED FROM PYTHON SOURCE LINES 102-108

See `the documentation <https://pytorch.org/docs/torch.html>`__ for a
complete list of the massive number of operations available to you. They
expand beyond just mathematical operations.

One helpful operation that we will make use of later is concatenation.


.. GENERATED FROM PYTHON SOURCE LINES 108-126

.. code-block:: default


    # By default, it concatenates along the first axis (concatenates rows)
    x_1 = torch.randn(2, 5)
    y_1 = torch.randn(3, 5)
    z_1 = torch.cat([x_1, y_1])
    print(z_1)

    # Concatenate columns:
    x_2 = torch.randn(2, 3)
    y_2 = torch.randn(2, 5)
    # second arg specifies which axis to concat along
    z_2 = torch.cat([x_2, y_2], 1)
    print(z_2)

    # If your tensors are not compatible, torch will complain.  Uncomment to see the error
    # torch.cat([x_1, x_2])


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    tensor([[-0.8029,  0.2366,  0.2857,  0.6898, -0.6331],
            [ 0.8795, -0.6842,  0.4533,  0.2912, -0.8317],
            [-0.5525,  0.6355, -0.3968, -0.6571, -1.6428],
            [ 0.9803, -0.0421, -0.8206,  0.3133, -1.1352],
            [ 0.3773, -0.2824, -2.5667, -1.4303,  0.5009]])
    tensor([[ 0.5438, -0.4057,  1.1341, -0.1473,  0.6272,  1.0935,  0.0939,  1.2381],
            [-1.1115,  0.3501, -0.7703, -1.3459,  0.5119, -0.6933, -0.1668, -0.9999]])


.. GENERATED FROM PYTHON SOURCE LINES 127-135

Reshaping Tensors
~~~~~~~~~~~~~~~~~

Use the .view() method to reshape a tensor. This method receives heavy
use, because many neural network components expect their inputs to have
a certain shape. Often you will need to reshape before passing your data
to the component.


.. GENERATED FROM PYTHON SOURCE LINES 135-143

.. code-block:: default


    x = torch.randn(2, 3, 4)
    print(x)
    print(x.view(2, 12))  # Reshape to 2 rows, 12 columns
    # Same as above.  If one of the dimensions is -1, its size can be inferred
    print(x.view(2, -1))


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    tensor([[[ 0.4175, -0.2127, -0.8400, -0.4200],
             [-0.6240, -0.9773,  0.8748,  0.9873],
             [-0.0594, -2.4919,  0.2423,  0.2883]],

            [[-0.1095,  0.3126,  1.5038,  0.5038],
             [ 0.6223, -0.4481, -0.2856,  0.3880],
             [-1.1435, -0.6512, -0.1032,  0.6937]]])
    tensor([[ 0.4175, -0.2127, -0.8400, -0.4200, -0.6240, -0.9773,  0.8748,  0.9873,
             -0.0594, -2.4919,  0.2423,  0.2883],
            [-0.1095,  0.3126,  1.5038,  0.5038,  0.6223, -0.4481, -0.2856,  0.3880,
             -1.1435, -0.6512, -0.1032,  0.6937]])
    tensor([[ 0.4175, -0.2127, -0.8400, -0.4200, -0.6240, -0.9773,  0.8748,  0.9873,
             -0.0594, -2.4919,  0.2423,  0.2883],
            [-0.1095,  0.3126,  1.5038,  0.5038,  0.6223, -0.4481, -0.2856,  0.3880,
             -1.1435, -0.6512, -0.1032,  0.6937]])


.. GENERATED FROM PYTHON SOURCE LINES 144-167

Computation Graphs and Automatic Differentiation
================================================

The concept of a computation graph is essential to efficient deep
learning programming, because it allows you to not have to write the
back propagation gradients yourself. A computation graph is simply a
specification of how your data is combined to give you the output. Since
the graph totally specifies what parameters were involved with which
operations, it contains enough information to compute derivatives. This
probably sounds vague, so let's see what is going on using the
fundamental flag ``requires_grad``.

First, think from a programmers perspective. What is stored in the
torch.Tensor objects we were creating above? Obviously the data and the
shape, and maybe a few other things. But when we added two tensors
together, we got an output tensor. All this output tensor knows is its
data and shape. It has no idea that it was the sum of two other tensors
(it could have been read in from a file, it could be the result of some
other operation, etc.)

If ``requires_grad=True``, the Tensor object keeps track of how it was
created. Let's see it in action.


.. GENERATED FROM PYTHON SOURCE LINES 167-181

.. code-block:: default


    # Tensor factory methods have a ``requires_grad`` flag
    x = torch.tensor([1., 2., 3], requires_grad=True)

    # With requires_grad=True, you can still do all the operations you previously
    # could
    y = torch.tensor([4., 5., 6], requires_grad=True)
    z = x + y
    print(z)

    # BUT z knows something extra.
    print(z.grad_fn)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    tensor([5., 7., 9.], grad_fn=<AddBackward0>)
    <AddBackward0 object at 0x7f2f990428c0>


.. GENERATED FROM PYTHON SOURCE LINES 182-189

So Tensors know what created them. z knows that it wasn't read in from
a file, it wasn't the result of a multiplication or exponential or
whatever. And if you keep following z.grad_fn, you will find yourself at
x and y.

But how does that help us compute a gradient?


.. GENERATED FROM PYTHON SOURCE LINES 189-196

.. code-block:: default


    # Let's sum up all the entries in z
    s = z.sum()
    print(s)
    print(s.grad_fn)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    tensor(21., grad_fn=<SumBackward0>)
    <SumBackward0 object at 0x7f2f99040820>


.. GENERATED FROM PYTHON SOURCE LINES 197-222

So now, what is the derivative of this sum with respect to the first
component of x? In math, we want

.. math::

   \frac{\partial s}{\partial x_0}


Well, s knows that it was created as a sum of the tensor z. z knows
that it was the sum x + y. So

.. math::  s = \overbrace{x_0 + y_0}^\text{$z_0$} + \overbrace{x_1 + y_1}^\text{$z_1$} + \overbrace{x_2 + y_2}^\text{$z_2$}

And so s contains enough information to determine that the derivative
we want is 1!

Of course this glosses over the challenge of how to actually compute
that derivative. The point here is that s is carrying along enough
information that it is possible to compute it. In reality, the
developers of Pytorch program the sum() and + operations to know how to
compute their gradients, and run the back propagation algorithm. An
in-depth discussion of that algorithm is beyond the scope of this
tutorial.


.. GENERATED FROM PYTHON SOURCE LINES 225-230

Let's have Pytorch compute the gradient, and see that we were right:
(note if you run this block multiple times, the gradient will increment.
That is because Pytorch *accumulates* the gradient into the .grad
property, since for many models this is very convenient.)


.. GENERATED FROM PYTHON SOURCE LINES 230-236

.. code-block:: default


    # calling .backward() on any variable will run backprop, starting from it.
    s.backward()
    print(x.grad)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    tensor([1., 1., 1.])


.. GENERATED FROM PYTHON SOURCE LINES 237-240

Understanding what is going on in the block below is crucial for being a
successful programmer in deep learning.


.. GENERATED FROM PYTHON SOURCE LINES 240-271

.. code-block:: default


    x = torch.randn(2, 2)
    y = torch.randn(2, 2)
    # By default, user created Tensors have ``requires_grad=False``
    print(x.requires_grad, y.requires_grad)
    z = x + y
    # So you can't backprop through z
    print(z.grad_fn)

    # ``.requires_grad_( ... )`` changes an existing Tensor's ``requires_grad``
    # flag in-place. The input flag defaults to ``True`` if not given.
    x = x.requires_grad_()
    y = y.requires_grad_()
    # z contains enough information to compute gradients, as we saw above
    z = x + y
    print(z.grad_fn)
    # If any input to an operation has ``requires_grad=True``, so will the output
    print(z.requires_grad)

    # Now z has the computation history that relates itself to x and y
    # Can we just take its values, and **detach** it from its history?
    new_z = z.detach()

    # ... does new_z have information to backprop to x and y?
    # NO!
    print(new_z.grad_fn)
    # And how could it? ``z.detach()`` returns a tensor that shares the same storage
    # as ``z``, but with the computation history forgotten. It doesn't know anything
    # about how it was computed.
    # In essence, we have broken the Tensor away from its past history


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    False False
    None
    <AddBackward0 object at 0x7f2f99040070>
    True
    None


.. GENERATED FROM PYTHON SOURCE LINES 272-275

You can also stop autograd from tracking history on Tensors
with ``.requires_grad=True`` by wrapping the code block in
``with torch.no_grad():``

.. GENERATED FROM PYTHON SOURCE LINES 275-282

.. code-block:: default

    print(x.requires_grad)
    print((x ** 2).requires_grad)

    with torch.no_grad():
    	print((x ** 2).requires_grad)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    True
    True
    False


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.022 seconds)


.. _sphx_glr_download_beginner_nlp_pytorch_tutorial.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: pytorch_tutorial.py <pytorch_tutorial.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: pytorch_tutorial.ipynb <pytorch_tutorial.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_