
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "recipes/recipes/loading_data_recipe.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here `
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_recipes_recipes_loading_data_recipe.py:


Loading data in PyTorch
=======================
PyTorch features extensive neural network building blocks with a simple,
intuitive, and stable API. PyTorch includes packages to prepare and load
common datasets for your model.

Introduction
------------
At the heart of PyTorch's data loading utilities is the
`torch.utils.data.DataLoader `__
class. It represents a Python iterable over a dataset. Libraries in
PyTorch offer built-in high-quality datasets for you to use in
`torch.utils.data.Dataset `__.
These datasets are currently available in:

* `torchvision `__
* `torchaudio `__
* `torchtext `__

with more to come.
Using the ``yesno`` dataset from ``torchaudio.datasets.YESNO``, we will
demonstrate how to effectively and efficiently load data from a PyTorch
``Dataset`` into a PyTorch ``DataLoader``.

.. GENERATED FROM PYTHON SOURCE LINES 30-34

Setup
-----
Before we begin, we need to install ``torchaudio`` to have access to the
dataset.

.. GENERATED FROM PYTHON SOURCE LINES 34-37

.. code-block:: default


    # pip install torchaudio


.. GENERATED FROM PYTHON SOURCE LINES 38-39

To run in Google Colab, uncomment the following line:

.. GENERATED FROM PYTHON SOURCE LINES 39-42

.. code-block:: default


    # !pip install torchaudio


.. GENERATED FROM PYTHON SOURCE LINES 43-60

Steps
-----

1. Import all necessary libraries for loading our data
2. Access the data in the dataset
3. Load the data
4. Iterate over the data
5. [Optional] Visualize the data


1. Import necessary libraries for loading our data
---------------------------------------------------------------

For this recipe, we will use ``torch`` and ``torchaudio``. Depending on
which built-in datasets you use, you can also install and import
``torchvision`` or ``torchtext``.

.. GENERATED FROM PYTHON SOURCE LINES 60-65

.. code-block:: default


    import torch
    import torchaudio


.. GENERATED FROM PYTHON SOURCE LINES 66-74

2. Access the data in the dataset
---------------------------------------------------------------

The ``yesno`` dataset in ``torchaudio`` features sixty recordings of one
individual saying yes or no in Hebrew, with each recording being eight
words long (`read more here `__).

``torchaudio.datasets.YESNO`` creates a dataset for ``yesno``.

.. GENERATED FROM PYTHON SOURCE LINES 74-80

.. code-block:: default


    torchaudio.datasets.YESNO(
         root='./',
         url='http://www.openslr.org/resources/1/waves_yesno.tar.gz',
         folder_in_archive='waves_yesno',
         download=True)


.. GENERATED FROM PYTHON SOURCE LINES 81-88

Each item in the dataset is a tuple of the form
``(waveform, sample_rate, labels)``.

You must set a ``root`` for the ``yesno`` dataset, which is the directory
where the dataset files are stored. The other parameters are optional;
``url`` and ``folder_in_archive`` are shown with their default values,
while ``download`` defaults to ``False``. Here is some additional
information on the other parameters:

* ``download``: If true, downloads the dataset from the internet and puts
  it in the root directory. If the dataset is already downloaded, it is not
  downloaded again.

Let’s access our ``yesno`` data. A data point in ``yesno`` is a tuple
``(waveform, sample_rate, labels)`` where ``labels`` is a list of integers
with 1 for yes and 0 for no.

.. GENERATED FROM PYTHON SOURCE LINES 88-104

.. code-block:: default


    yesno_data = torchaudio.datasets.YESNO('./', download=True)

    # Pick data point number 3 to see an example of the ``yesno_data``:
    n = 3
    waveform, sample_rate, labels = yesno_data[n]
    print("Waveform: {}\nSample rate: {}\nLabels: {}".format(waveform, sample_rate, labels))


.. GENERATED FROM PYTHON SOURCE LINES 105-116

In practice, it is best to split the data into a “training” dataset and a
“testing” dataset. This ensures that you have out-of-sample data to test
the performance of your model.

3. Load the data
---------------------------------------------------------------

Now that we have access to the dataset, we must pass it through
``torch.utils.data.DataLoader``. The ``DataLoader`` combines the dataset
and a sampler, returning an iterable over the dataset.

.. GENERATED FROM PYTHON SOURCE LINES 116-122

.. code-block:: default


    data_loader = torch.utils.data.DataLoader(yesno_data,
                                              batch_size=1,
                                              shuffle=True)


.. GENERATED FROM PYTHON SOURCE LINES 123-131

4. Iterate over the data
---------------------------------------------------------------

Our data is now iterable using the ``data_loader``. This will be
necessary when we begin training our model! Notice that each entry
yielded by the ``data_loader`` object is now a batch in which the
waveform, sample rate, and labels have been converted to tensors.

.. GENERATED FROM PYTHON SOURCE LINES 131-138

.. code-block:: default


    for data in data_loader:
        print("Data: ", data)
        print("Waveform: {}\nSample rate: {}\nLabels: {}".format(data[0], data[1], data[2]))
        break


.. GENERATED FROM PYTHON SOURCE LINES 139-145

5. [Optional] Visualize the data
---------------------------------------------------------------

You can optionally visualize your data to further understand the output
from your ``DataLoader``.

.. GENERATED FROM PYTHON SOURCE LINES 145-154

.. code-block:: default


    import matplotlib.pyplot as plt

    # ``data`` is the batch kept from the loop above; with ``batch_size=1``,
    # ``data[0][0]`` is the single waveform of shape (channels, samples).
    batch_waveform = data[0][0]
    print(batch_waveform.numpy())

    plt.figure()
    # Transpose to (samples, channels) so each channel is drawn as one line.
    plt.plot(batch_waveform.t().numpy())


.. GENERATED FROM PYTHON SOURCE LINES 155-164

Congratulations! You have successfully loaded data in PyTorch.

Learn More
----------

Take a look at these other recipes to continue your learning:

- `Defining a Neural Network `__
- `What is a state_dict in PyTorch `__


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  0.000 seconds)


.. _sphx_glr_download_recipes_recipes_loading_data_recipe.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: loading_data_recipe.py `

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: loading_data_recipe.ipynb `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
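
As a follow-up to steps 2 and 3, the recipe recommends separating the data
into training and testing sets and loads it with ``batch_size=1``. The sketch
below shows one possible way to do both with ``torch.utils.data.random_split``
and a custom ``collate_fn``. It is not part of the original recipe: the
``ToyYesNo`` class is a hypothetical stand-in that generates random waveforms
so the example runs without a download, the 48/12 split is an arbitrary
choice, and ``pad_collate`` is an illustrative helper that assumes recordings
may differ in length and therefore zero-pads them before stacking.

```python
import torch
from torch.utils.data import DataLoader, Dataset, random_split


class ToyYesNo(Dataset):
    """Hypothetical stand-in for YESNO: random waveforms, no download needed."""

    def __init__(self, num_items=60, sample_rate=8000):
        generator = torch.Generator().manual_seed(0)
        self.sample_rate = sample_rate
        self.items = []
        for _ in range(num_items):
            # Variable-length mono waveform of shape (1, num_samples).
            num_samples = int(torch.randint(4000, 8000, (1,), generator=generator))
            waveform = torch.randn(1, num_samples, generator=generator)
            # Eight 0/1 labels per item, mirroring the (waveform, sample_rate, labels) form.
            labels = torch.randint(0, 2, (8,), generator=generator).tolist()
            self.items.append((waveform, labels))

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        waveform, labels = self.items[index]
        return waveform, self.sample_rate, labels


def pad_collate(batch):
    """Zero-pad waveforms to the longest one in the batch and stack them."""
    waveforms, sample_rates, labels = zip(*batch)
    lengths = torch.tensor([w.shape[-1] for w in waveforms])
    # pad_sequence pads along the first dimension, so transpose to
    # (samples, channels) before padding and back to (channels, samples) after.
    padded = torch.nn.utils.rnn.pad_sequence(
        [w.t() for w in waveforms], batch_first=True
    ).transpose(1, 2)
    return padded, lengths, torch.tensor(sample_rates), torch.tensor(labels)


dataset = ToyYesNo()

# Hold out 12 of the 60 items as out-of-sample test data (assumed ratio).
train_set, test_set = random_split(
    dataset, [48, 12], generator=torch.Generator().manual_seed(0)
)

train_loader = DataLoader(train_set, batch_size=4, shuffle=True, collate_fn=pad_collate)
test_loader = DataLoader(test_set, batch_size=4, shuffle=False, collate_fn=pad_collate)

waveforms, lengths, sample_rates, labels = next(iter(train_loader))
print(waveforms.shape, lengths, labels.shape)
```

Returning the original ``lengths`` alongside the padded batch lets later code
ignore the padded samples. To try this on the real data, ``ToyYesNo()`` could
be replaced with the ``yesno_data`` object created in step 2.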