.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/audio_data_augmentation_tutorial.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_tutorials_audio_data_augmentation_tutorial.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_audio_data_augmentation_tutorial.py:


Audio Data Augmentation
=======================

**Author**: Moto Hira

``torchaudio`` provides a variety of ways to augment audio data.

In this tutorial, we look into a way to apply effects, filters,
RIR (room impulse response) and codecs.
At the end, we synthesize noisy speech over a phone from clean speech.

.. GENERATED FROM PYTHON SOURCE LINES 15-25

.. code-block:: default

    import torch
    import torchaudio
    import torchaudio.functional as F

    print(torch.__version__)
    print(torchaudio.__version__)

    import matplotlib.pyplot as plt

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    2.4.0.dev20240328
    2.2.0.dev20240329

.. GENERATED FROM PYTHON SOURCE LINES 26-31

Preparation
-----------

First, we import the modules and download the audio assets we use in
this tutorial.

.. GENERATED FROM PYTHON SOURCE LINES 31-42

.. code-block:: default

    from IPython.display import Audio
    from torchaudio.utils import download_asset

    SAMPLE_WAV = download_asset("tutorial-assets/steam-train-whistle-daniel_simon.wav")
    SAMPLE_RIR = download_asset("tutorial-assets/Lab41-SRI-VOiCES-rm1-impulse-mc01-stu-clo-8000hz.wav")
    SAMPLE_SPEECH = download_asset("tutorial-assets/Lab41-SRI-VOiCES-src-sp0307-ch127535-sg0042-8000hz.wav")
    SAMPLE_NOISE = download_asset("tutorial-assets/Lab41-SRI-VOiCES-rm1-babb-mc01-stu-clo-8000hz.wav")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

      0%| | 0.00/427k [00:00

.. GENERATED FROM PYTHON SOURCE LINES 43-53

Applying effects and filtering
------------------------------

:py:class:`torchaudio.io.AudioEffector` allows for directly applying
filters and codecs to Tensor objects, in a similar way as the ``ffmpeg``
command.

`AudioEffector Usages <./effector_tutorial.html>`__ explains how to use
this class, so for the detail, please refer to the tutorial.

.. GENERATED FROM PYTHON SOURCE LINES 53-79

.. code-block:: default

    # Load the data
    waveform1, sample_rate = torchaudio.load(SAMPLE_WAV, channels_first=False)

    # Define effects
    effect = ",".join(
        [
            "lowpass=frequency=300:poles=1",  # apply single-pole lowpass filter
            "atempo=0.8",  # reduce the speed
            "aecho=in_gain=0.8:out_gain=0.9:delays=200:decays=0.3|delays=400:decays=0.3"
            # Applying echo gives some dramatic feeling
        ],
    )


    # Apply effects
    def apply_effect(waveform, sample_rate, effect):
        effector = torchaudio.io.AudioEffector(effect=effect)
        return effector.apply(waveform, sample_rate)


    waveform2 = apply_effect(waveform1, sample_rate, effect)

    print(waveform1.shape, sample_rate)
    print(waveform2.shape, sample_rate)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    torch.Size([109368, 2]) 44100
    torch.Size([144642, 2]) 44100

.. GENERATED FROM PYTHON SOURCE LINES 80-84

Note that the number of frames (and, depending on the effect, the number
of channels) can differ from the original after the effects are applied.
Let's listen to the audio.

.. GENERATED FROM PYTHON SOURCE LINES 84-105

.. code-block:: default

    def plot_waveform(waveform, sample_rate, title="Waveform", xlim=None):
        waveform = waveform.numpy()

        num_channels, num_frames = waveform.shape
        time_axis = torch.arange(0, num_frames) / sample_rate

        figure, axes = plt.subplots(num_channels, 1)
        if num_channels == 1:
            axes = [axes]
        for c in range(num_channels):
            axes[c].plot(time_axis, waveform[c], linewidth=1)
            axes[c].grid(True)
            if num_channels > 1:
                axes[c].set_ylabel(f"Channel {c+1}")
            if xlim:
                axes[c].set_xlim(xlim)
        figure.suptitle(title)

.. GENERATED FROM PYTHON SOURCE LINES 107-126

.. code-block:: default

    def plot_specgram(waveform, sample_rate, title="Spectrogram", xlim=None):
        waveform = waveform.numpy()

        num_channels, _ = waveform.shape

        figure, axes = plt.subplots(num_channels, 1)
        if num_channels == 1:
            axes = [axes]
        for c in range(num_channels):
            axes[c].specgram(waveform[c], Fs=sample_rate)
            if num_channels > 1:
                axes[c].set_ylabel(f"Channel {c+1}")
            if xlim:
                axes[c].set_xlim(xlim)
        figure.suptitle(title)

.. GENERATED FROM PYTHON SOURCE LINES 127-130

Original
~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 130-135

.. code-block:: default

    plot_waveform(waveform1.T, sample_rate, title="Original", xlim=(-0.1, 3.2))
    plot_specgram(waveform1.T, sample_rate, title="Original", xlim=(0, 3.04))
    Audio(waveform1.T, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_001.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_001.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_002.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_002.png
         :class: sphx-glr-multi-img
.. GENERATED FROM PYTHON SOURCE LINES 136-139

Effects applied
~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 139-145

.. code-block:: default

    plot_waveform(waveform2.T, sample_rate, title="Effects Applied", xlim=(-0.1, 3.2))
    plot_specgram(waveform2.T, sample_rate, title="Effects Applied", xlim=(0, 3.04))
    Audio(waveform2.T, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_003.png
         :alt: Effects Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_003.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_004.png
         :alt: Effects Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_004.png
         :class: sphx-glr-multi-img
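The ``effect`` string passed to ``AudioEffector`` is an ffmpeg-style
filtergraph: filter names with colon-separated options, joined by commas.
For longer chains, a small helper can keep the string readable. The
following is only a sketch (``build_filtergraph`` is a made-up name, not
part of ``torchaudio``):

```python
def build_filtergraph(filters):
    # filters: list of (name, options) pairs; options is either a plain
    # value string (e.g. "0.8" for atempo) or a list of (key, value)
    # pairs that become key=value:key=value.
    parts = []
    for name, opts in filters:
        if isinstance(opts, list):
            opts = ":".join(f"{k}={v}" for k, v in opts)
        parts.append(f"{name}={opts}" if opts else name)
    return ",".join(parts)


effect = build_filtergraph(
    [
        ("lowpass", [("frequency", 300), ("poles", 1)]),
        ("atempo", "0.8"),
    ]
)
print(effect)  # lowpass=frequency=300:poles=1,atempo=0.8
```

The resulting string can be passed to ``AudioEffector(effect=...)`` exactly
like the hand-written one above.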
.. GENERATED FROM PYTHON SOURCE LINES 146-161

Simulating room reverberation
-----------------------------

`Convolution reverb <https://en.wikipedia.org/wiki/Convolution_reverb>`__
is a technique used to make clean audio sound as though it has been
produced in a different environment.

Using a Room Impulse Response (RIR), for instance, we can make clean
speech sound as though it has been uttered in a conference room.

For this process, we need RIR data. The following data are from the
VOiCES dataset, but you can record your own: just turn on your microphone
and clap your hands.

.. GENERATED FROM PYTHON SOURCE LINES 161-167

.. code-block:: default

    rir_raw, sample_rate = torchaudio.load(SAMPLE_RIR)
    plot_waveform(rir_raw, sample_rate, title="Room Impulse Response (raw)")
    plot_specgram(rir_raw, sample_rate, title="Room Impulse Response (raw)")
    Audio(rir_raw, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_005.png
         :alt: Room Impulse Response (raw)
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_005.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_006.png
         :alt: Room Impulse Response (raw)
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_006.png
         :class: sphx-glr-multi-img
.. GENERATED FROM PYTHON SOURCE LINES 168-171

First, we need to clean up the RIR. We extract the main impulse and
normalize it by its power.

.. GENERATED FROM PYTHON SOURCE LINES 171-177

.. code-block:: default

    rir = rir_raw[:, int(sample_rate * 1.01) : int(sample_rate * 1.3)]
    rir = rir / torch.linalg.vector_norm(rir, ord=2)

    plot_waveform(rir, sample_rate, title="Room Impulse Response")

.. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_007.png
   :alt: Room Impulse Response
   :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_007.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 178-181

Then, using :py:func:`torchaudio.functional.fftconvolve`, we convolve the
speech signal with the RIR.

.. GENERATED FROM PYTHON SOURCE LINES 181-185

.. code-block:: default

    speech, _ = torchaudio.load(SAMPLE_SPEECH)
    augmented = F.fftconvolve(speech, rir)

.. GENERATED FROM PYTHON SOURCE LINES 186-189

Original
~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 189-194

.. code-block:: default

    plot_waveform(speech, sample_rate, title="Original")
    plot_specgram(speech, sample_rate, title="Original")
    Audio(speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_008.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_008.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_009.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_009.png
         :class: sphx-glr-multi-img
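As a side note, normalizing the RIR by its L2 norm (as done above with
``torch.linalg.vector_norm``) gives the impulse unit energy, so the
convolution does not change the overall loudness. A minimal
standard-library sketch of that step (illustrative only, not the tutorial
code; ``l2_normalize`` is a made-up helper name):

```python
import math


def l2_normalize(xs):
    # Scale the sequence so that the sum of squares (its energy) is 1.
    norm = math.sqrt(sum(x * x for x in xs))
    return [x / norm for x in xs]


rir_taps = l2_normalize([0.5, 0.3, 0.1])
print(sum(x * x for x in rir_taps))  # ~1.0
```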
.. GENERATED FROM PYTHON SOURCE LINES 195-198

RIR applied
~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 198-204

.. code-block:: default

    plot_waveform(augmented, sample_rate, title="RIR Applied")
    plot_specgram(augmented, sample_rate, title="RIR Applied")
    Audio(augmented, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_010.png
         :alt: RIR Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_010.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_011.png
         :alt: RIR Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_011.png
         :class: sphx-glr-multi-img
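What :py:func:`torchaudio.functional.fftconvolve` computes here is an
ordinary full convolution, just evaluated efficiently via the FFT. A tiny
pure-Python sketch (illustrative only, not the tutorial code) shows why
each RIR tap adds a delayed, scaled copy of the signal, and why the
result has ``len(signal) + len(rir) - 1`` frames:

```python
def convolve(signal, kernel):
    # Direct (full) convolution: every kernel tap adds a delayed,
    # scaled copy of the signal -- exactly what an RIR's echoes do.
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


speech_toy = [1.0, 0.5, 0.25]
rir_toy = [1.0, 0.0, 0.6]  # direct path plus one echo two samples later
print(convolve(speech_toy, rir_toy))  # [1.0, 0.5, 0.85, 0.3, 0.15]
```

The second half of the output is the echo of the toy "speech", which is
why the convolved waveform above is longer than the original.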
.. GENERATED FROM PYTHON SOURCE LINES 205-221

Adding background noise
-----------------------

To introduce background noise to audio data, we can add a noise Tensor to
the Tensor representing the audio data according to some desired
signal-to-noise ratio (SNR)
[`wikipedia <https://en.wikipedia.org/wiki/Signal-to-noise_ratio>`__],
which determines the intensity of the audio data relative to that of the
noise in the output.

.. math::

   \mathrm{SNR} = \frac{P_{signal}}{P_{noise}}

.. math::

   \mathrm{SNR_{dB}} = 10 \log_{10} \mathrm{SNR}

To add noise to audio data at specific SNRs, we use
:py:func:`torchaudio.functional.add_noise`.

.. GENERATED FROM PYTHON SOURCE LINES 221-230

.. code-block:: default

    speech, _ = torchaudio.load(SAMPLE_SPEECH)
    noise, _ = torchaudio.load(SAMPLE_NOISE)
    noise = noise[:, : speech.shape[1]]

    snr_dbs = torch.tensor([20, 10, 3])
    noisy_speeches = F.add_noise(speech, noise, snr_dbs)

.. GENERATED FROM PYTHON SOURCE LINES 231-234

Background noise
~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 234-239

.. code-block:: default

    plot_waveform(noise, sample_rate, title="Background noise")
    plot_specgram(noise, sample_rate, title="Background noise")
    Audio(noise, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_012.png
         :alt: Background noise
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_012.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_013.png
         :alt: Background noise
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_013.png
         :class: sphx-glr-multi-img
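To make the SNR formulas concrete, here is a small standard-library sketch
(not part of the tutorial code; the helper names are made up for
illustration) that computes the dB value from the two powers, and the gain
that must be applied to the noise waveform to hit a target SNR, which is
conceptually what SNR-based mixing does:

```python
import math


def snr_db(p_signal, p_noise):
    # SNR in decibels from the average powers of signal and noise.
    return 10 * math.log10(p_signal / p_noise)


def noise_scale(p_signal, p_noise, target_snr_db):
    # Amplitude factor for the noise waveform so that the mix has the
    # requested SNR; power scales with the square of the amplitude,
    # hence the square root.
    target_ratio = 10 ** (target_snr_db / 10)
    return math.sqrt(p_signal / (p_noise * target_ratio))


print(snr_db(1.0, 0.1))  # 10.0
scale = noise_scale(1.0, 0.1, 20)
print(snr_db(1.0, 0.1 * scale**2))  # 20.0
```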
.. GENERATED FROM PYTHON SOURCE LINES 240-243

SNR 20 dB
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 243-249

.. code-block:: default

    snr_db, noisy_speech = snr_dbs[0], noisy_speeches[0:1]
    plot_waveform(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    plot_specgram(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    Audio(noisy_speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_014.png
         :alt: SNR: 20 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_014.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_015.png
         :alt: SNR: 20 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_015.png
         :class: sphx-glr-multi-img
.. GENERATED FROM PYTHON SOURCE LINES 250-253

SNR 10 dB
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 253-259

.. code-block:: default

    snr_db, noisy_speech = snr_dbs[1], noisy_speeches[1:2]
    plot_waveform(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    plot_specgram(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    Audio(noisy_speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_016.png
         :alt: SNR: 10 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_016.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_017.png
         :alt: SNR: 10 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_017.png
         :class: sphx-glr-multi-img
.. GENERATED FROM PYTHON SOURCE LINES 260-263

SNR 3 dB
~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 263-270

.. code-block:: default

    snr_db, noisy_speech = snr_dbs[2], noisy_speeches[2:3]
    plot_waveform(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    plot_specgram(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    Audio(noisy_speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_018.png
         :alt: SNR: 3 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_018.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_019.png
         :alt: SNR: 3 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_019.png
         :class: sphx-glr-multi-img
.. GENERATED FROM PYTHON SOURCE LINES 271-277

Applying codec to Tensor object
-------------------------------

:py:class:`torchaudio.io.AudioEffector` can also apply codecs to a Tensor
object.

.. GENERATED FROM PYTHON SOURCE LINES 277-286

.. code-block:: default

    waveform, sample_rate = torchaudio.load(SAMPLE_SPEECH, channels_first=False)


    def apply_codec(waveform, sample_rate, format, encoder=None):
        effector = torchaudio.io.AudioEffector(format=format, encoder=encoder)
        return effector.apply(waveform, sample_rate)

.. GENERATED FROM PYTHON SOURCE LINES 287-290

Original
~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 290-295

.. code-block:: default

    plot_waveform(waveform.T, sample_rate, title="Original")
    plot_specgram(waveform.T, sample_rate, title="Original")
    Audio(waveform.T, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_020.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_020.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_021.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_021.png
         :class: sphx-glr-multi-img
.. GENERATED FROM PYTHON SOURCE LINES 296-299

8 bit mu-law
~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 299-305

.. code-block:: default

    mulaw = apply_codec(waveform, sample_rate, "wav", encoder="pcm_mulaw")
    plot_waveform(mulaw.T, sample_rate, title="8 bit mu-law")
    plot_specgram(mulaw.T, sample_rate, title="8 bit mu-law")
    Audio(mulaw.T, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_022.png
         :alt: 8 bit mu-law
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_022.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_023.png
         :alt: 8 bit mu-law
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_023.png
         :class: sphx-glr-multi-img
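As background, mu-law is a logarithmic companding scheme (:math:`\mu = 255`
for 8-bit telephony): small amplitudes keep more resolution than large
ones, which is why the codec sounds acceptable despite using only 8 bits
per sample. A minimal sketch of the continuous encode/decode formulas
(illustrative only; a real codec additionally quantizes the encoded value
to 8 bits):

```python
import math

MU = 255  # mu-law parameter used by 8-bit telephony codecs


def mu_law_encode(x):
    # Compress a sample in [-1, 1] logarithmically.
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_decode(y):
    # Inverse expansion back to linear amplitude.
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


x = 0.1
y = mu_law_encode(x)
print(round(mu_law_decode(y), 6))  # 0.1 -- round-trips (before quantization)
```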
.. GENERATED FROM PYTHON SOURCE LINES 306-309

G.722
~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 309-315

.. code-block:: default

    g722 = apply_codec(waveform, sample_rate, "g722")
    plot_waveform(g722.T, sample_rate, title="G.722")
    plot_specgram(g722.T, sample_rate, title="G.722")
    Audio(g722.T, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_024.png
         :alt: G.722
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_024.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_025.png
         :alt: G.722
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_025.png
         :class: sphx-glr-multi-img
.. GENERATED FROM PYTHON SOURCE LINES 316-319

Vorbis
~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 319-325

.. code-block:: default

    vorbis = apply_codec(waveform, sample_rate, "ogg", encoder="vorbis")
    plot_waveform(vorbis.T, sample_rate, title="Vorbis")
    plot_specgram(vorbis.T, sample_rate, title="Vorbis")
    Audio(vorbis.T, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_026.png
         :alt: Vorbis
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_026.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_027.png
         :alt: Vorbis
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_027.png
         :class: sphx-glr-multi-img
.. GENERATED FROM PYTHON SOURCE LINES 326-333

Simulating a phone recording
----------------------------

Combining the previous techniques, we can simulate audio that sounds like
a person talking over a phone in an echoey room with people talking in the
background.

.. GENERATED FROM PYTHON SOURCE LINES 333-374

.. code-block:: default

    original_speech, sample_rate = torchaudio.load(SAMPLE_SPEECH)

    plot_specgram(original_speech, sample_rate, title="Original")

    # Apply RIR
    rir_applied = F.fftconvolve(original_speech, rir)

    plot_specgram(rir_applied, sample_rate, title="RIR Applied")

    # Add background noise
    # Because the noise is recorded in the actual environment, we consider that
    # the noise contains the acoustic features of the environment. Therefore, we add
    # the noise after RIR application.
    noise, _ = torchaudio.load(SAMPLE_NOISE)
    noise = noise[:, : rir_applied.shape[1]]

    snr_db = torch.tensor([8])
    bg_added = F.add_noise(rir_applied, noise, snr_db)

    plot_specgram(bg_added, sample_rate, title="BG noise added")

    # Apply filtering and change sample rate
    effect = ",".join(
        [
            "lowpass=frequency=4000:poles=1",
            "compand=attacks=0.02:decays=0.05:points=-60/-60|-30/-10|-20/-8|-5/-8|-2/-8:gain=-8:volume=-7:delay=0.05",
        ]
    )

    filtered = apply_effect(bg_added.T, sample_rate, effect)
    sample_rate2 = 8000

    plot_specgram(filtered.T, sample_rate2, title="Filtered")

    # Apply telephony codec
    codec_applied = apply_codec(filtered, sample_rate2, "g722")
    plot_specgram(codec_applied.T, sample_rate2, title="G.722 Codec Applied")

.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_028.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_028.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_029.png
         :alt: RIR Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_029.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_030.png
         :alt: BG noise added
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_030.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_031.png
         :alt: Filtered
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_031.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_032.png
         :alt: G.722 Codec Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_032.png
         :class: sphx-glr-multi-img

.. GENERATED FROM PYTHON SOURCE LINES 375-378

Original speech
~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 378-381

.. code-block:: default

    Audio(original_speech, rate=sample_rate)
.. GENERATED FROM PYTHON SOURCE LINES 382-385

RIR applied
~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 385-388

.. code-block:: default

    Audio(rir_applied, rate=sample_rate)
.. GENERATED FROM PYTHON SOURCE LINES 389-392

Background noise added
~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 392-395

.. code-block:: default

    Audio(bg_added, rate=sample_rate)
.. GENERATED FROM PYTHON SOURCE LINES 396-399

Filtered
~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 399-402

.. code-block:: default

    Audio(filtered.T, rate=sample_rate2)
.. GENERATED FROM PYTHON SOURCE LINES 403-406

Codec applied
~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 406-408

.. code-block:: default

    Audio(codec_applied.T, rate=sample_rate2)
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 14.577 seconds)


.. _sphx_glr_download_tutorials_audio_data_augmentation_tutorial.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: audio_data_augmentation_tutorial.py <audio_data_augmentation_tutorial.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: audio_data_augmentation_tutorial.ipynb <audio_data_augmentation_tutorial.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_