.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/audio_data_augmentation_tutorial.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download_tutorials_audio_data_augmentation_tutorial.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_audio_data_augmentation_tutorial.py:

Audio Data Augmentation
=======================

**Author**: Moto Hira

``torchaudio`` provides a variety of ways to augment audio data.
In this tutorial, we look into a way to apply effects, filters,
RIR (room impulse response) and codecs.
At the end, we synthesize noisy speech over phone from clean speech.

.. GENERATED FROM PYTHON SOURCE LINES 15-23

.. code-block:: default

    import torch
    import torchaudio
    import torchaudio.functional as F

    print(torch.__version__)
    print(torchaudio.__version__)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    1.13.0
    0.13.0

.. GENERATED FROM PYTHON SOURCE LINES 24-29

Preparation
-----------

First, we import the modules and download the audio assets we use in this
tutorial.

.. GENERATED FROM PYTHON SOURCE LINES 29-43

.. code-block:: default

    import math

    from IPython.display import Audio
    import matplotlib.pyplot as plt

    from torchaudio.utils import download_asset

    SAMPLE_WAV = download_asset("tutorial-assets/steam-train-whistle-daniel_simon.wav")
    SAMPLE_RIR = download_asset("tutorial-assets/Lab41-SRI-VOiCES-rm1-impulse-mc01-stu-clo-8000hz.wav")
    SAMPLE_SPEECH = download_asset("tutorial-assets/Lab41-SRI-VOiCES-src-sp0307-ch127535-sg0042-8000hz.wav")
    SAMPLE_NOISE = download_asset("tutorial-assets/Lab41-SRI-VOiCES-rm1-babb-mc01-stu-clo-8000hz.wav")

Applying effects and filtering
------------------------------

:py:mod:`torchaudio.sox_effects` allows for directly applying filters similar
to those available in ``sox`` to Tensor objects and file-object audio sources.

There are two functions for this:

-  :py:func:`torchaudio.sox_effects.apply_effects_tensor` for applying effects
   to a Tensor.
-  :py:func:`torchaudio.sox_effects.apply_effects_file` for applying effects to
   other audio sources.

Both functions accept effect definitions in the form ``List[List[str]]``.
This is mostly consistent with how the ``sox`` command works, but one caveat
is that ``sox`` adds some effects automatically, whereas ``torchaudio``'s
implementation does not. The list of available effects is given in the sox
documentation.

**Tip** If you need to load and resample your audio data on the fly, then you
can use :py:func:`torchaudio.sox_effects.apply_effects_file` with effect
``"rate"``.
**Note** :py:func:`torchaudio.sox_effects.apply_effects_file` accepts a
file-like object or path-like object. Similar to :py:func:`torchaudio.load`,
when the audio format cannot be inferred from either the file extension or
header, you can provide the ``format`` argument to specify the format of the
audio source.

**Note** This process is not differentiable.

.. GENERATED FROM PYTHON SOURCE LINES 78-98

.. code-block:: default

    # Load the data
    waveform1, sample_rate1 = torchaudio.load(SAMPLE_WAV)

    # Define effects
    effects = [
        ["lowpass", "-1", "300"],  # apply single-pole lowpass filter
        ["speed", "0.8"],  # reduce the speed
        # This only changes the sample rate, so it is necessary to
        # add the `rate` effect with the original sample rate afterwards.
        ["rate", f"{sample_rate1}"],
        ["reverb", "-w"],  # Reverberation gives some dramatic feeling
    ]

    # Apply effects
    waveform2, sample_rate2 = torchaudio.sox_effects.apply_effects_tensor(waveform1, sample_rate1, effects)

    print(waveform1.shape, sample_rate1)
    print(waveform2.shape, sample_rate2)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    torch.Size([2, 109368]) 44100
    torch.Size([2, 136710]) 44100

.. GENERATED FROM PYTHON SOURCE LINES 99-103

Note that the number of frames changes after the effects are applied (the
``speed`` effect stretches the waveform), while the number of channels and the
sample rate stay the same. Let’s listen to the audio.

.. GENERATED FROM PYTHON SOURCE LINES 103-123

.. code-block:: default

    def plot_waveform(waveform, sample_rate, title="Waveform", xlim=None):
        waveform = waveform.numpy()

        num_channels, num_frames = waveform.shape
        time_axis = torch.arange(0, num_frames) / sample_rate

        figure, axes = plt.subplots(num_channels, 1)
        if num_channels == 1:
            axes = [axes]
        for c in range(num_channels):
            axes[c].plot(time_axis, waveform[c], linewidth=1)
            axes[c].grid(True)
            if num_channels > 1:
                axes[c].set_ylabel(f"Channel {c+1}")
            if xlim:
                axes[c].set_xlim(xlim)
        figure.suptitle(title)
        plt.show(block=False)

.. GENERATED FROM PYTHON SOURCE LINES 125-143

.. code-block:: default

    def plot_specgram(waveform, sample_rate, title="Spectrogram", xlim=None):
        waveform = waveform.numpy()

        num_channels, _ = waveform.shape

        figure, axes = plt.subplots(num_channels, 1)
        if num_channels == 1:
            axes = [axes]
        for c in range(num_channels):
            axes[c].specgram(waveform[c], Fs=sample_rate)
            if num_channels > 1:
                axes[c].set_ylabel(f"Channel {c+1}")
            if xlim:
                axes[c].set_xlim(xlim)
        figure.suptitle(title)
        plt.show(block=False)

.. GENERATED FROM PYTHON SOURCE LINES 144-147

Original:
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 147-152

.. code-block:: default

    plot_waveform(waveform1, sample_rate1, title="Original", xlim=(-0.1, 3.2))
    plot_specgram(waveform1, sample_rate1, title="Original", xlim=(0, 3.04))
    Audio(waveform1, rate=sample_rate1)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_001.png
          :alt: Original
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_001.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_002.png
          :alt: Original
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_002.png
          :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 153-156

Effects applied:
~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 156-161

.. code-block:: default

    plot_waveform(waveform2, sample_rate2, title="Effects Applied", xlim=(-0.1, 3.2))
    plot_specgram(waveform2, sample_rate2, title="Effects Applied", xlim=(0, 3.04))
    Audio(waveform2, rate=sample_rate2)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_003.png
          :alt: Effects Applied
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_003.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_004.png
          :alt: Effects Applied
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_004.png
          :class: sphx-glr-multi-img

.. raw:: html


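As an aside on the frame counts printed earlier: slowing the clip to 0.8x with
the ``speed`` effect lengthens it by a factor of 1/0.8, since ``speed`` relabels
the sample rate and the following ``rate`` effect resamples back to the
original rate. A back-of-envelope sketch of that arithmetic (a hypothetical
helper, not part of ``torchaudio`` or ``sox``):

```python
def frames_after_speed(num_frames: int, speed: float) -> int:
    # `speed` scales the nominal sample rate by `speed`; resampling back
    # to the original rate then multiplies the frame count by 1 / speed.
    # This is a rough model, not sox's exact resampler.
    return round(num_frames / speed)
```

For the sample above, ``frames_after_speed(109368, 0.8)`` gives ``136710``,
matching the printed shape of ``waveform2``.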
.. GENERATED FROM PYTHON SOURCE LINES 162-164

Doesn’t it sound more dramatic?

.. GENERATED FROM PYTHON SOURCE LINES 166-181

Simulating room reverberation
-----------------------------

Convolution reverb is a technique used to make clean audio sound as though it
has been produced in a different environment.

Using a Room Impulse Response (RIR), for instance, we can make clean speech
sound as though it has been uttered in a conference room.

For this process, we need RIR data. The following data are from the VOiCES
dataset, but you can record your own: just turn on your microphone and clap
your hands.

.. GENERATED FROM PYTHON SOURCE LINES 181-187

.. code-block:: default

    rir_raw, sample_rate = torchaudio.load(SAMPLE_RIR)
    plot_waveform(rir_raw, sample_rate, title="Room Impulse Response (raw)")
    plot_specgram(rir_raw, sample_rate, title="Room Impulse Response (raw)")
    Audio(rir_raw, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_005.png
          :alt: Room Impulse Response (raw)
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_005.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_006.png
          :alt: Room Impulse Response (raw)
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_006.png
          :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 188-191

First, we need to clean up the RIR. We extract the main impulse and normalize
the signal power. We then flip the RIR along the time axis, because
:py:func:`torch.nn.functional.conv1d` computes cross-correlation; sliding the
time-reversed RIR over the signal makes the operation a true convolution.

.. GENERATED FROM PYTHON SOURCE LINES 191-198

.. code-block:: default

    rir = rir_raw[:, int(sample_rate * 1.01) : int(sample_rate * 1.3)]
    rir = rir / torch.norm(rir, p=2)
    RIR = torch.flip(rir, [1])

    plot_waveform(rir, sample_rate, title="Room Impulse Response")

.. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_007.png
   :alt: Room Impulse Response
   :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_007.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 199-201

Then, we convolve the speech signal with the RIR filter.

.. GENERATED FROM PYTHON SOURCE LINES 201-207

.. code-block:: default

    speech, _ = torchaudio.load(SAMPLE_SPEECH)

    # Left-pad so the output has the same number of frames as the input.
    speech_ = torch.nn.functional.pad(speech, (RIR.shape[1] - 1, 0))
    augmented = torch.nn.functional.conv1d(speech_[None, ...], RIR[None, ...])[0]

.. GENERATED FROM PYTHON SOURCE LINES 208-211

Original:
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 211-216

.. code-block:: default

    plot_waveform(speech, sample_rate, title="Original")
    plot_specgram(speech, sample_rate, title="Original")
    Audio(speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_008.png
          :alt: Original
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_008.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_009.png
          :alt: Original
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_009.png
          :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 217-220

RIR applied:
~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 220-226

.. code-block:: default

    plot_waveform(augmented, sample_rate, title="RIR Applied")
    plot_specgram(augmented, sample_rate, title="RIR Applied")
    Audio(augmented, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_010.png
          :alt: RIR Applied
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_010.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_011.png
          :alt: RIR Applied
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_011.png
          :class: sphx-glr-multi-img

.. raw:: html


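The pad-and-convolve recipe above can be mimicked in plain Python to see why
the RIR is flipped: :py:func:`torch.nn.functional.conv1d` computes
cross-correlation, so sliding the time-reversed RIR over a left-padded signal
reproduces the first ``len(signal)`` samples of a textbook convolution. A
minimal sketch with lists instead of Tensors (illustrative only):

```python
def convolve_full(signal, ir):
    # Textbook full convolution: out[n] = sum_k signal[k] * ir[n - k].
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out


def pad_and_correlate_flipped(signal, ir):
    # Mimics the tutorial: left-pad by len(ir) - 1, then slide the
    # time-reversed IR over the signal (cross-correlation with a
    # flipped kernel), which equals true convolution.
    flipped = ir[::-1]
    padded = [0.0] * (len(ir) - 1) + list(signal)
    return [
        sum(p * h for p, h in zip(padded[n : n + len(ir)], flipped))
        for n in range(len(signal))
    ]
```

The result equals the first ``len(signal)`` samples of the full convolution,
which is also why the padded ``conv1d`` output has the same number of frames
as the input speech.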
.. GENERATED FROM PYTHON SOURCE LINES 227-239

Adding background noise
-----------------------

To add background noise to audio data, you can simply add a noise Tensor to
the Tensor representing the audio data. A common method to adjust the
intensity of noise is to change the signal-to-noise ratio (SNR):

.. math::

   \mathrm{SNR} = \frac{P_\mathrm{signal}}{P_\mathrm{noise}}

.. math::

   \mathrm{SNR_{dB}} = 10 \log_{10} \mathrm{SNR}

.. GENERATED FROM PYTHON SOURCE LINES 239-254

.. code-block:: default

    speech, _ = torchaudio.load(SAMPLE_SPEECH)
    noise, _ = torchaudio.load(SAMPLE_NOISE)
    noise = noise[:, : speech.shape[1]]

    speech_rms = speech.norm(p=2)
    noise_rms = noise.norm(p=2)

    snr_dbs = [20, 10, 3]
    noisy_speeches = []
    for snr_db in snr_dbs:
        snr = 10 ** (snr_db / 20)  # amplitude ratio corresponding to snr_db
        scale = snr * noise_rms / speech_rms
        noisy_speeches.append((scale * speech + noise) / 2)

.. GENERATED FROM PYTHON SOURCE LINES 255-258

Background noise:
~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 258-263

.. code-block:: default

    plot_waveform(noise, sample_rate, title="Background noise")
    plot_specgram(noise, sample_rate, title="Background noise")
    Audio(noise, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_012.png
          :alt: Background noise
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_012.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_013.png
          :alt: Background noise
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_013.png
          :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 264-267

SNR 20 dB:
~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 267-273

.. code-block:: default

    snr_db, noisy_speech = snr_dbs[0], noisy_speeches[0]
    plot_waveform(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    plot_specgram(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    Audio(noisy_speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_014.png
          :alt: SNR: 20 [dB]
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_014.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_015.png
          :alt: SNR: 20 [dB]
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_015.png
          :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 274-277

SNR 10 dB:
~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 277-283

.. code-block:: default

    snr_db, noisy_speech = snr_dbs[1], noisy_speeches[1]
    plot_waveform(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    plot_specgram(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    Audio(noisy_speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_016.png
          :alt: SNR: 10 [dB]
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_016.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_017.png
          :alt: SNR: 10 [dB]
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_017.png
          :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 284-287

SNR 3 dB:
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 287-294

.. code-block:: default

    snr_db, noisy_speech = snr_dbs[2], noisy_speeches[2]
    plot_waveform(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    plot_specgram(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    Audio(noisy_speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_018.png
          :alt: SNR: 3 [dB]
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_018.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_019.png
          :alt: SNR: 3 [dB]
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_019.png
          :class: sphx-glr-multi-img

.. raw:: html


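The scaling used in the SNR mixing above can be sanity-checked in plain
Python: scaling the speech by ``10 ** (snr_db / 20) * noise_norm /
speech_norm`` makes the power ratio of the two signals come out at exactly
``snr_db`` (the 2-norm ratio equals the RMS ratio when both signals have the
same length). A small sketch with lists (hypothetical helpers, not torchaudio
API):

```python
import math


def snr_scale(speech, noise, snr_db):
    # Scale factor for the speech so that the pair sits at the requested
    # SNR, mirroring the tutorial's `scale = snr * noise_rms / speech_rms`.
    speech_norm = math.sqrt(sum(s * s for s in speech))
    noise_norm = math.sqrt(sum(n * n for n in noise))
    return 10 ** (snr_db / 20) * noise_norm / speech_norm


def snr_db_of(signal, noise):
    # Achieved SNR in dB: 10 * log10(P_signal / P_noise).
    p_signal = sum(s * s for s in signal)
    p_noise = sum(n * n for n in noise)
    return 10 * math.log10(p_signal / p_noise)
```

Scaling the speech by ``snr_scale(speech, noise, 3)`` and measuring
``snr_db_of`` on the result recovers 3 dB; the final ``/ 2`` in the mixing
line rescales both components equally, so it does not change the SNR.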
.. GENERATED FROM PYTHON SOURCE LINES 295-303

Applying codec to Tensor object
-------------------------------

:py:func:`torchaudio.functional.apply_codec` can apply codecs to a Tensor
object.

**Note** This process is not differentiable.

.. GENERATED FROM PYTHON SOURCE LINES 303-317

.. code-block:: default

    waveform, sample_rate = torchaudio.load(SAMPLE_SPEECH)

    configs = [
        {"format": "wav", "encoding": "ULAW", "bits_per_sample": 8},
        {"format": "gsm"},
        {"format": "vorbis", "compression": -1},
    ]
    waveforms = []
    for param in configs:
        augmented = F.apply_codec(waveform, sample_rate, **param)
        waveforms.append(augmented)

.. GENERATED FROM PYTHON SOURCE LINES 318-321

Original:
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 321-326

.. code-block:: default

    plot_waveform(waveform, sample_rate, title="Original")
    plot_specgram(waveform, sample_rate, title="Original")
    Audio(waveform, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_020.png
          :alt: Original
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_020.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_021.png
          :alt: Original
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_021.png
          :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 327-330

8 bit mu-law:
~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 330-335

.. code-block:: default

    plot_waveform(waveforms[0], sample_rate, title="8 bit mu-law")
    plot_specgram(waveforms[0], sample_rate, title="8 bit mu-law")
    Audio(waveforms[0], rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_022.png
          :alt: 8 bit mu-law
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_022.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_023.png
          :alt: 8 bit mu-law
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_023.png
          :class: sphx-glr-multi-img

.. raw:: html


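The 8-bit mu-law codec applied above is simple enough to sketch from its
definition: compress the amplitude with ``sign(x) * ln(1 + mu * |x|) / ln(1 +
mu)`` (mu = 255), quantize to 256 levels, and expand with the inverse. This is
a sketch of the continuous companding law plus uniform quantization, not
torchaudio's implementation or G.711's exact segmented bit layout:

```python
import math

MU = 255.0  # mu-law parameter used for 8-bit telephony


def mulaw_encode(x):
    # Compress amplitude in [-1, 1] with mu-law, then quantize the
    # companded value to one of 256 levels (8 bits).
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1) / 2 * 255))


def mulaw_decode(q):
    # Map the 8-bit level back to [-1, 1], then expand (inverse mu-law).
    y = q / 255 * 2 - 1
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
```

The round trip is lossy but the error stays small near zero, which is the
point of companding: quantization steps are finer for quiet samples at the
expense of loud ones.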
.. GENERATED FROM PYTHON SOURCE LINES 336-339

GSM-FR:
~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 339-344

.. code-block:: default

    plot_waveform(waveforms[1], sample_rate, title="GSM-FR")
    plot_specgram(waveforms[1], sample_rate, title="GSM-FR")
    Audio(waveforms[1], rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_024.png
          :alt: GSM-FR
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_024.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_025.png
          :alt: GSM-FR
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_025.png
          :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 345-348

Vorbis:
~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 348-353

.. code-block:: default

    plot_waveform(waveforms[2], sample_rate, title="Vorbis")
    plot_specgram(waveforms[2], sample_rate, title="Vorbis")
    Audio(waveforms[2], rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_026.png
          :alt: Vorbis
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_026.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_027.png
          :alt: Vorbis
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_027.png
          :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 354-361

Simulating a phone recording
----------------------------

Combining the previous techniques, we can simulate audio that sounds like a
person talking over a phone in an echoey room with people talking in the
background.

.. GENERATED FROM PYTHON SOURCE LINES 361-412

.. code-block:: default

    original_speech, sample_rate = torchaudio.load(SAMPLE_SPEECH)

    plot_specgram(original_speech, sample_rate, title="Original")

    # Apply RIR
    speech_ = torch.nn.functional.pad(original_speech, (RIR.shape[1] - 1, 0))
    rir_applied = torch.nn.functional.conv1d(speech_[None, ...], RIR[None, ...])[0]

    plot_specgram(rir_applied, sample_rate, title="RIR Applied")

    # Add background noise
    # Because the noise is recorded in the actual environment, we assume that
    # it already contains the acoustic features of the environment. Therefore,
    # we add the noise after the RIR application.
    noise, _ = torchaudio.load(SAMPLE_NOISE)
    noise = noise[:, : rir_applied.shape[1]]

    snr_db = 8
    scale = (10 ** (snr_db / 20)) * noise.norm(p=2) / rir_applied.norm(p=2)
    bg_added = (scale * rir_applied + noise) / 2

    plot_specgram(bg_added, sample_rate, title="BG noise added")

    # Apply filtering and change sample rate
    filtered, sample_rate2 = torchaudio.sox_effects.apply_effects_tensor(
        bg_added,
        sample_rate,
        effects=[
            ["lowpass", "4000"],
            [
                "compand",
                "0.02,0.05",
                "-60,-60,-30,-10,-20,-8,-5,-8,-2,-8",
                "-8",
                "-7",
                "0.05",
            ],
            ["rate", "8000"],
        ],
    )

    plot_specgram(filtered, sample_rate2, title="Filtered")

    # Apply telephony codec
    codec_applied = F.apply_codec(filtered, sample_rate2, format="gsm")
    plot_specgram(codec_applied, sample_rate2, title="GSM Codec Applied")

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_028.png
          :alt: Original
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_028.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_029.png
          :alt: RIR Applied
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_029.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_030.png
          :alt: BG noise added
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_030.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_031.png
          :alt: Filtered
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_031.png
          :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_032.png
          :alt: GSM Codec Applied
          :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_032.png
          :class: sphx-glr-multi-img

.. GENERATED FROM PYTHON SOURCE LINES 413-416

Original speech:
~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 416-419

.. code-block:: default

    Audio(original_speech, rate=sample_rate)

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 420-423

RIR applied:
~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 423-426

.. code-block:: default

    Audio(rir_applied, rate=sample_rate)

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 427-430

Background noise added:
~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 430-433

.. code-block:: default

    Audio(bg_added, rate=sample_rate)

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 434-437

Filtered:
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 437-440

.. code-block:: default

    Audio(filtered, rate=sample_rate2)

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 441-444

Codec applied:
~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 444-446

.. code-block:: default

    Audio(codec_applied, rate=sample_rate2)

.. raw:: html


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 13.907 seconds)

.. _sphx_glr_download_tutorials_audio_data_augmentation_tutorial.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: audio_data_augmentation_tutorial.py <audio_data_augmentation_tutorial.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: audio_data_augmentation_tutorial.ipynb <audio_data_augmentation_tutorial.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_