.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "tutorials/audio_data_augmentation_tutorial.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_tutorials_audio_data_augmentation_tutorial.py:


Audio Data Augmentation
=======================

``torchaudio`` provides a variety of ways to augment audio data.

In this tutorial, we look into ways to apply effects, filters,
RIR (room impulse response), and codecs.
At the end, we synthesize noisy speech over the phone from clean speech.

.. GENERATED FROM PYTHON SOURCE LINES 13-21

.. code-block:: default

    import torch
    import torchaudio
    import torchaudio.functional as F

    print(torch.__version__)
    print(torchaudio.__version__)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    1.12.0
    0.12.0

.. GENERATED FROM PYTHON SOURCE LINES 22-27

Preparation
-----------

First, we import the modules and download the audio assets we use
in this tutorial.

.. GENERATED FROM PYTHON SOURCE LINES 27-41

.. code-block:: default

    import math

    from IPython.display import Audio
    import matplotlib.pyplot as plt

    from torchaudio.utils import download_asset

    SAMPLE_WAV = download_asset("tutorial-assets/steam-train-whistle-daniel_simon.wav")
    SAMPLE_RIR = download_asset("tutorial-assets/Lab41-SRI-VOiCES-rm1-impulse-mc01-stu-clo-8000hz.wav")
    SAMPLE_SPEECH = download_asset("tutorial-assets/Lab41-SRI-VOiCES-src-sp0307-ch127535-sg0042-8000hz.wav")
    SAMPLE_NOISE = download_asset("tutorial-assets/Lab41-SRI-VOiCES-rm1-babb-mc01-stu-clo-8000hz.wav")

Applying effects and filtering
------------------------------

:py:func:`torchaudio.sox_effects.apply_effects_tensor` applies ``sox``-style
effects directly to a Tensor, and
:py:func:`torchaudio.sox_effects.apply_effects_file` applies them to other
audio sources. Both functions accept effect definitions in the form of
``List[List[str]]``. This is mostly consistent with how the ``sox`` command
works, but one caveat is that ``sox`` adds some effects automatically,
whereas ``torchaudio``'s implementation does not. For the list of available
effects, please refer to `the sox documentation `__.

**Tip** If you need to load and resample your audio data on the fly,
then you can use :py:func:`torchaudio.sox_effects.apply_effects_file`
with effect ``"rate"``.
**Note** :py:func:`torchaudio.sox_effects.apply_effects_file` accepts a
file-like object or path-like object. Similar to :py:func:`torchaudio.load`,
when the audio format cannot be inferred from either the file extension or
header, you can provide argument ``format`` to specify the format of the
audio source.

**Note** This process is not differentiable.

.. GENERATED FROM PYTHON SOURCE LINES 76-96

.. code-block:: default

    # Load the data
    waveform1, sample_rate1 = torchaudio.load(SAMPLE_WAV)

    # Define effects
    effects = [
        ["lowpass", "-1", "300"],  # apply single-pole lowpass filter
        ["speed", "0.8"],  # reduce the speed
        # This only changes the sample rate, so it is necessary to
        # add the `rate` effect with the original sample rate after this.
        ["rate", f"{sample_rate1}"],
        ["reverb", "-w"],  # Reverberation gives some dramatic feeling
    ]

    # Apply effects
    waveform2, sample_rate2 = torchaudio.sox_effects.apply_effects_tensor(waveform1, sample_rate1, effects)

    print(waveform1.shape, sample_rate1)
    print(waveform2.shape, sample_rate2)

.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    torch.Size([2, 109368]) 44100
    torch.Size([2, 136710]) 44100

.. GENERATED FROM PYTHON SOURCE LINES 97-101

Note that the number of frames changes after the effects are applied,
while the number of channels stays the same.
Let’s listen to the audio.

.. GENERATED FROM PYTHON SOURCE LINES 101-121

.. code-block:: default

    def plot_waveform(waveform, sample_rate, title="Waveform", xlim=None):
        waveform = waveform.numpy()

        num_channels, num_frames = waveform.shape
        time_axis = torch.arange(0, num_frames) / sample_rate

        figure, axes = plt.subplots(num_channels, 1)
        if num_channels == 1:
            axes = [axes]
        for c in range(num_channels):
            axes[c].plot(time_axis, waveform[c], linewidth=1)
            axes[c].grid(True)
            if num_channels > 1:
                axes[c].set_ylabel(f"Channel {c+1}")
            if xlim:
                axes[c].set_xlim(xlim)
        figure.suptitle(title)
        plt.show(block=False)

.. GENERATED FROM PYTHON SOURCE LINES 123-141

.. code-block:: default

    def plot_specgram(waveform, sample_rate, title="Spectrogram", xlim=None):
        waveform = waveform.numpy()

        num_channels, _ = waveform.shape

        figure, axes = plt.subplots(num_channels, 1)
        if num_channels == 1:
            axes = [axes]
        for c in range(num_channels):
            axes[c].specgram(waveform[c], Fs=sample_rate)
            if num_channels > 1:
                axes[c].set_ylabel(f"Channel {c+1}")
            if xlim:
                axes[c].set_xlim(xlim)
        figure.suptitle(title)
        plt.show(block=False)

.. GENERATED FROM PYTHON SOURCE LINES 142-145

Original:
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 145-150

.. code-block:: default

    plot_waveform(waveform1, sample_rate1, title="Original", xlim=(-0.1, 3.2))
    plot_specgram(waveform1, sample_rate1, title="Original", xlim=(0, 3.04))
    Audio(waveform1, rate=sample_rate1)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_001.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_001.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_002.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_002.png
         :class: sphx-glr-multi-img

.. raw:: html


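The frame counts printed above can be sanity-checked with a little
arithmetic: ``speed 0.8`` relabels the sample rate, and resampling back to
the original rate with ``rate`` stretches the signal by ``1 / 0.8``. A
plain-Python sketch (the helper name is ours, not a torchaudio API; real
resamplers can differ by a frame due to rounding):

```python
def frames_after_speed(num_frames, speed_factor):
    # "speed" relabels the sample rate by `speed_factor`; resampling back
    # to the original rate stretches the length by 1 / speed_factor.
    return round(num_frames / speed_factor)


print(frames_after_speed(109368, 0.8))  # 136710, matching the output above
```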
.. GENERATED FROM PYTHON SOURCE LINES 151-154

Effects applied:
~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 154-159

.. code-block:: default

    plot_waveform(waveform2, sample_rate2, title="Effects Applied", xlim=(-0.1, 3.2))
    plot_specgram(waveform2, sample_rate2, title="Effects Applied", xlim=(0, 3.04))
    Audio(waveform2, rate=sample_rate2)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_003.png
         :alt: Effects Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_003.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_004.png
         :alt: Effects Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_004.png
         :class: sphx-glr-multi-img

.. raw:: html


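The first effect in the chain above, ``["lowpass", "-1", "300"]``, is a
single-pole lowpass filter. As a rough illustration of what such a filter
does to one channel, here is a sketch using a simple RC discretization
(our own helper; sox's exact coefficients may differ slightly):

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    # y[n] = y[n-1] + alpha * (x[n] - y[n-1]), with alpha derived from
    # the RC time constant of the cutoff frequency.
    dt = 1.0 / sample_rate
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    y, out = 0.0, []
    for x in samples:
        y = y + alpha * (x - y)
        out.append(y)
    return out


# DC passes through almost untouched ...
print(round(one_pole_lowpass([1.0] * 2000, 300, 8000)[-1], 3))
# ... while a Nyquist-rate oscillation is strongly attenuated.
print(abs(one_pole_lowpass([1.0, -1.0] * 1000, 300, 8000)[-1]) < 0.2)
```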
.. GENERATED FROM PYTHON SOURCE LINES 160-162

Doesn’t it sound more dramatic?

.. GENERATED FROM PYTHON SOURCE LINES 164-179

Simulating room reverberation
-----------------------------

`Convolution reverb `__ is a technique used to make clean audio sound as
though it has been produced in a different environment.

Using Room Impulse Response (RIR), for instance, we can make clean speech
sound as though it has been uttered in a conference room.

For this process, we need RIR data. The following data are from the VOiCES
dataset, but you can record your own: just turn on your microphone and
clap your hands.

.. GENERATED FROM PYTHON SOURCE LINES 179-185

.. code-block:: default

    rir_raw, sample_rate = torchaudio.load(SAMPLE_RIR)
    plot_waveform(rir_raw, sample_rate, title="Room Impulse Response (raw)")
    plot_specgram(rir_raw, sample_rate, title="Room Impulse Response (raw)")
    Audio(rir_raw, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_005.png
         :alt: Room Impulse Response (raw)
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_005.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_006.png
         :alt: Room Impulse Response (raw)
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_006.png
         :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 186-189

First, we need to clean up the RIR. We extract the main impulse, normalize
the signal power, then flip along the time axis.

.. GENERATED FROM PYTHON SOURCE LINES 189-196

.. code-block:: default

    rir = rir_raw[:, int(sample_rate * 1.01) : int(sample_rate * 1.3)]
    rir = rir / torch.norm(rir, p=2)
    RIR = torch.flip(rir, [1])

    plot_waveform(rir, sample_rate, title="Room Impulse Response")

.. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_007.png
   :alt: Room Impulse Response
   :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_007.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 197-199

Then, we convolve the speech signal with the RIR filter.

.. GENERATED FROM PYTHON SOURCE LINES 199-205

.. code-block:: default

    speech, _ = torchaudio.load(SAMPLE_SPEECH)

    speech_ = torch.nn.functional.pad(speech, (RIR.shape[1] - 1, 0))
    augmented = torch.nn.functional.conv1d(speech_[None, ...], RIR[None, ...])[0]

.. GENERATED FROM PYTHON SOURCE LINES 206-209

Original:
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 209-214

.. code-block:: default

    plot_waveform(speech, sample_rate, title="Original")
    plot_specgram(speech, sample_rate, title="Original")
    Audio(speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_008.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_008.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_009.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_009.png
         :class: sphx-glr-multi-img

.. raw:: html


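The pad-then-``conv1d`` pattern above works because ``conv1d`` computes
cross-correlation rather than convolution: flipping the RIR beforehand and
left-padding the speech by ``len(rir) - 1`` makes the result a causal
convolution. The same computation in plain Python (the helper name is ours,
not a torchaudio API):

```python
def apply_rir(signal, rir):
    # Left-pad by len(rir) - 1, flip the RIR, then slide and
    # cross-correlate: exactly what conv1d does with a flipped kernel.
    k = len(rir)
    flipped = rir[::-1]
    padded = [0.0] * (k - 1) + list(signal)
    return [
        sum(padded[i + j] * flipped[j] for j in range(k))
        for i in range(len(signal))
    ]


# Same result as direct causal convolution: y[n] = sum_m rir[m] * x[n - m]
print(apply_rir([1.0, 2.0, 3.0], [1.0, 0.5]))  # [1.0, 2.5, 4.0]
```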
.. GENERATED FROM PYTHON SOURCE LINES 215-218

RIR applied:
~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 218-224

.. code-block:: default

    plot_waveform(augmented, sample_rate, title="RIR Applied")
    plot_specgram(augmented, sample_rate, title="RIR Applied")
    Audio(augmented, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_010.png
         :alt: RIR Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_010.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_011.png
         :alt: RIR Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_011.png
         :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 225-237

Adding background noise
-----------------------

To add background noise to audio data, you can simply add a noise Tensor to
the Tensor representing the audio data. A common method to adjust the
intensity of noise is changing the Signal-to-Noise Ratio (SNR).
[`wikipedia `__]

.. math::

   \mathrm{SNR} = \frac{P_{signal}}{P_{noise}}

.. math::

   \mathrm{SNR_{dB}} = 10 \log_{10} \mathrm{SNR}

.. GENERATED FROM PYTHON SOURCE LINES 237-252

.. code-block:: default

    speech, _ = torchaudio.load(SAMPLE_SPEECH)
    noise, _ = torchaudio.load(SAMPLE_NOISE)
    noise = noise[:, : speech.shape[1]]

    speech_power = speech.norm(p=2)
    noise_power = noise.norm(p=2)

    snr_dbs = [20, 10, 3]
    noisy_speeches = []
    for snr_db in snr_dbs:
        snr = 10 ** (snr_db / 20)
        scale = snr * noise_power / speech_power
        noisy_speeches.append((scale * speech + noise) / 2)

.. GENERATED FROM PYTHON SOURCE LINES 253-256

Background noise:
~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 256-261

.. code-block:: default

    plot_waveform(noise, sample_rate, title="Background noise")
    plot_specgram(noise, sample_rate, title="Background noise")
    Audio(noise, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_012.png
         :alt: Background noise
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_012.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_013.png
         :alt: Background noise
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_013.png
         :class: sphx-glr-multi-img

.. raw:: html


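The scaling logic above can be checked with plain Python. Since the 2-norms
act as amplitude (not power) measures, the power formula
:math:`10 \log_{10}(P_{signal}/P_{noise})` becomes 20 times the log of the
amplitude ratio. The helper below is a hypothetical restatement of the loop
above, not a torchaudio API:

```python
import math


def scale_for_snr(signal_power, noise_power, snr_db):
    # Amplitude scale for the signal so that
    # 20 * log10(scale * signal_power / noise_power) == snr_db.
    snr = 10 ** (snr_db / 20)
    return snr * noise_power / signal_power


for snr_db in [20, 10, 3]:
    scale = scale_for_snr(2.0, 1.0, snr_db)        # made-up norms
    achieved = 20 * math.log10(scale * 2.0 / 1.0)  # resulting SNR in dB
    print(snr_db, round(achieved, 6))              # requested == achieved
```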
.. GENERATED FROM PYTHON SOURCE LINES 262-265

SNR 20 dB:
~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 265-271

.. code-block:: default

    snr_db, noisy_speech = snr_dbs[0], noisy_speeches[0]
    plot_waveform(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    plot_specgram(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    Audio(noisy_speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_014.png
         :alt: SNR: 20 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_014.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_015.png
         :alt: SNR: 20 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_015.png
         :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 272-275

SNR 10 dB:
~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 275-281

.. code-block:: default

    snr_db, noisy_speech = snr_dbs[1], noisy_speeches[1]
    plot_waveform(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    plot_specgram(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    Audio(noisy_speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_016.png
         :alt: SNR: 10 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_016.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_017.png
         :alt: SNR: 10 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_017.png
         :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 282-285

SNR 3 dB:
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 285-292

.. code-block:: default

    snr_db, noisy_speech = snr_dbs[2], noisy_speeches[2]
    plot_waveform(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    plot_specgram(noisy_speech, sample_rate, title=f"SNR: {snr_db} [dB]")
    Audio(noisy_speech, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_018.png
         :alt: SNR: 3 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_018.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_019.png
         :alt: SNR: 3 [dB]
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_019.png
         :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 293-301

Applying codec to Tensor object
-------------------------------

:py:func:`torchaudio.functional.apply_codec` can apply codecs to
a Tensor object.

**Note** This process is not differentiable.

.. GENERATED FROM PYTHON SOURCE LINES 301-315

.. code-block:: default

    waveform, sample_rate = torchaudio.load(SAMPLE_SPEECH)

    configs = [
        {"format": "wav", "encoding": "ULAW", "bits_per_sample": 8},
        {"format": "gsm"},
        {"format": "vorbis", "compression": -1},
    ]
    waveforms = []
    for param in configs:
        augmented = F.apply_codec(waveform, sample_rate, **param)
        waveforms.append(augmented)

.. GENERATED FROM PYTHON SOURCE LINES 316-319

Original:
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 319-324

.. code-block:: default

    plot_waveform(waveform, sample_rate, title="Original")
    plot_specgram(waveform, sample_rate, title="Original")
    Audio(waveform, rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_020.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_020.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_021.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_021.png
         :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 325-328

8 bit mu-law:
~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 328-333

.. code-block:: default

    plot_waveform(waveforms[0], sample_rate, title="8 bit mu-law")
    plot_specgram(waveforms[0], sample_rate, title="8 bit mu-law")
    Audio(waveforms[0], rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_022.png
         :alt: 8 bit mu-law
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_022.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_023.png
         :alt: 8 bit mu-law
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_023.png
         :class: sphx-glr-multi-img

.. raw:: html


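The distortion you hear in the 8-bit mu-law version comes from companding
plus coarse quantization. As a sketch of the idea, here is the continuous
mu-law formula with 256 levels (an approximation for illustration, not the
exact table-driven G.711 encoder that the codec implements):

```python
import math

MU = 255  # mu-law parameter used by 8-bit telephony codecs


def mulaw_encode(x):
    """Map x in [-1, 1] to an integer code in [0, 255]."""
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1) / 2 * MU))


def mulaw_decode(code):
    """Map a code in [0, 255] back to a float in [-1, 1]."""
    y = 2 * code / MU - 1
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


# Quiet samples keep fine resolution; loud samples are quantized coarsely.
for x in (0.01, 0.1, 1.0):
    print(x, round(mulaw_decode(mulaw_encode(x)), 4))
```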
.. GENERATED FROM PYTHON SOURCE LINES 334-337

GSM-FR:
~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 337-342

.. code-block:: default

    plot_waveform(waveforms[1], sample_rate, title="GSM-FR")
    plot_specgram(waveforms[1], sample_rate, title="GSM-FR")
    Audio(waveforms[1], rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_024.png
         :alt: GSM-FR
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_024.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_025.png
         :alt: GSM-FR
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_025.png
         :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 343-346

Vorbis:
~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 346-351

.. code-block:: default

    plot_waveform(waveforms[2], sample_rate, title="Vorbis")
    plot_specgram(waveforms[2], sample_rate, title="Vorbis")
    Audio(waveforms[2], rate=sample_rate)

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_026.png
         :alt: Vorbis
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_026.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_027.png
         :alt: Vorbis
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_027.png
         :class: sphx-glr-multi-img

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 352-359

Simulating a phone recording
----------------------------

Combining the previous techniques, we can simulate audio that sounds
like a person talking over a phone in an echoey room with people talking
in the background.

.. GENERATED FROM PYTHON SOURCE LINES 359-410

.. code-block:: default

    original_speech, sample_rate = torchaudio.load(SAMPLE_SPEECH)

    plot_specgram(original_speech, sample_rate, title="Original")

    # Apply RIR
    speech_ = torch.nn.functional.pad(original_speech, (RIR.shape[1] - 1, 0))
    rir_applied = torch.nn.functional.conv1d(speech_[None, ...], RIR[None, ...])[0]

    plot_specgram(rir_applied, sample_rate, title="RIR Applied")

    # Add background noise
    # Because the noise is recorded in the actual environment, we consider
    # that it already contains the acoustic features of the environment.
    # Therefore, we add the noise after applying the RIR.
    noise, _ = torchaudio.load(SAMPLE_NOISE)
    noise = noise[:, : rir_applied.shape[1]]

    snr_db = 8
    # Same SNR scaling as in the background-noise section above
    scale = 10 ** (snr_db / 20) * noise.norm(p=2) / rir_applied.norm(p=2)
    bg_added = (scale * rir_applied + noise) / 2

    plot_specgram(bg_added, sample_rate, title="BG noise added")

    # Apply filtering and change sample rate
    filtered, sample_rate2 = torchaudio.sox_effects.apply_effects_tensor(
        bg_added,
        sample_rate,
        effects=[
            ["lowpass", "4000"],
            [
                "compand",
                "0.02,0.05",
                "-60,-60,-30,-10,-20,-8,-5,-8,-2,-8",
                "-8",
                "-7",
                "0.05",
            ],
            ["rate", "8000"],
        ],
    )

    plot_specgram(filtered, sample_rate2, title="Filtered")

    # Apply telephony codec
    codec_applied = F.apply_codec(filtered, sample_rate2, format="gsm")
    plot_specgram(codec_applied, sample_rate2, title="GSM Codec Applied")

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_028.png
         :alt: Original
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_028.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_029.png
         :alt: RIR Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_029.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_030.png
         :alt: BG noise added
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_030.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_031.png
         :alt: Filtered
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_031.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_032.png
         :alt: GSM Codec Applied
         :srcset: /tutorials/images/sphx_glr_audio_data_augmentation_tutorial_032.png
         :class: sphx-glr-multi-img

.. GENERATED FROM PYTHON SOURCE LINES 411-414

Original speech:
~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 414-417

.. code-block:: default

    Audio(original_speech, rate=sample_rate)

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 418-421

RIR applied:
~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 421-424

.. code-block:: default

    Audio(rir_applied, rate=sample_rate)

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 425-428

Background noise added:
~~~~~~~~~~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 428-431

.. code-block:: default

    Audio(bg_added, rate=sample_rate)

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 432-435

Filtered:
~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 435-438

.. code-block:: default

    Audio(filtered, rate=sample_rate2)

.. raw:: html


.. GENERATED FROM PYTHON SOURCE LINES 439-442

Codec applied:
~~~~~~~~~~~~~~

.. GENERATED FROM PYTHON SOURCE LINES 442-444

.. code-block:: default

    Audio(codec_applied, rate=sample_rate2)

.. raw:: html


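The phone-simulation chain above is just a sequence of waveform-to-waveform
transforms, so it can be organized as a list of stages and applied in order.
A hypothetical wrapper (the names are ours; in the tutorial the stages would
be RIR convolution, noise addition, sox filtering, then the GSM codec):

```python
def run_pipeline(waveform, stages):
    # Apply each augmentation stage in order.
    for stage in stages:
        waveform = stage(waveform)
    return waveform


# Toy stand-in stages (the real ones operate on Tensors):
def halve(w):
    return [s / 2 for s in w]


def clip(w):
    return [max(-1.0, min(1.0, s)) for s in w]


print(run_pipeline([4.0, -3.0, 0.5], [halve, clip]))  # [1.0, -1.0, 0.25]
```

Keeping the stages as a list makes it easy to randomize or drop individual
augmentations per training sample.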
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 13.116 seconds)


.. _sphx_glr_download_tutorials_audio_data_augmentation_tutorial.py:

.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example

  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: audio_data_augmentation_tutorial.py `

  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: audio_data_augmentation_tutorial.ipynb `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_