TimeStretch

class torchaudio.transforms.TimeStretch(hop_length: Optional[int] = None, n_freq: int = 201, fixed_rate: Optional[float] = None)

Stretch a complex-valued STFT spectrogram in time by a given rate without modifying its pitch.

Proposed in SpecAugment [Park et al., 2019].

Parameters:

hop_length (int or None, optional) – Length of hop between STFT windows. (Default: n_fft // 2, where n_fft == (n_freq - 1) * 2)
n_freq (int, optional) – Number of filter banks (frequency bins) from the STFT. (Default: 201)
fixed_rate (float or None, optional) – Rate to speed up or slow down by. If None, a rate must be passed to the forward method. (Default: None)
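
As a rough illustration of the hop_length default described above, the arithmetic can be sketched in plain Python. The helper name default_hop_length is hypothetical and is not part of torchaudio; it only mirrors the formula given in the parameter description.

```python
from typing import Optional


def default_hop_length(n_freq: int = 201, hop_length: Optional[int] = None) -> int:
    """Hypothetical helper mirroring the documented default; not a torchaudio API."""
    if hop_length is not None:
        # An explicit hop length always takes precedence over the derived default.
        return hop_length
    n_fft = (n_freq - 1) * 2  # FFT size implied by the number of frequency bins
    return n_fft // 2


print(default_hop_length())          # n_freq=201 -> n_fft=400 -> hop 200
print(default_hop_length(513))       # n_freq=513 -> n_fft=1024 -> hop 512
print(default_hop_length(201, 160))  # explicit hop length is returned unchanged
```

If the spectrogram was computed with a hop length other than n_fft // 2, passing that same value as hop_length keeps the two transforms consistent.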

Note

The expected input is a raw, complex-valued spectrogram.

Example

>>> spectrogram = torchaudio.transforms.Spectrogram(power=None)
>>> stretch = torchaudio.transforms.TimeStretch()
>>>
>>> original = spectrogram(waveform)
>>> stretched_1_2 = stretch(original, 1.2)
>>> stretched_0_9 = stretch(original, 0.9)

Figure: Visualization of the stretched spectrograms.

Tutorials using TimeStretch:

Audio Feature Augmentation

forward(complex_specgrams: Tensor, overriding_rate: Optional[float] = None) → Tensor

Parameters:

complex_specgrams (Tensor) – A tensor of dimension (…, freq, num_frame) with complex dtype.
overriding_rate (float or None, optional) – Rate to apply to this batch; values above 1 speed up, values below 1 slow down. If no rate is passed, self.fixed_rate is used. (Default: None)

Returns:

Stretched spectrogram. The resulting tensor has the same complex dtype as the input spectrogram, and its number of frames becomes ceil(num_frame / rate).

Return type:

Tensor
