Shortcuts

LFCC

class torchaudio.transforms.LFCC(sample_rate: int = 16000, n_filter: int = 128, f_min: float = 0.0, f_max: Optional[float] = None, n_lfcc: int = 40, dct_type: int = 2, norm: str = 'ortho', log_lf: bool = False, speckwargs: Optional[dict] = None)[source]

Create the linear-frequency cepstrum coefficients from an audio signal.

This feature supports the following devices: CPU, CUDA This API supports the following properties: Autograd, TorchScript

By default, this calculates the LFCC on the DB-scaled linear filtered spectrogram. This is not the textbook implementation, but is implemented here to give consistency with librosa.

This output depends on the maximum value in the input spectrogram, and so may return different values for an audio clip split into snippets vs. a a full clip.

Parameters:
  • sample_rate (int, optional) – Sample rate of audio signal. (Default: 16000)

  • n_filter (int, optional) – Number of linear filters to apply. (Default: 128)

  • n_lfcc (int, optional) – Number of lfc coefficients to retain. (Default: 40)

  • f_min (float, optional) – Minimum frequency. (Default: 0.)

  • f_max (float or None, optional) – Maximum frequency. (Default: None)

  • dct_type (int, optional) – type of DCT (discrete cosine transform) to use. (Default: 2)

  • norm (str, optional) – norm to use. (Default: "ortho")

  • log_lf (bool, optional) – whether to use log-lf spectrograms instead of db-scaled. (Default: False)

  • speckwargs (dict or None, optional) – arguments for Spectrogram. (Default: None)

Example
>>> waveform, sample_rate = torchaudio.load("test.wav", normalize=True)
>>> transform = transforms.LFCC(
>>>     sample_rate=sample_rate,
>>>     n_lfcc=13,
>>>     speckwargs={"n_fft": 400, "hop_length": 160, "center": False},
>>> )
>>> lfcc = transform(waveform)

See also

torchaudio.functional.linear_fbanks() - The function used to generate the filter banks.

Tutorials using LFCC:
Audio Feature Extractions

Audio Feature Extractions

Audio Feature Extractions
forward(waveform: Tensor) Tensor[source]
Parameters:

waveform (Tensor) – Tensor of audio of dimension (…, time).

Returns:

Linear Frequency Cepstral Coefficients of size (…, n_lfcc, time).

Return type:

Tensor

Docs

Access comprehensive developer documentation for PyTorch

View Docs

Tutorials

Get in-depth tutorials for beginners and advanced developers

View Tutorials

Resources

Find development resources and get your questions answered

View Resources