Wav2Letter

class torchaudio.models.Wav2Letter(num_classes: int = 40, input_type: str = 'waveform', num_features: int = 1)[source]

Wav2Letter model architecture from Wav2Letter: an End-to-End ConvNet-based Speech Recognition System [Collobert et al., 2016].

Parameters:
  • num_classes (int, optional) – Number of classes to be classified. (Default: 40)

  • input_type (str, optional) – Type of input the model receives: waveform, power_spectrum or mfcc. (Default: waveform)

  • num_features (int, optional) – Number of input features that the network will receive (Default: 1).

Methods

forward

Wav2Letter.forward(x: Tensor) → Tensor[source]
Parameters:

x (torch.Tensor) – Tensor of dimension (batch_size, num_features, input_length).

Returns:

Predictor tensor of dimension (batch_size, num_classes, input_length).

Return type:

Tensor
