Shortcuts

SPEECHCOMMANDS

class torchaudio.datasets.SPEECHCOMMANDS(root: Union[str, Path], url: str = 'speech_commands_v0.02', folder_in_archive: str = 'SpeechCommands', download: bool = False, subset: Optional[str] = None)[source]

Speech Commands [Warden, 2018] dataset.

Parameters:
  • root (str or Path) – Path to the directory where the dataset is found or downloaded.

  • url (str, optional) – The URL to download the dataset from, or the type of the dataset to dowload. Allowed type values are "speech_commands_v0.01" and "speech_commands_v0.02" (default: "speech_commands_v0.02")

  • folder_in_archive (str, optional) – The top-level directory of the dataset. (default: "SpeechCommands")

  • download (bool, optional) – Whether to download the dataset if it is not found at root path. (default: False).

  • subset (str or None, optional) – Select a subset of the dataset [None, “training”, “validation”, “testing”]. None means the whole dataset. “validation” and “testing” are defined in “validation_list.txt” and “testing_list.txt”, respectively, and “training” is the rest. Details for the files “validation_list.txt” and “testing_list.txt” are explained in the README of the dataset and in the introduction of Section 7 of the original paper and its reference 12. The original paper can be found here. (Default: None)

__getitem__

SPEECHCOMMANDS.__getitem__(n: int) Tuple[Tensor, int, str, str, int][source]

Load the n-th sample from the dataset.

Parameters:

n (int) – The index of the sample to be loaded

Returns:

Tuple of the following items;

Tensor:

Waveform

int:

Sample rate

str:

Label

str:

Speaker ID

int:

Utterance number

get_metadata

SPEECHCOMMANDS.get_metadata(n: int) Tuple[str, int, str, str, int][source]

Get metadata for the n-th sample from the dataset. Returns filepath instead of waveform, but otherwise returns the same fields as __getitem__().

Parameters:

n (int) – The index of the sample to be loaded

Returns:

Tuple of the following items;

str:

Path to the audio

int:

Sample rate

str:

Label

str:

Speaker ID

int:

Utterance number

Docs

Access comprehensive developer documentation for PyTorch

View Docs

Tutorials

Get in-depth tutorials for beginners and advanced developers

View Tutorials

Resources

Find development resources and get your questions answered

View Resources