FluentSpeechCommands
- class torchaudio.datasets.FluentSpeechCommands(root: Union[str, Path], subset: str = 'train')
Fluent Speech Commands [Lugosch et al., 2019] dataset
- Parameters:
root (str or Path) – Path to the directory where the dataset is found.
subset (str, optional) – Subset of the dataset to use. Options: ["train", "valid", "test"]. (Default: "train")
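As a minimal sketch of constructing the dataset from the parameters above: `open_subset` and `SUBSETS` are hypothetical names introduced here for illustration, not part of torchaudio. The helper checks the arguments first and imports torchaudio lazily, so the validation runs even where torchaudio is not installed. It assumes the dataset has already been downloaded and extracted under `root`.

```python
from pathlib import Path

# Subset names listed in the parameter description above.
SUBSETS = ("train", "valid", "test")


def open_subset(root, subset="train"):
    """Hypothetical helper: validate the arguments, then build the dataset."""
    if subset not in SUBSETS:
        raise ValueError(f"subset must be one of {SUBSETS}, got {subset!r}")
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"dataset root not found: {root}")
    # Imported lazily so the checks above work without torchaudio installed.
    from torchaudio.datasets import FluentSpeechCommands

    return FluentSpeechCommands(root, subset=subset)
```

A call such as `open_subset("/path/to/data", "valid")` would then return the validation split, where the path is a placeholder for your own data directory.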
__getitem__
- FluentSpeechCommands.__getitem__(n: int) → Tuple[Tensor, int, str, int, str, str, str, str]
Load the n-th sample from the dataset.
- Parameters:
n (int) – The index of the sample to be loaded
- Returns:
Tuple of the following items:
- Tensor:
Waveform
- int:
Sample rate
- str:
File name
- int:
Speaker ID
- str:
Transcription
- str:
Action
- str:
Object
- str:
Location
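Because the sample is a plain 8-tuple, positional indexing is easy to get wrong. One way to make the fields self-describing is to wrap the tuple in a named tuple. This is only a sketch: `FluentSample` and its field names are hypothetical, chosen here to mirror the order of the items listed above, and are not provided by torchaudio.

```python
from typing import Any, NamedTuple, Tuple


class FluentSample(NamedTuple):
    """Hypothetical wrapper; field order follows the documented return tuple."""

    waveform: Any        # Tensor in practice; typed loosely to avoid importing torch
    sample_rate: int
    file_name: str
    speaker_id: int
    transcription: str
    action: str
    obj: str             # "Object" in the docs; renamed to avoid shadowing the builtin
    location: str

    @property
    def intent(self) -> Tuple[str, str, str]:
        # The three slot labels grouped together for convenience.
        return (self.action, self.obj, self.location)
```

With a dataset instance in hand, `sample = FluentSample(*dataset[0])` would unpack the first item, after which `sample.transcription` and `sample.intent` can be read by name.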
get_metadata
- FluentSpeechCommands.get_metadata(n: int) → Tuple[str, int, str, int, str, str, str, str]
Get metadata for the n-th sample from the dataset. Returns the file path instead of the waveform, but otherwise returns the same fields as __getitem__().
- Parameters:
n (int) – The index of the sample to be loaded
- Returns:
Tuple of the following items:
- str:
Path to audio
- int:
Sample rate
- str:
File name
- int:
Speaker ID
- str:
Transcription
- str:
Action
- str:
Object
- str:
Location
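Since `get_metadata` returns a path rather than a waveform, it suits label statistics that do not need the audio itself. The sketch below tallies (action, object, location) combinations. `count_intents` is a hypothetical helper, and it assumes the dataset object supports `len()` in addition to the documented `get_metadata(n)`.

```python
from collections import Counter


def count_intents(dataset):
    """Hypothetical helper: tally (action, object, location) triples."""
    counts = Counter()
    for i in range(len(dataset)):  # assumes the dataset implements __len__
        # The last three fields of the metadata tuple are action, object, location.
        *_, action, obj, location = dataset.get_metadata(i)
        counts[(action, obj, location)] += 1
    return counts
```

Calling `count_intents(dataset).most_common(5)` would then list the five most frequent label combinations in the chosen subset.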