Snips
- class torchaudio.datasets.Snips(root: Union[str, Path], subset: str, speakers: Optional[List[str]] = None, audio_format: str = 'mp3')
Snips [Coucke et al., 2018] dataset.
- Parameters:
root (str or Path) – Root directory where the dataset’s top level directory is found.
subset (str) – Subset of the dataset to use. Options: ["train", "valid", "test"].
speakers (List[str] or None, optional) – The speaker list to include in the dataset. If None, include all speakers in the subset. (Default: None)
audio_format (str, optional) – The file extension of the audio files. Options: ["mp3", "wav"]. (Default: "mp3")
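The constructor arguments above can be checked before the dataset is built. The following is a minimal sketch, not part of torchaudio: `snips_kwargs` and `load_snips` are hypothetical helper names, and the option tuples are copied from the parameter descriptions. torchaudio is imported lazily, so the validation helper runs without it.

```python
from pathlib import Path
from typing import Any, Dict, List, Optional, Union

# Option lists copied from the parameter descriptions above.
SUBSETS = ("train", "valid", "test")
AUDIO_FORMATS = ("mp3", "wav")


def snips_kwargs(
    root: Union[str, Path],
    subset: str,
    speakers: Optional[List[str]] = None,
    audio_format: str = "mp3",
) -> Dict[str, Any]:
    """Hypothetical helper: check the arguments and return them as keyword arguments."""
    if subset not in SUBSETS:
        raise ValueError(f"subset must be one of {SUBSETS}, got {subset!r}")
    if audio_format not in AUDIO_FORMATS:
        raise ValueError(f"audio_format must be one of {AUDIO_FORMATS}, got {audio_format!r}")
    return {
        "root": Path(root),
        "subset": subset,
        "speakers": speakers,  # None keeps every speaker in the subset
        "audio_format": audio_format,
    }


def load_snips(
    root: Union[str, Path],
    subset: str,
    speakers: Optional[List[str]] = None,
    audio_format: str = "mp3",
) -> Any:
    """Hypothetical helper: build the dataset from validated arguments."""
    # Imported here so that snips_kwargs stays usable without torchaudio installed.
    from torchaudio.datasets import Snips

    return Snips(**snips_kwargs(root, subset, speakers, audio_format))
```

A call such as `load_snips("path/to/data", "train", audio_format="wav")` would then build the training subset, where `"path/to/data"` is a placeholder for the directory that holds the dataset's top level directory.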
__getitem__
- Snips.__getitem__(n: int) -> Tuple[Tensor, int, str, str, str]
Load the n-th sample from the dataset.
- Parameters:
n (int) – The index of the sample to be loaded.
- Returns:
Tuple of the following items:
- Tensor: Waveform
- int: Sample rate
- str: File name
- str: Transcription of audio
- str: Inside–outside–beginning (IOB) label of transcription
- str: Intention label of the audio
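To make the order of the returned items easier to follow, a sample can be unpacked into named fields. This is a sketch under stated assumptions: `SnipsSample` and `describe` are hypothetical names, not torchaudio API, and the stand-in sample uses made-up values (a plain list in place of a Tensor) so the snippet runs without the dataset on disk.

```python
from typing import Any, NamedTuple, Sequence


class SnipsSample(NamedTuple):
    """Hypothetical wrapper naming the six items listed under Returns, in order."""

    waveform: Any  # a Tensor when the sample comes from the real dataset
    sample_rate: int
    file_name: str
    transcript: str
    iob: str
    intent: str


def describe(sample: Sequence[Any]) -> str:
    """Format one sample as a single summary line."""
    s = SnipsSample(*sample)
    return f"{s.file_name} [{s.intent}] @ {s.sample_rate} Hz: {s.transcript}"


# Stand-in sample with made-up values; a real one would come from dataset[n].
fake_sample = ([0.0, 0.0], 16000, "example-0001", "turn on the lights", "O O O O", "ExampleIntent")
print(describe(fake_sample))
```

With a real dataset object, `describe(dataset[0])` would work the same way, since indexing calls `__getitem__`.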
get_metadata
- Snips.get_metadata(n: int) -> Tuple[str, int, str, str, str]
Get metadata for the n-th sample from the dataset. Returns filepath instead of waveform, but otherwise returns the same fields as __getitem__().
- Parameters:
n (int) – The index of the sample to be loaded.
- Returns:
Tuple of the following items:
- str: Path to audio
- int: Sample rate
- str: File name
- str: Transcription of audio
- str: Inside–outside–beginning (IOB) label of transcription
- str: Intention label of the audio
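Because `get_metadata` returns the file path rather than the waveform, it suits passes over the labels that do not need the audio itself. The sketch below counts intent labels that way. `count_intents` and `FakeSnips` are hypothetical names, the stand-in rows are made up, and the loop assumes the dataset object supports `len()`, which this page does not document.

```python
from collections import Counter
from typing import Any, List, Tuple


def count_intents(dataset: Any) -> Counter:
    """Count intent labels using get_metadata, so no waveform is loaded."""
    counts: Counter = Counter()
    for i in range(len(dataset)):  # assumes the dataset defines __len__
        # Fields, in order: path, sample rate, file name, transcription, IOB label, intent.
        _path, _rate, _name, _transcript, _iob, intent = dataset.get_metadata(i)
        counts[intent] += 1
    return counts


class FakeSnips:
    """Stand-in exposing only the two methods the helper needs (made-up data)."""

    def __init__(self, rows: List[Tuple[str, int, str, str, str, str]]) -> None:
        self._rows = rows

    def __len__(self) -> int:
        return len(self._rows)

    def get_metadata(self, n: int) -> Tuple[str, int, str, str, str, str]:
        return self._rows[n]


fake = FakeSnips(
    [
        ("a.mp3", 16000, "a", "turn on the lights", "O O O O", "IntentA"),
        ("b.mp3", 16000, "b", "turn off the lights", "O O O O", "IntentB"),
        ("c.mp3", 16000, "c", "lights on", "O O", "IntentA"),
    ]
)
print(count_intents(fake))
```

Passing a real `Snips` instance instead of `FakeSnips` should behave the same, provided the assumption about `len()` holds.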