IEMOCAP
- class torchaudio.datasets.IEMOCAP(root: Union[str, Path], sessions: Tuple[int] = (1, 2, 3, 4, 5), utterance_type: Optional[str] = None)
IEMOCAP [Busso et al., 2008] dataset.
- Parameters:
  - root (str or Path) – Root directory where the dataset's top-level directory is found.
  - sessions (Tuple[int]) – Tuple of sessions (1–5) to use. (Default: (1, 2, 3, 4, 5))
  - utterance_type (str or None, optional) – Which type(s) of utterances to include in the dataset. Options: ("scripted", "improvised", None). If None, both scripted and improvised data are used. (Default: None)
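As a usage sketch, construction might look like the following. The `load_iemocap` helper and the `root` path are hypothetical, not part of torchaudio; the IEMOCAP corpus itself must be obtained separately under its license agreement and extracted under `root`.

```python
import os


def load_iemocap(root, sessions=(1, 2, 3, 4, 5), utterance_type=None):
    """Construct the dataset, failing early with a clear message if the
    corpus is not present (torchaudio expects an ``IEMOCAP`` directory
    directly under ``root``)."""
    if not os.path.isdir(os.path.join(root, "IEMOCAP")):
        raise FileNotFoundError(
            f"IEMOCAP corpus not found under {root!r}; "
            "obtain it separately and extract it there."
        )
    # Imported lazily so the availability check above runs first.
    import torchaudio

    return torchaudio.datasets.IEMOCAP(
        root=root,
        sessions=sessions,            # e.g. (1, 2) for the first two sessions
        utterance_type=utterance_type,  # "scripted", "improvised", or None
    )
```

With a valid `root`, each indexed sample unpacks as `waveform, sample_rate, file_name, label, speaker = dataset[0]`.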
__getitem__
- IEMOCAP.__getitem__(n: int) → Tuple[Tensor, int, str, str, str]
Load the n-th sample from the dataset.
- Parameters:
n (int) – The index of the sample to be loaded
- Returns:
  Tuple of the following items:
  - Tensor: Waveform
  - int: Sample rate
  - str: File name
  - str: Label (one of "neu", "hap", "ang", "sad", "exc", "fru")
  - str: Speaker
get_metadata
- IEMOCAP.get_metadata(n: int) → Tuple[str, int, str, str, str]
Get metadata for the n-th sample from the dataset. Returns the filepath instead of the waveform, but otherwise returns the same fields as __getitem__().
- Parameters:
  - n (int) – The index of the sample to be loaded
- Returns:
  Tuple of the following items:
  - str: Path to audio
  - int: Sample rate
  - str: File name
  - str: Label (one of "neu", "hap", "ang", "sad", "exc", "fru")
  - str: Speaker
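Because get_metadata() returns the file path rather than decoding the waveform, it is well suited to scanning the corpus cheaply. A minimal sketch, assuming a constructed dataset; the `label_counts` helper is hypothetical, not part of torchaudio:

```python
from collections import Counter


def label_counts(dataset):
    """Tally emotion labels across a dataset using metadata only,
    so no audio is decoded."""
    counts = Counter()
    for n in range(len(dataset)):
        # get_metadata returns (path, sample_rate, file_name, label, speaker).
        _path, _sample_rate, _file_name, label, _speaker = dataset.get_metadata(n)
        counts[label] += 1
    return counts
```

The same loop with `dataset[n]` would load every waveform; using get_metadata() keeps a full-corpus pass nearly free of I/O beyond directory listings.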