Kinetics400
class torchvision.datasets.Kinetics400(root: str, frames_per_clip: int, num_classes: Optional[Any] = None, split: Optional[Any] = None, download: Optional[Any] = None, num_download_workers: Optional[Any] = None, **kwargs: Any)

Kinetics-400 dataset.
Warning

This class was deprecated in 0.12 and will be removed in 0.14. Please use Kinetics(..., num_classes='400') instead.

Kinetics-400 is an action recognition video dataset. This dataset considers every video as a collection of video clips of fixed size, specified by frames_per_clip, where the step in frames between each clip is given by step_between_clips.

To give an example, for 2 videos with 10 and 15 frames respectively, if frames_per_clip=5 and step_between_clips=5, the dataset size will be (2 + 3) = 5, where the first two elements come from video 1 and the next three from video 2. Note that we drop clips which do not have exactly frames_per_clip frames, so not all frames in a video might be present.

Internally, it uses a VideoClips object to handle clip creation.
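As a minimal sketch of the bookkeeping described above (plain arithmetic, not torchvision's internal code), the number of clips contributed by each video follows from frames_per_clip and step_between_clips:

    # Sketch of the documented clip-count behaviour; not the actual
    # VideoClips implementation.
    frames_per_clip = 5
    step_between_clips = 5

    def num_clips(n_frames: int) -> int:
        # Clips with fewer than frames_per_clip frames are dropped,
        # hence the floor division.
        return max(0, (n_frames - frames_per_clip) // step_between_clips + 1)

    # Two videos with 10 and 15 frames yield 2 + 3 = 5 clips in total,
    # matching the example above.
    print(num_clips(10) + num_clips(15))  # 5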
- Parameters
root (string) –
Root directory of the Kinetics-400 Dataset. Should be structured as follows:
root/
├── class1
│   ├── clip1.avi
│   ├── clip2.avi
│   ├── clip3.mp4
│   └── ...
└── class2
    ├── clipx.avi
    └── ...
frames_per_clip (int) – number of frames in a clip
step_between_clips (int) – number of frames between each clip
transform (callable, optional) – A function/transform that takes in a TxHxWxC video and returns a transformed version.
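A construction sketch under stated assumptions: the root path below is hypothetical, and frames_per_clip and step_between_clips are the parameters documented above (the latter is accepted via **kwargs):

    from torchvision.datasets import Kinetics400

    # Deprecated since 0.12; Kinetics(..., num_classes='400') is the
    # recommended replacement (see the warning above).
    dataset = Kinetics400(
        root="path/to/kinetics400/train",  # hypothetical path, laid out as shown under root
        frames_per_clip=16,
        step_between_clips=16,
    )
    print(len(dataset))  # total number of clips across all videos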
- Returns
A 3-tuple with the following entries:
video (Tensor[T, H, W, C]): the T video frames
audio (Tensor[K, L]): the audio frames, where K is the number of channels and L is the number of points
label (int): class of the video clip
- Return type
tuple
- Special-members
__getitem__(idx: int) → Tuple[torch.Tensor, torch.Tensor, int]

- Parameters
idx (int) – Index
- Returns
Sample and metadata, optionally transformed by the respective transforms.
- Return type
(Any)
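A usage sketch for __getitem__, assuming dataset was constructed as in the earlier example; indexing returns the 3-tuple documented under Returns:

    # Each sample is (video, audio, label) as documented above.
    video, audio, label = dataset[0]
    print(video.shape)  # Tensor[T, H, W, C]: the T video frames
    print(audio.shape)  # Tensor[K, L]: K channels, L points
    print(label)        # int: class index of the clip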