Torchvision provides many built-in datasets in the torchvision.datasets module, as well as utility classes for building your own datasets.

Built-in datasets

All datasets are subclasses of torch.utils.data.Dataset, i.e., they have __getitem__ and __len__ methods implemented. Hence, they can all be passed to a torch.utils.data.DataLoader, which can load multiple samples in parallel using torch.multiprocessing workers. For example:

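import torch
import torchvision
imagenet_data = torchvision.datasets.ImageNet('path/to/imagenet_root/')
data_loader = torch.utils.data.DataLoader(imagenet_data, batch_size=4, shuffle=True, num_workers=2)  # illustrative settings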

All the datasets have a very similar API. They all have two common arguments, transform and target_transform, which transform the input and the target respectively. You can also create your own datasets using the provided base classes.

Image classification

Caltech101(root, target_type, …)

Caltech 101 Dataset.

Caltech256(root, transform, …)

Caltech 256 Dataset.

CelebA(root, split, target_type, …)

Large-scale CelebFaces Attributes (CelebA) Dataset.

CIFAR10(root, train, transform, …)

CIFAR10 Dataset.

CIFAR100(root, train, transform, …)

CIFAR100 Dataset.

Country211(root, split, transform, …)

The Country211 Data Set from OpenAI.

DTD(root, split, partition, transform, …)

Describable Textures Dataset (DTD).

EMNIST(root, split, **kwargs)

EMNIST Dataset.

EuroSAT(root, transform, target_transform, …)

RGB version of the EuroSAT Dataset.

FakeData(size, image_size, …)

A fake dataset that returns randomly generated images as PIL images.

FashionMNIST(root, train, transform, …)

Fashion-MNIST Dataset.

FER2013(root, split, transform, target_transform)

FER2013 Dataset.

FGVCAircraft(root, split, annotation_level, …)

FGVC Aircraft Dataset.

Flickr8k(root, ann_file, transform, …)

Flickr8k Entities Dataset.

Flickr30k(root, ann_file, transform, …)

Flickr30k Entities Dataset.

Flowers102(root, split, transform, …)

Oxford 102 Flower Dataset.

Food101(root, split, transform, …)

The Food-101 Data Set.

GTSRB(root, split, transform, …)

German Traffic Sign Recognition Benchmark (GTSRB) Dataset.

INaturalist(root, version, target_type, …)

iNaturalist Dataset.

ImageNet(root, split, **kwargs)

ImageNet 2012 Classification Dataset.

KMNIST(root, train, transform, …)

Kuzushiji-MNIST Dataset.

LFWPeople(root, split, image_set, transform, …)

LFW Dataset.

LSUN(root, classes, transform, …)

LSUN dataset.

MNIST(root, train, transform, …)

MNIST Dataset.

Omniglot(root, background, transform, …)

Omniglot Dataset.

OxfordIIITPet(root, split, target_types, …)

Oxford-IIIT Pet Dataset.

Places365(root, split, small, download, …)

Places365 classification dataset.

PCAM(root, split, transform, …)

PCAM Dataset.

QMNIST(root, what, compat, train, **kwargs)

QMNIST Dataset.

RenderedSST2(root, split, transform, …)

The Rendered SST2 Dataset.

SEMEION(root, transform, target_transform, …)

SEMEION Dataset.

SBU(root, transform, target_transform, download)

SBU Captioned Photo Dataset.

StanfordCars(root, split, transform, …)

Stanford Cars Dataset.

STL10(root, split, folds, transform, …)

STL10 Dataset.

SUN397(root, transform, target_transform, …)

The SUN397 Data Set.

SVHN(root, split, transform, …)

SVHN Dataset.

USPS(root, train, transform, …)

USPS Dataset.

Image detection or segmentation

CocoDetection(root, annFile, transform, …)

MS Coco Detection Dataset.

CelebA(root, split, target_type, …)

Large-scale CelebFaces Attributes (CelebA) Dataset.

Cityscapes(root, split, mode, target_type, …)

Cityscapes Dataset.

GTSRB(root, split, transform, …)

German Traffic Sign Recognition Benchmark (GTSRB) Dataset.

Kitti(root, train, transform, …)

KITTI Dataset.

OxfordIIITPet(root, split, target_types, …)

Oxford-IIIT Pet Dataset.

SBDataset(root, image_set, mode, download, …)

Semantic Boundaries Dataset.

VOCSegmentation(root, year, image_set, …)

Pascal VOC Segmentation Dataset.

VOCDetection(root, year, image_set, …)

Pascal VOC Detection Dataset.

WIDERFace(root, split, transform, …)

WIDERFace Dataset.

Optical Flow

FlyingChairs(root[, split, transforms])

FlyingChairs Dataset for optical flow.

FlyingThings3D(root[, split, pass_name, …])

FlyingThings3D dataset for optical flow.

HD1K(root[, split, transforms])

HD1K dataset for optical flow.

KittiFlow(root[, split, transforms])

KITTI dataset for optical flow (2015).

Sintel(root[, split, pass_name, transforms])

Sintel Dataset for optical flow.

Image pairs

LFWPairs(root, split, image_set, transform, …)

LFW Dataset.

PhotoTour(root, name, train, transform, download)

Multi-view Stereo Correspondence Dataset.

Image captioning

CocoCaptions(root, annFile, transform, …)

MS Coco Captions Dataset.

Video classification

HMDB51(root, annotation_path, …)

HMDB51 dataset.

Kinetics(root, frames_per_clip, num_classes, …)

Generic Kinetics dataset.

Kinetics400(root, frames_per_clip, …)

Kinetics-400 dataset.

UCF101(root, annotation_path, …)

UCF101 dataset.

Base classes for custom datasets

DatasetFolder(root, loader, …)

A generic data loader.

ImageFolder(root, transform, …)

A generic data loader where the images are arranged, by default, in one subdirectory per class under root.

VisionDataset(root, transforms, transform, …)

Base class for making datasets that are compatible with torchvision.

