Datasets
Torchvision provides many built-in datasets in the torchvision.datasets
module, as well as utility classes for building your own datasets.
Built-in datasets
All datasets are subclasses of torch.utils.data.Dataset, i.e., they implement
the __getitem__ and __len__ methods. Hence, they can all be passed to a
torch.utils.data.DataLoader, which can load multiple samples in parallel using
torch.multiprocessing workers.
For example:
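import torch
import torchvision

# Build the dataset from a local ImageNet root directory.
imagenet_data = torchvision.datasets.ImageNet('path/to/imagenet_root/')
# Batch and shuffle the samples, loading them in parallel worker processes.
data_loader = torch.utils.data.DataLoader(imagenet_data,
                                          batch_size=4,
                                          shuffle=True,
                                          num_workers=args.nThreads)  # args is assumed to hold parsed CLI options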
All the datasets have a very similar API. They all share two common arguments,
transform and target_transform, which transform the input and the target
respectively.
You can also create your own datasets using the provided base classes.
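As an illustration of the contract described above, here is a minimal sketch of a map-style dataset that subclasses torch.utils.data.Dataset directly (not one of the torchvision base classes). The class name ToyDataset and its synthetic samples are hypothetical and exist only for this example. It shows where transform and target_transform are applied, and that any object implementing __getitem__ and __len__ can be handed to a DataLoader.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical dataset: sample i is (tensor [i, i], label i % 2)."""

    def __init__(self, size, transform=None, target_transform=None):
        self.size = size
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if not 0 <= index < self.size:
            raise IndexError(index)
        sample = torch.tensor([float(index), float(index)])
        target = index % 2
        # transform acts on the input, target_transform on the target.
        if self.transform is not None:
            sample = self.transform(sample)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return sample, target


dataset = ToyDataset(
    size=8,
    transform=lambda x: x * 2,         # double every input value
    target_transform=lambda y: 1 - y,  # flip the binary label
)
# num_workers=0 loads in the main process; raise it to load in parallel.
loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)
first_inputs, first_targets = next(iter(loader))
```

With shuffle=False the first batch holds samples 0 to 3, so first_inputs has shape (4, 2) and first_targets contains the flipped labels.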
Warning
When a dataset object is created with download=True, the files are first
downloaded and extracted in the root directory. This download logic is not
multi-process safe, so it may lead to conflicts / race conditions if it is
run within a distributed setting. In distributed mode, we recommend creating
a dummy dataset object to trigger the download logic before setting up
distributed mode.
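The recommendation above can be sketched as a small ordering helper. This is only an illustration under stated assumptions: download_then_launch, dataset_factory and launch are hypothetical names, not part of torchvision, and the stand-in functions at the bottom replace a real dataset and real worker processes so that the sketch runs without network access.

```python
from functools import partial


def download_then_launch(dataset_factory, launch):
    """Trigger the download once, then start the multi-process part.

    dataset_factory: callable accepting download=<bool>, returning a dataset.
    launch: callable that starts the workers; it receives a factory that
        builds the dataset without downloading.
    """
    # Single process: this throwaway object fetches and extracts the files.
    dataset_factory(download=True)
    # Workers reuse the files on disk and never download concurrently.
    return launch(partial(dataset_factory, download=False))


# Stand-ins for illustration only (hypothetical, no real download happens).
calls = []


def fake_factory(download):
    calls.append(download)
    return ["sample"]


def fake_launch(build_dataset):
    # Pretend that two worker processes each build the dataset.
    return [build_dataset() for _ in range(2)]


results = download_then_launch(fake_factory, fake_launch)
```

In real use, dataset_factory could be, for instance, functools.partial(torchvision.datasets.MNIST, root="data"), and launch would be whatever starts the distributed processes.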
Image classification
Caltech 101 Dataset.
Caltech 256 Dataset.
CIFAR10 Dataset.
CIFAR100 Dataset.
The Country211 Data Set from OpenAI.
EMNIST Dataset.
RGB version of the EuroSAT Dataset.
A fake dataset that generates random images and returns them as PIL images.
Fashion-MNIST Dataset.
FER2013 Dataset.
FGVC Aircraft Dataset.
Flickr8k Entities Dataset.
Flickr30k Entities Dataset.
Oxford 102 Flower Dataset.
iNaturalist Dataset.
ImageNet 2012 Classification Dataset.
Imagenette image classification dataset.
Kuzushiji-MNIST Dataset.
LFW Dataset.
LSUN dataset.
MNIST Dataset.
Omniglot Dataset.
Places365 classification dataset.
QMNIST Dataset.
SEMEION Dataset.
SBU Captioned Photo Dataset.
Stanford Cars Dataset.
STL10 Dataset.
SVHN Dataset.
USPS Dataset.
Image detection or segmentation
MS Coco Detection Dataset.
Cityscapes Dataset.
KITTI Dataset.
Pascal VOC Segmentation Dataset.
Pascal VOC Detection Dataset.
WIDERFace Dataset.
Optical Flow
FlyingChairs Dataset for optical flow.
FlyingThings3D dataset for optical flow.
HD1K dataset for optical flow.
KITTI dataset for optical flow (2015).
Sintel Dataset for optical flow.
Stereo Matching
Carla simulator data linked in the CREStereo github repo.
KITTI dataset from the 2012 stereo evaluation benchmark.
KITTI dataset from the 2015 stereo evaluation benchmark.
Synthetic dataset used in training the CREStereo architecture.
FallingThings dataset.
Dataset interface for Scene Flow datasets.
Sintel Stereo Dataset.
InStereo2k dataset.
ETH3D Low-Res Two-View dataset.
Publicly available scenes from the Middlebury dataset 2014 version (https://vision.middlebury.edu/stereo/data/scenes2014/).
Image pairs
LFW Dataset.
Multi-view Stereo Correspondence Dataset.
Video classification
HMDB51 dataset.
Generic Kinetics dataset.
UCF101 dataset.
Video prediction
MovingMNIST Dataset.
Base classes for custom datasets
A generic data loader.
A generic data loader where the images are arranged in a specific directory layout by default.
Base class for making datasets that are compatible with torchvision.
Transforms v2
|
Wrap a |