Transforms are common image transforms. They can be chained together using Compose.

class torchvision.transforms.Compose(transforms)[source]

Composes several transforms together.

Parameters: transforms (list of Transform objects) – list of transforms to compose.


>>> transforms.Compose([
...     transforms.CenterCrop(10),
...     transforms.ToTensor(),
... ])

Transforms on PIL.Image

class torchvision.transforms.Scale(size, interpolation=2)[source]

Rescale the input PIL.Image to the given size.

Parameters:
  • size (sequence or int) – Desired output size. If size is a sequence like (w, h), the output size will be matched to this. If size is an int, the smaller edge of the image will be matched to this number, keeping the aspect ratio. E.g., if height > width, the image will be rescaled to (size, size * height / width).
  • interpolation (int, optional) – Desired interpolation. Default is PIL.Image.BILINEAR
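The smaller-edge rule above can be made concrete with a small helper (a hypothetical sketch for illustration only; `scaled_size` is an invented name, not part of torchvision):

```python
# Hypothetical helper illustrating Scale's smaller-edge rule when `size` is an int.
# Not part of torchvision; shown only to make the arithmetic concrete.
def scaled_size(w, h, size):
    """Return the output (width, height) after matching the smaller edge to `size`."""
    if w <= h:
        # width is the smaller edge: it becomes `size`, height scales proportionally
        return size, int(size * h / w)
    # height is the smaller edge: it becomes `size`, width scales proportionally
    return int(size * w / h), size
```

For example, a 400x200 image with size=100 becomes 200x100: the height (smaller edge) is matched to 100 and the width scales to preserve the aspect ratio.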
class torchvision.transforms.CenterCrop(size)[source]

Crops the given PIL.Image at the center.

Parameters: size (sequence or int) – Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made.
class torchvision.transforms.RandomCrop(size, padding=0)[source]

Crop the given PIL.Image at a random location.

Parameters:
  • size (sequence or int) – Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made.
  • padding (int or sequence, optional) – Optional padding on each border of the image. Default is 0, i.e., no padding. If a sequence of length 4 is provided, it is used to pad the left, top, right and bottom borders respectively.
class torchvision.transforms.RandomHorizontalFlip[source]

Horizontally flip the given PIL.Image randomly with a probability of 0.5.

class torchvision.transforms.RandomSizedCrop(size, interpolation=2)[source]

Crop the given PIL.Image to random size and aspect ratio.

A crop of random size (0.08 to 1.0 of the original area) and a random aspect ratio (3/4 to 4/3 of the original aspect ratio) is made. This crop is finally resized to the given size. This is popularly used to train Inception networks.

Parameters:
  • size – size of the smaller edge
  • interpolation – Default: PIL.Image.BILINEAR
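One sampling attempt of the crop described above can be sketched as follows (a hypothetical illustration, not torchvision's actual implementation; `sample_crop` is an invented name, and the real transform retries when the sampled crop does not fit, falling back to a center crop before resizing):

```python
import math
import random

def sample_crop(width, height, rng=random):
    """One attempt at sampling a crop: pick a target area between 8% and 100%
    of the image area and an aspect ratio between 3/4 and 4/3, then derive
    the crop's width and height from area = w * h and aspect = w / h."""
    area = width * height
    target_area = rng.uniform(0.08, 1.0) * area
    aspect_ratio = rng.uniform(3.0 / 4.0, 4.0 / 3.0)
    w = int(round(math.sqrt(target_area * aspect_ratio)))
    h = int(round(math.sqrt(target_area / aspect_ratio)))
    return w, h
```

Note that a sampled crop can be larger than the image along one edge (which is why the real transform retries); the accepted crop is then resized to the requested size.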
class torchvision.transforms.Pad(padding, fill=0)[source]

Pad the given PIL.Image on all sides with the given “pad” value.

Parameters:
  • padding (int or sequence) – Padding on each border. If a sequence of length 4, it is used to pad the left, top, right and bottom borders respectively.
  • fill – Pixel fill value. Default is 0.

Transforms on torch.*Tensor

class torchvision.transforms.Normalize(mean, std)[source]

Normalize a tensor image with mean and standard deviation.

Given mean: (R, G, B) and std: (R, G, B), will normalize each channel of the torch.*Tensor, i.e. channel = (channel - mean) / std

Parameters:
  • mean (sequence) – Sequence of means for R, G, B channels respectively.
  • std (sequence) – Sequence of standard deviations for R, G, B channels respectively.

Calling the transform on a tensor:
Parameters: tensor (Tensor) – Tensor image of size (C, H, W) to be normalized.
Returns: Normalized image.
Return type: Tensor

Conversion Transforms

class torchvision.transforms.ToTensor[source]

Convert a PIL.Image or numpy.ndarray to tensor.

Converts a PIL.Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].

Parameters: pic (PIL.Image or numpy.ndarray) – Image to be converted to tensor.
Returns: Converted image.
Return type: Tensor
class torchvision.transforms.ToPILImage[source]

Convert a tensor or numpy.ndarray to a PIL.Image.

Converts a torch.*Tensor of shape C x H x W or a numpy ndarray of shape H x W x C to a PIL.Image while preserving the value range.

Parameters: pic (Tensor or numpy.ndarray) – Image to be converted to PIL.Image.
Returns: Image converted to PIL.Image.
Return type: PIL.Image

Generic Transforms

class torchvision.transforms.Lambda(lambd)[source]

Apply a user-defined lambda as a transform.

Parameters: lambd (function) – Lambda/function to be used for transform.