FiveCrop
- class torchvision.transforms.v2.FiveCrop(size: Union[int, Sequence[int]])[source]
[BETA] Crop the image or video into four corners and the central crop.
Warning
The FiveCrop transform is in Beta stage, and while we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes.
If the input is a torch.Tensor or an Image or a Video, it can have an arbitrary number of leading batch dimensions. For example, the image can have [..., C, H, W] shape.
Note
This transform returns a tuple of images and there may be a mismatch in the number of inputs and targets your Dataset returns. See below for an example of how to deal with this.
- Parameters:
  size (sequence or int) – Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop of size (size, size) is made. If provided a sequence of length 1, it will be interpreted as (size[0], size[0]).
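Because FiveCrop produces all five crops at once, calling it directly yields a 5-tuple of crops rather than a single image. A minimal sketch of this (assuming the beta datapoints namespace and the v2 transforms namespace, imported as shown):

>>> import torch
>>> from torchvision import datapoints
>>> from torchvision.transforms import v2 as transforms
>>> image = datapoints.Image(torch.rand(3, 256, 256))
>>> crops = transforms.FiveCrop(224)(image)
>>> len(crops)  # (top-left, top-right, bottom-left, bottom-right, center)
5
>>> crops[0].shape
torch.Size([3, 224, 224])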
Example
>>> class BatchMultiCrop(transforms.Transform):
...     def forward(self, sample: Tuple[Tuple[Union[datapoints.Image, datapoints.Video], ...], int]):
...         images_or_videos, label = sample
...         batch_size = len(images_or_videos)
...         image_or_video = images_or_videos[0]
...         # torch.stack returns a plain tensor, so re-wrap it as the original datapoint type
...         images_or_videos = image_or_video.wrap_like(image_or_video, torch.stack(images_or_videos))
...         labels = torch.full((batch_size,), label, device=images_or_videos.device)
...         return images_or_videos, labels
...
>>> image = datapoints.Image(torch.rand(3, 256, 256))
>>> label = 3
>>> transform = transforms.Compose([transforms.FiveCrop(224), BatchMultiCrop()])
>>> images, labels = transform(image, label)
>>> images.shape
torch.Size([5, 3, 224, 224])
>>> labels
tensor([3, 3, 3, 3, 3])
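If labels do not need to travel through the pipeline, the tuple can also be stacked by hand outside of Compose. Note that torch.stack on datapoints returns a plain tensor, which is why the transform above re-wraps its result with wrap_like; a sketch, reusing the image from the example:

>>> crops = transforms.FiveCrop(224)(image)
>>> batch = torch.stack(crops)  # plain torch.Tensor, not a datapoints.Image
>>> batch.shape
torch.Size([5, 3, 224, 224])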