RandomResizedCrop

class torchvision.transforms.v2.RandomResizedCrop(size: Union[int, Sequence[int]], scale: Tuple[float, float] = (0.08, 1.0), ratio: Tuple[float, float] = (0.75, 1.3333333333333333), interpolation: Union[InterpolationMode, int] = InterpolationMode.BILINEAR, antialias: Optional[Union[str, bool]] = 'warn')

[BETA] Crop a random portion of the input and resize it to a given size.

Warning

The RandomResizedCrop transform is in Beta stage, and while we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes.

If the input is a torch.Tensor or a Datapoint (e.g. Image, Video, BoundingBox, etc.), it can have an arbitrary number of leading batch dimensions. For example, an image can have shape [..., C, H, W], and a bounding box can have shape [..., 4].

A crop of the original input is made: the crop has a random area (H * W) and a random aspect ratio. The crop is then resized to the given size. This technique is commonly used to train Inception networks.

Parameters:

size (int or sequence) –
Expected output size of the crop, for each edge. If size is an int rather than a sequence such as (h, w), a square output of size (size, size) is produced. If size is a sequence of length 1, it is interpreted as (size[0], size[0]).

Note

In TorchScript mode, passing size as a single int is not supported; use a sequence of length 1 instead: [size, ].
scale (tuple of float, optional) – Lower and upper bounds for the random area of the crop, before resizing. The scale is defined with respect to the area of the original image.
ratio (tuple of float, optional) – Lower and upper bounds for the random aspect ratio of the crop, before resizing.
interpolation (InterpolationMode, optional) – Desired interpolation enum defined by torchvision.transforms.InterpolationMode. Default is InterpolationMode.BILINEAR. If the input is a Tensor, only InterpolationMode.NEAREST, InterpolationMode.NEAREST_EXACT, InterpolationMode.BILINEAR and InterpolationMode.BICUBIC are supported. The corresponding Pillow integer constants (e.g. PIL.Image.BILINEAR) are accepted as well.
antialias (bool, optional) –
Whether to apply antialiasing. This only affects tensors with bilinear or bicubic modes. On PIL images, antialiasing is always applied in bilinear or bicubic modes. In all other modes (for both PIL images and tensors), antialiasing does not apply and this parameter is ignored. Possible values are:
- True: applies antialiasing in bilinear or bicubic modes. Other modes aren’t affected. This is probably what you want to use.
- False: does not apply antialiasing to tensors in any mode. PIL images are still antialiased in bilinear or bicubic modes, because PIL does not support disabling antialiasing.
- None: equivalent to False for tensors and True for PIL images. This value exists for legacy reasons and you probably don’t want to use it unless you really know what you are doing.
The current default is None, but it will change to True in v0.17 so that the PIL and Tensor backends are consistent.
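
The size and antialias rules above can be restated as a small pure-Python sketch. The helper names `normalize_size` and `effective_antialias` are hypothetical and are not part of torchvision; they only mirror the behavior described in the parameter list, and the mode is passed as a plain string for simplicity.

```python
from typing import Optional, Sequence, Tuple, Union


def normalize_size(size: Union[int, Sequence[int]]) -> Tuple[int, int]:
    """Hypothetical helper: expand `size` to (height, width) as documented."""
    if isinstance(size, int):
        # A single int yields a square output.
        return (size, size)
    if len(size) == 1:
        # A length-1 sequence is interpreted as (size[0], size[0]).
        return (size[0], size[0])
    if len(size) == 2:
        return (size[0], size[1])
    raise ValueError("size must be an int or a sequence of length 1 or 2")


def effective_antialias(antialias: Optional[bool], is_pil: bool, mode: str) -> bool:
    """Hypothetical helper: report whether antialiasing is actually applied.

    `mode` is a lowercase interpolation name such as "bilinear" or "nearest"
    (an assumption made for this sketch; the real API uses InterpolationMode).
    """
    if mode not in ("bilinear", "bicubic"):
        # Antialiasing is ignored for every other mode.
        return False
    if is_pil:
        # PIL always antialiases in bilinear and bicubic modes.
        return True
    # Tensors: True enables it; False and None both leave it off.
    return bool(antialias)
```

With torchvision installed, passing `antialias=True` explicitly, for example `RandomResizedCrop(size=(224, 224), antialias=True)`, gives the same antialiasing behavior for PIL images and tensors.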

static get_params(img: Tensor, scale: List[float], ratio: List[float]) → Tuple[int, int, int, int]

Get the parameters for a randomly sized crop.

Parameters:

img (PIL Image or Tensor) – Input image.
scale (list) – Range of the crop area, as a fraction of the original image area.
ratio (list) – Range of the aspect ratio of the crop.

Returns:

Parameters (i, j, h, w) to be passed to crop for a randomly sized crop, where (i, j) is the top-left corner of the crop and (h, w) is its height and width.

Return type:

tuple
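
To make the sampling concrete, here is a minimal standard-library sketch of how crop parameters of this kind might be drawn. It is not the library's source. The function name `sample_crop_params`, the `max_attempts` retry count, the log-uniform sampling of the aspect ratio, the width/height convention for the ratio, and the central-crop fallback are all assumptions made for illustration.

```python
import math
import random
from typing import Sequence, Tuple


def sample_crop_params(
    height: int,
    width: int,
    scale: Sequence[float] = (0.08, 1.0),
    ratio: Sequence[float] = (3.0 / 4.0, 4.0 / 3.0),
    max_attempts: int = 10,  # assumed retry budget
) -> Tuple[int, int, int, int]:
    """Hypothetical sketch: return (i, j, h, w) for a random crop."""
    area = height * width
    log_lo, log_hi = math.log(ratio[0]), math.log(ratio[1])

    for _ in range(max_attempts):
        # Random area as a fraction of the original image area.
        target_area = area * random.uniform(scale[0], scale[1])
        # Aspect ratio (width / height), sampled uniformly in log space.
        aspect = math.exp(random.uniform(log_lo, log_hi))

        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))

        if 0 < w <= width and 0 < h <= height:
            i = random.randint(0, height - h)  # top edge
            j = random.randint(0, width - w)   # left edge
            return i, j, h, w

    # Assumed fallback: a central crop with the aspect ratio clamped to range.
    in_ratio = width / height
    if in_ratio < min(ratio):
        w = width
        h = int(round(w / min(ratio)))
    elif in_ratio > max(ratio):
        h = height
        w = int(round(h * max(ratio)))
    else:
        w, h = width, height
    return (height - h) // 2, (width - w) // 2, h, w
```

The returned tuple describes the crop only. The transform then resizes that region to the requested size, so the output shape does not depend on the sampled h and w.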
