RandomCrop

class torchvision.transforms.v2.RandomCrop(size: Union[int, Sequence[int]], padding: Optional[Union[int, Sequence[int]]] = None, pad_if_needed: bool = False, fill: Union[int, float, Sequence[int], Sequence[float], None, Dict[Type, Optional[Union[int, float, Sequence[int], Sequence[float]]]]] = 0, padding_mode: Literal['constant', 'edge', 'reflect', 'symmetric'] = 'constant')[source]

[BETA] Crop the input at a random location.

Warning

The RandomCrop transform is in Beta stage, and while we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes.

If the input is a torch.Tensor or a Datapoint (e.g. Image, Video, BoundingBox, etc.), it can have an arbitrary number of leading batch dimensions. For example, an image can have [..., C, H, W] shape, and a bounding box can have [..., 4] shape.
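For illustration, a minimal usage sketch (the tensor contents, image size, and crop size below are illustrative assumptions, not part of the API):

import torch
from torchvision.transforms import v2

# Synthetic uint8 image in [..., C, H, W] layout; a PIL Image or a
# datapoints.Image would work the same way.
img = torch.randint(0, 256, (3, 256, 256), dtype=torch.uint8)

transform = v2.RandomCrop(size=224)   # int -> square crop (224, 224)
out = transform(img)
print(out.shape)                      # torch.Size([3, 224, 224])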

Parameters:
  • size (sequence or int) – Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made. If a sequence of length 1 is provided, it will be interpreted as (size[0], size[0]).

  • padding (int or sequence, optional) –

    Optional padding on each border of the image. Default is None. If a single int is provided, it is used to pad all borders. If a sequence of length 2 is provided, it is the padding on left/right and top/bottom respectively. If a sequence of length 4 is provided, it is the padding for the left, top, right and bottom borders respectively.

    Note

    In torchscript mode, padding as a single int is not supported; use a sequence of length 1: [padding, ].

  • pad_if_needed (boolean, optional) – If True, pads the image when it is smaller than the desired size instead of raising an exception. Since cropping is done after padding, the padding effectively ends up at a random offset relative to the crop.

  • fill (number or tuple or dict, optional) – Pixel fill value used when padding_mode is constant. Default is 0. If a tuple of length 3, it is used to fill the R, G, B channels respectively. The fill value can also be a dictionary mapping data type to fill value, e.g. fill={datapoints.Image: 127, datapoints.Mask: 0}, where Image will be filled with 127 and Mask will be filled with 0.

  • padding_mode (str, optional) –

    Type of padding. Should be one of constant, edge, reflect, or symmetric. Default is constant. The reflect and symmetric behaviors are verified in the sketch after this list.

    • constant: pads with a constant value; the value is specified with fill

    • edge: pads with the last value at the edge of the image.

    • reflect: pads with reflection of image without repeating the last value on the edge. For example, padding [1, 2, 3, 4] with 2 elements on both sides in reflect mode will result in [3, 2, 1, 2, 3, 4, 3, 2]

    • symmetric: pads with reflection of image repeating the last value on the edge. For example, padding [1, 2, 3, 4] with 2 elements on both sides in symmetric mode will result in [2, 1, 1, 2, 3, 4, 4, 3]
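The sketch below (assuming torchvision with the v2 namespace; all shapes and values are illustrative) verifies the reflect and symmetric sequences above and then combines the padding-related parameters:

import torch
from torchvision.transforms import v2
import torchvision.transforms.v2.functional as F

# Verify the reflect / symmetric examples above on a 1x1x4 "image row".
row = torch.tensor([[[1.0, 2.0, 3.0, 4.0]]])            # [C=1, H=1, W=4]
print(F.pad(row, padding=[2, 0], padding_mode="reflect"))
# tensor([[[3., 2., 1., 2., 3., 4., 3., 2.]]])
print(F.pad(row, padding=[2, 0], padding_mode="symmetric"))
# tensor([[[2., 1., 1., 2., 3., 4., 4., 3.]]])

# Combine the options: pad every border by 4 with fill value 127, then
# take a random 256x256 crop; pad_if_needed covers inputs that are still
# too small after the fixed padding.
crop = v2.RandomCrop(size=256, padding=4, fill=127, pad_if_needed=True)
img = torch.randint(0, 256, (3, 256, 256), dtype=torch.uint8)
print(crop(img).shape)                                  # torch.Size([3, 256, 256])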

static get_params(img: Tensor, output_size: Tuple[int, int]) → Tuple[int, int, int, int][source]

Get the parameters for a random crop of the given output size.

Parameters:
  • img (PIL Image or Tensor) – Image to be cropped.

  • output_size (tuple) – Expected output size of the crop.

Returns:

params (i, j, h, w) to be passed to crop for a random crop.

Return type:

tuple
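For illustration, a sketch (the shapes and the image/mask pairing are illustrative assumptions) that samples the crop parameters once and reuses them, e.g. to keep an image and a segmentation mask aligned:

import torch
from torchvision.transforms import v2
import torchvision.transforms.v2.functional as F

# Sample crop coordinates once, then reuse them on both inputs.
# (With v2 transforms, passing the image and mask together to a single
# RandomCrop call achieves the same alignment without get_params.)
image = torch.rand(3, 256, 256)
mask = torch.randint(0, 21, (1, 256, 256))

i, j, h, w = v2.RandomCrop.get_params(image, output_size=(224, 224))
print(F.crop(image, i, j, h, w).shape)  # torch.Size([3, 224, 224])
print(F.crop(mask, i, j, h, w).shape)   # torch.Size([1, 224, 224])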
