RandomCrop
- class torchvision.transforms.v2.RandomCrop(size: Union[int, Sequence[int]], padding: Optional[Union[int, Sequence[int]]] = None, pad_if_needed: bool = False, fill: Union[int, float, Sequence[int], Sequence[float], None, Dict[Type, Optional[Union[int, float, Sequence[int], Sequence[float]]]]] = 0, padding_mode: Literal['constant', 'edge', 'reflect', 'symmetric'] = 'constant')
[BETA] Crop the input at a random location.
Warning
The RandomCrop transform is in Beta stage, and while we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes.
If the input is a torch.Tensor or a Datapoint (e.g. Image, Video, BoundingBox, etc.), it can have an arbitrary number of leading batch dimensions. For example, an image can have shape [..., C, H, W], and a bounding box can have shape [..., 4].
- Parameters:
size (sequence or int) – Desired output size of the crop. If size is an int instead of a sequence like (h, w), a square crop (size, size) is made. If a sequence of length 1 is provided, it is interpreted as (size[0], size[0]).
padding (int or sequence, optional) –
Optional padding on each border of the image. Default is None. If a single int is provided, it is used to pad all borders. If a sequence of length 2 is provided, it is the padding on left/right and top/bottom respectively. If a sequence of length 4 is provided, it is the padding for the left, top, right and bottom borders respectively.
Note
In torchscript mode, padding as a single int is not supported; use a sequence of length 1: [padding, ].
pad_if_needed (boolean, optional) – Whether to pad the image if it is smaller than the desired size, to avoid raising an exception. Default is False. Since cropping is done after padding, the padding appears to be applied at a random offset.
fill (number or tuple or dict, optional) – Pixel fill value used when the padding_mode is constant. Default is 0. If a tuple of length 3, it is used to fill the R, G, B channels respectively. The fill value can also be a dictionary mapping data type to fill value, e.g. fill={datapoints.Image: 127, datapoints.Mask: 0}, where Image will be filled with 127 and Mask will be filled with 0.
padding_mode (str, optional) –
Type of padding. Should be: constant, edge, reflect or symmetric. Default is constant.
constant: pads with a constant value; this value is specified with fill.
edge: pads with the last value at the edge of the image.
reflect: pads with a reflection of the image without repeating the last value on the edge. For example, padding [1, 2, 3, 4] with 2 elements on both sides in reflect mode will result in [3, 2, 1, 2, 3, 4, 3, 2].
symmetric: pads with a reflection of the image, repeating the last value on the edge. For example, padding [1, 2, 3, 4] with 2 elements on both sides in symmetric mode will result in [2, 1, 1, 2, 3, 4, 4, 3].
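The padding modes and the pad-then-crop order described above can be sketched in one dimension with plain Python lists. This is an illustrative sketch only, not torchvision's implementation: the helper names pad_1d and random_crop_1d are hypothetical, and padding both sides by the full size deficit when pad_if_needed is set is an assumption about the behavior.

```python
import random


def pad_1d(values, pad, mode="constant", fill=0):
    """Pad a 1-D list by `pad` elements on both sides (hypothetical helper)."""
    values = list(values)
    n = len(values)
    if mode == "constant":
        left = [fill] * pad
        right = [fill] * pad
    elif mode == "edge":
        # Repeat the first and last values.
        left = [values[0]] * pad
        right = [values[-1]] * pad
    elif mode == "reflect":
        # Mirror around the edge value without repeating it (needs pad < n).
        left = [values[i] for i in range(pad, 0, -1)]
        right = [values[n - 1 - i] for i in range(1, pad + 1)]
    elif mode == "symmetric":
        # Mirror and repeat the edge value (needs pad <= n).
        left = [values[i] for i in range(pad - 1, -1, -1)]
        right = [values[n - 1 - i] for i in range(pad)]
    else:
        raise ValueError(f"unknown padding mode: {mode}")
    return left + values + right


def random_crop_1d(values, size, padding=None, pad_if_needed=False,
                   mode="constant", fill=0, rng=random):
    """Pad first, then crop `size` elements at a random start (hypothetical helper)."""
    values = list(values)
    if padding is not None:
        values = pad_1d(values, padding, mode, fill)
    if pad_if_needed and len(values) < size:
        # Assumption: pad both sides by the full deficit.
        values = pad_1d(values, size - len(values), mode, fill)
    if len(values) < size:
        raise ValueError("input is smaller than the requested crop size")
    start = rng.randint(0, len(values) - size)  # inclusive on both ends
    return values[start:start + size]


print(pad_1d([1, 2, 3, 4], 2, mode="reflect"))    # [3, 2, 1, 2, 3, 4, 3, 2]
print(pad_1d([1, 2, 3, 4], 2, mode="symmetric"))  # [2, 1, 1, 2, 3, 4, 4, 3]
print(random_crop_1d([7, 8, 9], 5, pad_if_needed=True))
```

In the last call, the 3-element input is first padded to 7 elements and then cropped to 5, so the original values land at one of three possible offsets in the output. This is why the padding appears to be applied at a random offset.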