RandomPerspective

class torchvision.transforms.v2.RandomPerspective(distortion_scale: float = 0.5, p: float = 0.5, interpolation: Union[InterpolationMode, int] = InterpolationMode.BILINEAR, fill: Union[int, float, Sequence[int], Sequence[float], None, Dict[Type, Optional[Union[int, float, Sequence[int], Sequence[float]]]]] = 0)

[BETA] Perform a random perspective transformation of the input with a given probability.

Warning

The RandomPerspective transform is in Beta stage, and while we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes.

If the input is a torch.Tensor or a Datapoint (e.g. Image, Video, BoundingBox, etc.), it can have an arbitrary number of leading batch dimensions. For example, an image can have shape [..., C, H, W], and a bounding box can have shape [..., 4].

Parameters:

distortion_scale (float, optional) – controls the degree of distortion; must be in the range 0 to 1. Default is 0.5.
p (float, optional) – probability of the input being transformed. Default is 0.5.
interpolation (InterpolationMode, optional) – Desired interpolation enum defined by torchvision.transforms.InterpolationMode. Default is InterpolationMode.BILINEAR. If the input is a Tensor, only InterpolationMode.NEAREST and InterpolationMode.BILINEAR are supported. The corresponding Pillow integer constants, e.g. PIL.Image.BILINEAR, are accepted as well.
fill (number or tuple or dict, optional) – Pixel fill value for the area outside the transformed image. Default is 0. If a tuple of length 3, it is used to fill the R, G, B channels respectively. The fill value can also be a dictionary mapping data type to fill value, e.g. fill={datapoints.Image: 127, datapoints.Mask: 0}, where Image will be filled with 127 and Mask will be filled with 0.

static get_params(width: int, height: int, distortion_scale: float) → Tuple[List[List[int]], List[List[int]]]

Get the parameters for a random perspective transform.

Parameters:

width (int) – width of the image.
height (int) – height of the image.
distortion_scale (float) – controls the degree of distortion; must be in the range 0 to 1.

Returns:

A tuple of two lists: the first contains the [top-left, top-right, bottom-right, bottom-left] corner coordinates of the original image, and the second contains the same four corners of the transformed image.
