RandomPerspective

class torchvision.transforms.v2.RandomPerspective(distortion_scale: float = 0.5, p: float = 0.5, interpolation: Union[InterpolationMode, int] = InterpolationMode.BILINEAR, fill: Union[int, float, Sequence[int], Sequence[float], None, Dict[Union[Type, str], Optional[Union[int, float, Sequence[int], Sequence[float]]]]] = 0)[source]

[BETA] Perform a random perspective transformation of the input with a given probability.

Note

The RandomPerspective transform is in Beta stage, and while we do not expect disruptive breaking changes, some APIs may slightly change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753.

If the input is a torch.Tensor or a TVTensor (e.g. Image, Video, BoundingBoxes etc.) it can have arbitrary number of leading batch dimensions. For example, the image can have [..., C, H, W] shape. A bounding box can have [..., 4] shape.

Parameters:

distortion_scale (float, optional) – argument to control the degree of distortion and ranges from 0 to 1. Default is 0.5.
p (float, optional) – probability of the input being transformed. Default is 0.5.
interpolation (InterpolationMode, optional) – Desired interpolation enum defined by torchvision.transforms.InterpolationMode. Default is InterpolationMode.BILINEAR. If input is Tensor, only InterpolationMode.NEAREST, InterpolationMode.BILINEAR are supported. The corresponding Pillow integer constants, e.g. PIL.Image.BILINEAR are accepted as well.
fill (number or tuple or dict, optional) – Pixel fill value used when the padding_mode is constant. Default is 0. If a tuple of length 3, it is used to fill R, G, B channels respectively. Fill value can be also a dictionary mapping data type to the fill value, e.g. fill={tv_tensors.Image: 127, tv_tensors.Mask: 0} where Image will be filled with 127 and Mask will be filled with 0.

Examples using RandomPerspective:

Illustration of transforms

static get_params(width: int, height: int, distortion_scale: float) → Tuple[List[List[int]], List[List[int]]][source]

Get parameters for perspective for a random perspective transform.

Parameters:

width (int) – width of the image.
height (int) – height of the image.
distortion_scale (float) – argument to control the degree of distortion and ranges from 0 to 1.

Returns:

List containing [top-left, top-right, bottom-right, bottom-left] of the original image, List containing [top-left, top-right, bottom-right, bottom-left] of the transformed image.

RandomPerspective

Docs

Tutorials

Resources