
ElasticTransform

class torchvision.transforms.v2.ElasticTransform(alpha: Union[float, Sequence[float]] = 50.0, sigma: Union[float, Sequence[float]] = 5.0, interpolation: Union[InterpolationMode, int] = InterpolationMode.BILINEAR, fill: Union[int, float, Sequence[int], Sequence[float], None, Dict[Type, Optional[Union[int, float, Sequence[int], Sequence[float]]]]] = 0)[source]

[BETA] Transform the input with elastic transformations.

Warning

The ElasticTransform transform is in Beta stage, and while we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes.

If the input is a torch.Tensor or a Datapoint (e.g. Image, Video, BoundingBox, etc.), it can have an arbitrary number of leading batch dimensions. For example, the image can have [..., C, H, W] shape. A bounding box can have [..., 4] shape.

Given alpha and sigma, it will generate displacement vectors for all pixels based on random offsets. Alpha controls the strength and sigma controls the smoothness of the displacements. The displacements are added to an identity grid and the resulting grid is used to transform the input.
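For illustration, a minimal usage sketch (not taken from the official docs; the input shape and values are arbitrary):

>>> import torch
>>> from torchvision.transforms import v2
>>> # Apply the transform to a single [C, H, W] uint8 image tensor
>>> transform = v2.ElasticTransform(alpha=50.0, sigma=5.0)
>>> img = torch.randint(0, 256, (3, 224, 224), dtype=torch.uint8)
>>> out = transform(img)
>>> out.shape
torch.Size([3, 224, 224])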

Note

The implementation used to transform bounding boxes is approximate (not exact). We construct an approximation of the inverse grid as inverse_grid = identity - displacement. This is not an exact inverse of the grid used to transform images, i.e. grid = identity + displacement. Our assumption is that displacement * displacement is small and can be ignored. Large displacements would lead to large errors in the approximation.
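As a rough numerical illustration of this assumption (a sketch, not torchvision internals): for a small, smooth displacement d, applying the forward warp x + d(x) and then the approximate inverse y - d(y) recovers the original points up to an error that is second order in the displacement.

>>> import torch
>>> x = torch.linspace(-1.0, 1.0, steps=9)
>>> d = lambda t: 0.05 * torch.sin(torch.pi * t)  # small, smooth displacement
>>> y = x + d(x)      # forward grid: identity + displacement
>>> x_rec = y - d(y)  # approximate inverse: identity - displacement
>>> torch.allclose(x_rec, x, atol=1e-2)
True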

Applications:

Randomly transforms the morphology of objects in images and produces a see-through-water-like effect.

Parameters:
  • alpha (float or sequence of floats, optional) – Magnitude of displacements. Default is 50.0.

  • sigma (float or sequence of floats, optional) – Smoothness of displacements. Default is 5.0.

  • interpolation (InterpolationMode, optional) – Desired interpolation enum defined by torchvision.transforms.InterpolationMode. Default is InterpolationMode.BILINEAR. If input is Tensor, only InterpolationMode.NEAREST, InterpolationMode.BILINEAR are supported. The corresponding Pillow integer constants, e.g. PIL.Image.BILINEAR are accepted as well.

  • fill (number or tuple or dict, optional) – Pixel fill value used when the padding_mode is constant. Default is 0. If a tuple of length 3, it is used to fill the R, G, B channels respectively. The fill value can also be a dictionary mapping data type to the fill value, e.g. fill={datapoints.Image: 127, datapoints.Mask: 0}, where Image will be filled with 127 and Mask will be filled with 0.
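A sketch of configuring these parameters together (illustrative only; the shapes and fill values are arbitrary):

>>> import torch
>>> from torchvision import datapoints
>>> from torchvision.transforms import v2, InterpolationMode
>>> # Fill out-of-image pixels with 127 for images and 0 for masks
>>> transform = v2.ElasticTransform(
...     alpha=75.0,
...     sigma=8.0,
...     interpolation=InterpolationMode.BILINEAR,
...     fill={datapoints.Image: 127, datapoints.Mask: 0},
... )
>>> img = datapoints.Image(torch.randint(0, 256, (3, 128, 128), dtype=torch.uint8))
>>> mask = datapoints.Mask(torch.zeros((1, 128, 128), dtype=torch.uint8))
>>> img_out, mask_out = transform(img, mask)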
