ElasticTransform
- class torchvision.transforms.v2.ElasticTransform(alpha: Union[float, Sequence[float]] = 50.0, sigma: Union[float, Sequence[float]] = 5.0, interpolation: Union[InterpolationMode, int] = InterpolationMode.BILINEAR, fill: Union[int, float, Sequence[int], Sequence[float], None, Dict[Union[Type, str], Optional[Union[int, float, Sequence[int], Sequence[float]]]]] = 0)[source]
Transform the input with elastic transformations.
If the input is a torch.Tensor or a TVTensor (e.g. Image, Video, BoundingBoxes, etc.) it can have an arbitrary number of leading batch dimensions. For example, the image can have [..., C, H, W] shape. A bounding box can have [..., 4] shape.
Given alpha and sigma, it will generate displacement vectors for all pixels based on random offsets. Alpha controls the strength and sigma controls the smoothness of the displacements. The displacements are added to an identity grid and the resulting grid is used to transform the input.
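For intuition, here is a minimal sketch of that construction, assuming a float image tensor of shape [C, H, W]. The helper elastic_sketch and its exact scaling are illustrative only and are not torchvision's internal implementation; the library's own normalization of alpha and sigma may differ:

```python
import torch
import torch.nn.functional as F
from torchvision.transforms.v2.functional import gaussian_blur

def elastic_sketch(img, alpha=50.0, sigma=5.0):
    # img: float tensor of shape [C, H, W]
    _, h, w = img.shape
    # Random per-pixel offsets in [-1, 1], smoothed with a Gaussian (sigma)
    # and scaled by alpha; expressed here in normalized grid coordinates.
    disp = torch.rand(2, h, w) * 2 - 1
    kernel = int(8 * sigma + 1) | 1                      # odd kernel size
    disp = gaussian_blur(disp, kernel_size=[kernel, kernel], sigma=[sigma, sigma])
    disp = disp * alpha / torch.tensor([w, h]).view(2, 1, 1)
    # Identity sampling grid spanning [-1, 1] x [-1, 1].
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    identity = torch.stack([xs, ys], dim=-1)             # [H, W, 2], (x, y) order
    grid = identity + disp.permute(1, 2, 0)              # perturbed grid
    # Sample the input on the displaced grid.
    return F.grid_sample(img[None], grid[None], mode="bilinear",
                         padding_mode="zeros", align_corners=False)[0]
```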
Note
The implementation for transforming bounding boxes is approximate (not exact). We construct an approximation of the inverse grid as inverse_grid = identity - displacement. This is not an exact inverse of the grid used to transform images, i.e. grid = identity + displacement. Our assumption is that displacement * displacement is small and can be ignored. Large displacements would lead to large errors in the approximation (see the short numeric sketch after the Applications item below).
- Applications:
Randomly transforms the morphology of objects in images and produces a see-through-water-like effect.
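As a quick, hypothetical sanity check of the assumption in the note above (not part of the torchvision API), consider a 1-D analogue: warp points forward with a small, smooth displacement, then apply the approximate inverse by subtracting the same displacement evaluated at the warped positions. The residual error is on the order of the displacement times its gradient, which is tiny when both are small:

```python
import torch

x = torch.linspace(0, 1, 100)
disp = lambda t: 0.01 * torch.sin(2 * torch.pi * t)   # small, smooth displacement
warped = x + disp(x)                                  # forward warp: identity + displacement
recovered = warped - disp(warped)                     # approximate inverse: identity - displacement
print((recovered - x).abs().max())                    # roughly 6e-4, i.e. second-order small
```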
- Parameters:
alpha (float or sequence of floats, optional) – Magnitude of displacements. Default is 50.0.
sigma (float or sequence of floats, optional) – Smoothness of displacements. Default is 5.0.
interpolation (InterpolationMode, optional) – Desired interpolation enum defined by torchvision.transforms.InterpolationMode. Default is InterpolationMode.BILINEAR. If the input is a Tensor, only InterpolationMode.NEAREST and InterpolationMode.BILINEAR are supported. The corresponding Pillow integer constants, e.g. PIL.Image.BILINEAR, are accepted as well.
fill (number or tuple or dict, optional) – Pixel fill value used when the padding_mode is constant. Default is 0. If a tuple of length 3, it is used to fill the R, G, B channels respectively. The fill value can also be a dictionary mapping data type to fill value, e.g. fill={tv_tensors.Image: 127, tv_tensors.Mask: 0}, where Image will be filled with 127 and Mask will be filled with 0.
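A minimal usage sketch, assuming a float image tensor; the alpha, sigma, and fill values below are just illustrative choices:

```python
import torch
from torchvision.transforms import v2
from torchvision import tv_tensors

img = torch.rand(3, 224, 224)                 # float image in [0, 1], shape [C, H, W]
transform = v2.ElasticTransform(alpha=50.0, sigma=5.0)
out = transform(img)                          # same shape as the input

# Per-type fill, as described above, e.g. when also transforming masks.
transform_with_fill = v2.ElasticTransform(
    alpha=50.0, sigma=5.0, fill={tv_tensors.Image: 127, tv_tensors.Mask: 0}
)
```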
Examples using ElasticTransform:
- Illustration of transforms