Operators
torchvision.ops implements operators, losses and layers that are specific to computer vision.
Note
All operators have native support for TorchScript.
Detection and Segmentation Operators
The operators below perform the pre-processing and post-processing required by object detection and segmentation models.
|
Performs non-maximum suppression in a batched fashion. |
|
Compute the bounding boxes around the provided masks. |
|
Performs non-maximum suppression (NMS) on the boxes according to their intersection-over-union (IoU). |
|
Performs Region of Interest (RoI) Align operator with average pooling, as described in Mask R-CNN. |
|
Performs Region of Interest (RoI) Pool operator described in Fast R-CNN. |
|
Performs Position-Sensitive Region of Interest (RoI) Align operator mentioned in Light-Head R-CNN. |
|
Performs Position-Sensitive Region of Interest (RoI) Pool operator described in R-FCN. |
|
Module that adds an FPN on top of a set of feature maps. |
|
Multi-scale RoIAlign pooling, which is useful for detection with or without FPN. |
|
See |
|
See |
|
See |
|
See |
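As an illustration of the non-maximum suppression rows above, here is a minimal pure-Python sketch of greedy NMS and of its batched (per-category) variant. The helper names `iou`, `greedy_nms` and `batched_nms_sketch` are hypothetical and are not torchvision's API; the sketch assumes boxes in (x1, y1, x2, y2) format and only shows the selection logic, not an optimized implementation.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float) -> List[int]:
    """Return indices of kept boxes, ordered by decreasing score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # A box survives only if it does not overlap any already-kept,
        # higher-scoring box by more than the threshold.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def batched_nms_sketch(boxes: Sequence[Box], scores: Sequence[float],
                       idxs: Sequence[int], iou_threshold: float) -> List[int]:
    """Same as greedy_nms, but boxes only suppress boxes of the same category."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(idxs[j] != idxs[i] or iou(boxes[i], boxes[j]) <= iou_threshold
               for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, iou_threshold=0.5)
kept_per_class = batched_nms_sketch(boxes, scores, idxs=[0, 1, 0], iou_threshold=0.5)
```

In this example the first two boxes overlap with an IoU of about 0.68, so plain NMS drops the lower-scoring one, while the batched variant keeps both because they are assigned to different categories.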
Box Operators
These utility functions perform various operations on bounding boxes.
|
Computes the area of a set of bounding boxes, which are specified by their (x1, y1, x2, y2) coordinates. |
|
Converts boxes from given in_fmt to out_fmt. |
|
Return intersection-over-union (Jaccard index) between two sets of boxes. |
|
Clip boxes so that they lie inside an image of size size. |
|
Return complete intersection-over-union (Jaccard index) between two sets of boxes. |
|
Return distance intersection-over-union (Jaccard index) between two sets of boxes. |
|
Return generalized intersection-over-union (Jaccard index) between two sets of boxes. |
|
Remove boxes that have at least one side smaller than min_size. |
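The box utilities above can be sketched on single boxes in plain Python. This is only an illustration under stated assumptions: boxes are (x1, y1, x2, y2) tuples, the image size is given as (height, width), and every helper name below is hypothetical rather than part of torchvision.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of an (x1, y1, x2, y2) box."""
    return (box[2] - box[0]) * (box[3] - box[1])


def xyxy_to_cxcywh(box: Box) -> Box:
    """Convert corner format to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def cxcywh_to_xyxy(box: Box) -> Box:
    """Convert (center_x, center_y, width, height) back to corner format."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def clip_to_image(box: Box, size: Tuple[float, float]) -> Box:
    """Clamp a box so it lies inside an image of size (height, width)."""
    height, width = size
    x1, y1, x2, y2 = box
    return (min(max(x1, 0), width), min(max(y1, 0), height),
            min(max(x2, 0), width), min(max(y2, 0), height))


def pair_iou(a: Box, b: Box) -> float:
    """Intersection-over-union (Jaccard index) of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def indices_of_large_boxes(boxes: Sequence[Box], min_size: float) -> List[int]:
    """Indices of boxes whose width and height are both at least min_size."""
    return [i for i, (x1, y1, x2, y2) in enumerate(boxes)
            if (x2 - x1) >= min_size and (y2 - y1) >= min_size]


clipped = clip_to_image((-5, 2, 120, 50), size=(40, 100))
overlap = pair_iou((0, 0, 10, 10), (5, 5, 15, 15))
large = indices_of_large_boxes([(0, 0, 10, 10), (0, 0, 1, 10), (0, 0, 10, 0.5)], 2)
```

The tensor-based operators listed above work on whole sets of boxes at once; the per-box functions here are only meant to make the arithmetic explicit.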
Losses
The following vision-specific loss functions are implemented:
|
Gradient-friendly IoU loss with an additional penalty that is non-zero when the boxes do not overlap. |
|
Gradient-friendly IoU loss with an additional penalty that is non-zero when the distance between boxes’ centers isn’t zero. |
|
Gradient-friendly IoU loss with an additional penalty that is non-zero when the boxes do not overlap and scales with the size of their smallest enclosing box. |
|
Loss used in RetinaNet for dense detection: https://arxiv.org/abs/1708.02002. |
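To make two of the losses above concrete, the following sketch computes a generalized IoU loss for one pair of boxes and a sigmoid focal loss for one logit. It assumes (x1, y1, x2, y2) boxes, a binary target in {0, 1}, and the commonly used defaults alpha = 0.25 and gamma = 2; the function names are hypothetical and the code is a scalar illustration, not the library's implementation.

```python
import math
from typing import Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def giou_loss_sketch(a: Box, b: Box, eps: float = 1e-7) -> float:
    """1 - GIoU, where GIoU = IoU - (enclosing_area - union) / enclosing_area."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / (union + eps)
    # Smallest axis-aligned box that encloses both inputs.
    enclosing = ((max(a[2], b[2]) - min(a[0], b[0]))
                 * (max(a[3], b[3]) - min(a[1], b[1])))
    giou = iou - (enclosing - union) / (enclosing + eps)
    return 1.0 - giou


def focal_loss_sketch(logit: float, target: float,
                      alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Focal loss for one prediction: alpha_t * (1 - p_t)**gamma * CE.

    Written for moderate logits; no numerical-stability tricks are applied.
    """
    p = 1.0 / (1.0 + math.exp(-logit))
    ce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    p_t = p * target + (1.0 - p) * (1.0 - target)
    alpha_t = alpha * target + (1.0 - alpha) * (1.0 - target)
    return alpha_t * (1.0 - p_t) ** gamma * ce


same = giou_loss_sketch((0, 0, 10, 10), (0, 0, 10, 10))
near = giou_loss_sketch((0, 0, 10, 10), (20, 0, 30, 10))
far = giou_loss_sketch((0, 0, 10, 10), (40, 0, 50, 10))
easy = focal_loss_sketch(4.0, 1.0)
hard = focal_loss_sketch(-4.0, 1.0)
```

For non-overlapping boxes the plain IoU is zero regardless of distance, whereas the GIoU loss keeps growing as the boxes move apart (`far > near`), which is what makes it gradient-friendly. The focal term `(1 - p_t)**gamma` shrinks the contribution of well-classified examples (`easy`) relative to misclassified ones (`hard`).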
Layers
TorchVision provides commonly used building blocks as layers:
|
Configurable block used for Convolution2d-Normalization-Activation blocks. |
|
Configurable block used for Convolution3d-Normalization-Activation blocks. |
|
See |
|
See |
|
See |
|
BatchNorm2d where the batch statistics and the affine parameters are fixed. |
|
This block implements the multi-layer perceptron (MLP) module. |
|
This module returns a view of the tensor input with its dimensions permuted. |
|
This block implements the Squeeze-and-Excitation block from https://arxiv.org/abs/1709.01507 (see Fig. |
|
See |
|
Performs Deformable Convolution v2, described in Deformable ConvNets v2: More Deformable, Better Results if |
|
Implements DropBlock2d from “DropBlock: A regularization method for convolutional networks” (https://arxiv.org/abs/1810.12890). |
|
Implements DropBlock3d from “DropBlock: A regularization method for convolutional networks” (https://arxiv.org/abs/1810.12890). |
|
Implements Stochastic Depth from “Deep Networks with Stochastic Depth”, used for randomly dropping residual branches of residual architectures. |
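The idea behind stochastic depth can be sketched without tensors. The function below is a hypothetical illustration, assuming a per-sample ("row") drop where each sample's residual branch is zeroed with probability p and the surviving branches are rescaled by 1 / (1 - p); it is not the layer's actual implementation.

```python
import random
from typing import List, Optional


def stochastic_depth_sketch(branch: List[List[float]], p: float,
                            training: bool = True,
                            rng: Optional[random.Random] = None) -> List[List[float]]:
    """Randomly zero whole samples of a residual branch output.

    branch: one list of activations per sample in the batch.
    p: probability of dropping a sample's branch.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if not training or p == 0.0:
        # Identity at inference time or when nothing is dropped.
        return [list(sample) for sample in branch]
    rng = rng or random.Random()
    survival = 1.0 - p
    out = []
    for sample in branch:
        keep = rng.random() < survival
        # Rescale survivors so the expected value of the output is unchanged.
        scale = 1.0 / survival if keep else 0.0
        out.append([v * scale for v in sample])
    return out


# Usage inside a residual connection: output = shortcut + dropped branch.
shortcut = [[1.0, 1.0], [1.0, 1.0]]
branch = [[1.0, 2.0], [1.0, 2.0]]
dropped = stochastic_depth_sketch(branch, p=0.5, rng=random.Random(0))
output = [[s + b for s, b in zip(s_row, b_row)]
          for s_row, b_row in zip(shortcut, dropped)]
```

When a sample's branch is dropped, that sample passes through the block on the shortcut alone, which is how the technique shortens the effective depth of the network during training.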