Operators

torchvision.ops implements operators that are specific to computer vision.

Note

All operators have native support for TorchScript.

`batched_nms`(boxes, scores, idxs, iou_threshold)	Performs non-maximum suppression in a batched fashion.
`box_area`(boxes)	Computes the area of a set of bounding boxes, which are specified by their (x1, y1, x2, y2) coordinates.
`box_convert`(boxes, in_fmt, out_fmt)	Converts boxes from given in_fmt to out_fmt.
`box_iou`(boxes1, boxes2)	Return intersection-over-union (Jaccard index) between two sets of boxes.
`clip_boxes_to_image`(boxes, size)	Clip boxes so that they lie inside an image of size size.
`deform_conv2d`(input, offset, weight[, bias, …])	Performs Deformable Convolution v2, described in Deformable ConvNets v2: More Deformable, Better Results, if `mask` is not `None`; otherwise performs Deformable Convolution, described in Deformable Convolutional Networks.
`generalized_box_iou`(boxes1, boxes2)	Return generalized intersection-over-union (Jaccard index) between two sets of boxes.
`generalized_box_iou_loss`(boxes1, boxes2[, …])	Original implementation from https://github.com/facebookresearch/fvcore/blob/bfff2ef/fvcore/nn/giou_loss.py
`masks_to_boxes`(masks)	Compute the bounding boxes around the provided masks.
`nms`(boxes, scores, iou_threshold)	Performs non-maximum suppression (NMS) on the boxes according to their intersection-over-union (IoU).
`ps_roi_align`(input, boxes, output_size[, …])	Performs Position-Sensitive Region of Interest (RoI) Align operator mentioned in Light-Head R-CNN.
`ps_roi_pool`(input, boxes, output_size[, …])	Performs Position-Sensitive Region of Interest (RoI) Pool operator described in R-FCN.
`remove_small_boxes`(boxes, min_size)	Remove boxes that have at least one side smaller than min_size.
`roi_align`(input, boxes, output_size[, …])	Performs Region of Interest (RoI) Align operator with average pooling, as described in Mask R-CNN.
`roi_pool`(input, boxes, output_size[, …])	Performs Region of Interest (RoI) Pool operator described in Fast R-CNN.
`sigmoid_focal_loss`(inputs, targets[, alpha, …])	Original implementation from https://github.com/facebookresearch/fvcore/blob/master/fvcore/nn/focal_loss.py.
`stochastic_depth`(input, p, mode[, training])	Implements the Stochastic Depth from “Deep Networks with Stochastic Depth” used for randomly dropping residual branches of residual architectures.

`RoIAlign`(output_size, spatial_scale, …)	See `roi_align()`.
`PSRoIAlign`(output_size, spatial_scale, …)	See `ps_roi_align()`.
`RoIPool`(output_size, spatial_scale)	See `roi_pool()`.
`PSRoIPool`(output_size, spatial_scale)	See `ps_roi_pool()`.
`DeformConv2d`(in_channels, out_channels, …)	See `deform_conv2d()`.
`MultiScaleRoIAlign`(featmap_names, …)	Multi-scale RoIAlign pooling, which is useful for detection with or without FPN.
`FeaturePyramidNetwork`(in_channels_list, …)	Module that adds an FPN on top of a set of feature maps.
`StochasticDepth`(p, mode)	See `stochastic_depth()`.
`FrozenBatchNorm2d`(num_features, eps)	BatchNorm2d where the batch statistics and the affine parameters are fixed.
`SqueezeExcitation`(input_channels, …)	This block implements the Squeeze-and-Excitation block from https://arxiv.org/abs/1709.01507 (see Fig.
