torchvision
This library is part of the PyTorch project. PyTorch is an open source machine learning framework.
Features described in this documentation are classified by release status:
Stable: These features will be maintained long-term and there should generally be no major performance limitations or gaps in documentation. We also expect to maintain backwards compatibility (although breaking changes can happen and notice will be given one release ahead of time).
Beta: Features are tagged as Beta because the API may change based on user feedback, because the performance needs to improve, or because coverage across operators is not yet complete. For Beta features, we are committing to seeing the feature through to the Stable classification. We are not, however, committing to backwards compatibility.
Prototype: These features are typically not available as part of binary distributions like PyPI or Conda, except sometimes behind run-time flags, and are at an early stage for feedback and testing.
The torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision.
- Transforming and augmenting images
- Models and pre-trained weights
- Datasets
- Utils
- Operators
  - batched_nms
  - box_area
  - box_convert
  - box_iou
  - clip_boxes_to_image
  - deform_conv2d
  - generalized_box_iou
  - generalized_box_iou_loss
  - masks_to_boxes
  - nms
  - ps_roi_align
  - ps_roi_pool
  - remove_small_boxes
  - roi_align
  - roi_pool
  - sigmoid_focal_loss
  - stochastic_depth
  - RoIAlign
  - PSRoIAlign
  - RoIPool
  - PSRoIPool
  - DeformConv2d
  - MultiScaleRoIAlign
  - FeaturePyramidNetwork
  - StochasticDepth
  - FrozenBatchNorm2d
  - SqueezeExcitation
- Reading/Writing images and videos
- Feature extraction for model inspection
- torchvision.get_video_backend()
Returns the currently active video backend used to decode videos.
- Returns
Name of the video backend, one of {‘pyav’, ‘video_reader’}.
- Return type
str
- torchvision.set_image_backend(backend)
Specifies the package used to load images.
- Parameters
backend (string) – Name of the image backend, one of {‘PIL’, ‘accimage’}. The accimage package uses the Intel IPP library. It is generally faster than PIL, but does not support as many operations.
- torchvision.set_video_backend(backend)
Specifies the package used to decode videos.
- Parameters
backend (string) – Name of the video backend, one of {‘pyav’, ‘video_reader’}. The pyav package uses the third-party PyAV library, a Pythonic binding for the FFmpeg libraries. The video_reader package includes a native C++ implementation on top of the FFmpeg libraries and a Python API exposed as a TorchScript custom operator. It generally decodes faster than pyav, but is perhaps less robust.
Note
Building with FFmpeg is disabled by default on the latest main branch. To use the ‘video_reader’ backend, compile torchvision from source.