retinanet_resnet50_fpn

torchvision.models.detection.retinanet_resnet50_fpn(*, weights: Optional[RetinaNet_ResNet50_FPN_Weights] = None, progress: bool = True, num_classes: Optional[int] = None, weights_backbone: Optional[ResNet50_Weights] = ResNet50_Weights.IMAGENET1K_V1, trainable_backbone_layers: Optional[int] = None, **kwargs: Any) -> RetinaNet

Constructs a RetinaNet model with a ResNet-50-FPN backbone.

Warning

The detection module is in Beta stage, and backward compatibility is not guaranteed.

Reference: Focal Loss for Dense Object Detection.

The input to the model is expected to be a list of tensors, one per image, each of shape [C, H, W] with values in the range [0, 1]. Different images can have different sizes.

The behavior of the model depends on whether it is in training or evaluation mode.

During training, the model expects both the input tensors and targets (a list of dictionaries, one per image), each containing:

  • boxes (FloatTensor[N, 4]): the ground-truth boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H.

  • labels (Int64Tensor[N]): the class label for each ground-truth box

The model returns a Dict[str, Tensor] during training, containing the classification and regression losses.

During inference, the model requires only the input tensors and returns the post-processed predictions as a List[Dict[str, Tensor]], one dictionary per input image. The fields of each dictionary are as follows, where N is the number of detections:

  • boxes (FloatTensor[N, 4]): the predicted boxes in [x1, y1, x2, y2] format, with 0 <= x1 < x2 <= W and 0 <= y1 < y2 <= H.

  • labels (Int64Tensor[N]): the predicted labels for each detection

  • scores (Tensor[N]): the scores of each detection

For more details on the output, you may refer to Instance segmentation models.

Example:

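>>> import torch
>>> import torchvision
>>> from torchvision.models.detection import RetinaNet_ResNet50_FPN_Weights
>>> model = torchvision.models.detection.retinanet_resnet50_fpn(weights=RetinaNet_ResNet50_FPN_Weights.DEFAULT)
>>> model.eval()
>>> x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
>>> predictions = model(x)

Parameters: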
  • weights (RetinaNet_ResNet50_FPN_Weights, optional) – The pretrained weights to use. See RetinaNet_ResNet50_FPN_Weights below for more details, and possible values. By default, no pre-trained weights are used.

  • progress (bool) – If True, displays a progress bar of the download to stderr. Default is True.

  • num_classes (int, optional) – Number of output classes of the model (including the background).

  • weights_backbone (ResNet50_Weights, optional) – The pretrained weights for the backbone.

  • trainable_backbone_layers (int, optional) – Number of trainable (not frozen) layers, counted from the final block. Valid values are between 0 and 5, with 5 meaning all backbone layers are trainable. If None is passed (the default), this value is set to 3.

  • **kwargs – parameters passed to the torchvision.models.detection.RetinaNet base class. Please refer to the source code for more details about this class.

class torchvision.models.detection.RetinaNet_ResNet50_FPN_Weights(value)

The model builder above accepts the following values as the weights parameter. RetinaNet_ResNet50_FPN_Weights.DEFAULT is equivalent to RetinaNet_ResNet50_FPN_Weights.COCO_V1. You can also use strings, e.g. weights='DEFAULT' or weights='COCO_V1'.

RetinaNet_ResNet50_FPN_Weights.COCO_V1:

These weights were produced with a training recipe similar to the one in the paper. Also available as RetinaNet_ResNet50_FPN_Weights.DEFAULT.

box_map (on COCO-val2017): 36.4
categories: __background__, person, bicycle, … (88 omitted)
min_size: height=1, width=1
num_params: 34014999
recipe: link
GFLOPS: 151.54
File size: 130.3 MB

The inference transforms are available at RetinaNet_ResNet50_FPN_Weights.COCO_V1.transforms and perform the following preprocessing operations: Accepts PIL.Image, batched (B, C, H, W) and single (C, H, W) image torch.Tensor objects. The images are rescaled to [0.0, 1.0].
