Training Techniques | Weight Decay, SGD with Momentum |
---|---|
Architecture | RPN, RoIPool, FPN, Feedforward Network, 1x1 Convolution, Batch Normalization, Convolution, Dense Connections, Depthwise Separable Convolution, Dropout, Global Average Pooling, Hard Swish, Inverted Residual Block, Residual Connection, ReLU, Softmax, Squeeze-and-Excitation Block |
ID | fasterrcnn_mobilenet_v3_large_320_fpn |
Training Techniques | Weight Decay, SGD with Momentum |
---|---|
Architecture | RPN, RoIPool, FPN, Feedforward Network, 1x1 Convolution, Batch Normalization, Convolution, Dense Connections, Depthwise Separable Convolution, Dropout, Global Average Pooling, Hard Swish, Inverted Residual Block, Residual Connection, ReLU, Softmax, Squeeze-and-Excitation Block |
ID | fasterrcnn_mobilenet_v3_large_fpn |
Training Techniques | Weight Decay, SGD with Momentum |
---|---|
Architecture | RPN, RoIPool, FPN, Feedforward Network, 1x1 Convolution, Bottleneck Residual Block, Batch Normalization, Convolution, Global Average Pooling, Residual Block, Residual Connection, ReLU, Max Pooling, Softmax, Non Maximum Suppression |
ID | fasterrcnn_resnet50_fpn |
Faster R-CNN is an object detection model that improves on Fast R-CNN by utilising a region proposal network (RPN) with the CNN model. The RPN shares full-image convolutional features with the detection network, enabling nearly cost-free region proposals. It is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. RPN and Fast R-CNN are merged into a single network by sharing their convolutional features: the RPN component tells the unified network where to look.
As a whole, Faster R-CNN consists of two modules. The first module is a deep fully convolutional network that proposes regions, and the second module is the Fast R-CNN detector that uses the proposed regions.
To load a pretrained model:
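```python
import torchvision.models as models

# Downloads COCO-pretrained weights on first use.
# Note: torchvision 0.13 and later deprecate `pretrained=True` in favour of `weights="DEFAULT"`.
fasterrcnn_resnet50_fpn = models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
```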
Replace the model name with the variant you want to use, e.g. `fasterrcnn_resnet50_fpn`. You can find the IDs in the model summaries at the top of this page.
To evaluate the model, use the object detection recipes from the library.
To train a new model from scratch, you can follow the torchvision recipe on GitHub.
@article{DBLP:journals/corr/RenHG015,
author = {Shaoqing Ren and
Kaiming He and
Ross B. Girshick and
Jian Sun},
title = {Faster {R-CNN:} Towards Real-Time Object Detection with Region Proposal
Networks},
journal = {CoRR},
volume = {abs/1506.01497},
year = {2015},
url = {http://arxiv.org/abs/1506.01497},
archivePrefix = {arXiv},
eprint = {1506.01497},
timestamp = {Mon, 13 Aug 2018 16:46:02 +0200},
biburl = {https://dblp.org/rec/journals/corr/RenHG015.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
If you use the MobileNet V3 backbone:
@article{DBLP:journals/corr/abs-1905-02244,
author = {Andrew Howard and
Mark Sandler and
Grace Chu and
Liang{-}Chieh Chen and
Bo Chen and
Mingxing Tan and
Weijun Wang and
Yukun Zhu and
Ruoming Pang and
Vijay Vasudevan and
Quoc V. Le and
Hartwig Adam},
title = {Searching for MobileNetV3},
journal = {CoRR},
volume = {abs/1905.02244},
year = {2019},
url = {http://arxiv.org/abs/1905.02244},
archivePrefix = {arXiv},
eprint = {1905.02244},
timestamp = {Tue, 12 Jan 2021 15:30:06 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-1905-02244.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
BENCHMARK | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK |
---|---|---|---|---|
COCO minival | Faster R-CNN ResNet-50 FPN | box AP | 37.0 | # 103 |
COCO minival | Faster R-CNN MobileNetV3-Large FPN | box AP | 32.8 | # 115 |
COCO minival | Faster R-CNN MobileNetV3-Large 320 FPN | box AP | 22.8 | # 121 |