Faster R-CNN

Last updated on Feb 19, 2021

Faster R-CNN (R101-C4, 3x)

Parameters 53 Million
FLOPs 888 Billion
File Size 202.02 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time 1.93 days

Architecture Convolution, RoIPool, RPN, Softmax, ResNet
ID 138204752
Max Iter 270000
lr sched 3x
FLOPs Input No 100
Backbone Layers 101
train time (s/iter) 0.619
Training Memory (GB) 5.9
inference time (s/im) 0.139
Faster R-CNN (R101-DC5, 3x)

Parameters 184 Million
FLOPs 411 Billion
File Size 704.76 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time 1.41 days

Architecture Convolution, RoIPool, RPN, Softmax, ResNet
ID 138204841
Max Iter 270000
lr sched 3x
FLOPs Input No 100
Backbone Layers 101
train time (s/iter) 0.452
Training Memory (GB) 6.1
inference time (s/im) 0.086
Faster R-CNN (R101-FPN, 3x)

Parameters 61 Million
FLOPs 246 Billion
File Size 232.20 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time 21 hours

Architecture Convolution, RoIPool, RPN, Softmax, ResNet
ID 137851257
Max Iter 270000
lr sched 3x
FLOPs Input No 100
Backbone Layers 101
train time (s/iter) 0.286
Training Memory (GB) 4.1
inference time (s/im) 0.051
Faster R-CNN (R50-C4, 1x)

Parameters 34 Million
FLOPs 822 Billion
File Size 129.34 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time 14 hours

Architecture Convolution, RoIPool, RPN, Softmax, ResNet
ID 137257644
Max Iter 90000
lr sched 1x
FLOPs Input No 100
Backbone Layers 50
train time (s/iter) 0.551
Training Memory (GB) 4.8
inference time (s/im) 0.102
Faster R-CNN (R50-C4, 3x)

Parameters 34 Million
FLOPs 822 Billion
File Size 129.34 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time 1.7 days

Architecture Convolution, RoIPool, RPN, Softmax, ResNet
ID 137849393
Max Iter 270000
lr sched 3x
FLOPs Input No 100
Backbone Layers 50
train time (s/iter) 0.543
Training Memory (GB) 4.8
inference time (s/im) 0.104
Faster R-CNN (R50-DC5, 1x)

Parameters 166 Million
FLOPs 344 Billion
File Size 632.08 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time 10 hours

Architecture Convolution, RoIPool, RPN, Softmax, ResNet
ID 137847829
Max Iter 90000
lr sched 1x
FLOPs Input No 100
Backbone Layers 50
train time (s/iter) 0.38
Training Memory (GB) 5.0
inference time (s/im) 0.068
Faster R-CNN (R50-DC5, 3x)

Parameters 166 Million
FLOPs 344 Billion
File Size 632.08 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time 1.18 days

Architecture Convolution, RoIPool, RPN, Softmax, ResNet
ID 137849425
Max Iter 270000
lr sched 3x
FLOPs Input No 100
Backbone Layers 50
train time (s/iter) 0.378
Training Memory (GB) 5.0
inference time (s/im) 0.07
Faster R-CNN (R50-FPN, 1x)

Parameters 42 Million
FLOPs 180 Billion
File Size 159.52 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time 5 hours

Architecture Convolution, RoIPool, RPN, Softmax, ResNet
ID 137257794
Max Iter 90000
lr sched 1x
FLOPs Input No 100
Backbone Layers 50
train time (s/iter) 0.21
Training Memory (GB) 3.0
inference time (s/im) 0.038
Faster R-CNN (R50-FPN, 3x)

Parameters 42 Million
FLOPs 180 Billion
File Size 159.52 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time 16 hours

Architecture Convolution, RoIPool, RPN, Softmax, ResNet
ID 137849458
Max Iter 270000
lr sched 3x
FLOPs Input No 100
Backbone Layers 50
train time (s/iter) 0.209
Training Memory (GB) 3.0
inference time (s/im) 0.038
Faster R-CNN (X101-FPN, 3x)

Parameters 105 Million
FLOPs 406 Billion
File Size 401.34 MB
Training Data MS COCO
Training Resources 8 NVIDIA V100 GPUs
Training Time 1.99 days

Architecture Convolution, RoIPool, RPN, Softmax, ResNeXt
ID 139173657
Max Iter 270000
lr sched 3x
FLOPs Input No 100
Backbone Layers 101
train time (s/iter) 0.638
Training Memory (GB) 6.7
inference time (s/im) 0.098

Summary

Faster R-CNN is an object detection model that improves on Fast R-CNN by adding a region proposal network (RPN) that shares full-image convolutional features with the detection network, making region proposals nearly cost-free. The RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. It is trained end-to-end to generate high-quality region proposals, which Fast R-CNN then uses for detection. The RPN and Fast R-CNN are merged into a single network by sharing their convolutional features, with the RPN telling the unified network where to look.
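To make "at each position" more concrete: in the original Faster R-CNN design, the RPN scores a fixed set of reference boxes (anchors) at every feature-map location and regresses offsets from them. The sketch below only enumerates such anchors. The sizes, aspect ratios, stride, and half-stride centre offset are illustrative assumptions, not Detectron2's exact settings, and `make_anchors` is a hypothetical helper rather than a library function.

```python
import math


def make_anchors(feat_h, feat_w, stride, sizes=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Enumerate (x1, y1, x2, y2) anchors centred on every feature-map cell.

    Assumed conventions (illustrative only): `ratio` is height / width, and
    each anchor keeps an area of size * size pixels.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Centre of this feature-map cell in input-image pixels.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for size in sizes:
                area = float(size * size)
                for ratio in ratios:
                    w = math.sqrt(area / ratio)
                    h = w * ratio
                    anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


if __name__ == "__main__":
    boxes = make_anchors(feat_h=2, feat_w=3, stride=16)
    print(len(boxes), "anchors;", len(boxes) // (2 * 3), "per position")
```

With three sizes and three ratios, each position gets nine anchors, so the RPN would emit nine objectness scores and nine sets of box offsets per location.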

How do I load this model?

There are several Faster R-CNN models available in Detectron2, with different backbones and learning schedules.

To load from the Detectron2 model zoo:

from detectron2 import model_zoo
model = model_zoo.get("COCO-Detection/faster_rcnn_R_50_C4_1x.yaml", trained=True)

Replace the configuration path with the variant you want to use. You can find the paths in the model summaries at the top of this page.
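For convenience, the sketch below collects a config path for each variant listed on this page. The paths assume the file naming used in Detectron2's `configs/COCO-Detection` directory and should be checked against your installed version. `CONFIG_PATHS` and `config_path_for` are hypothetical names introduced here, not part of Detectron2.

```python
# Hypothetical lookup: variant name as shown on this page -> assumed model zoo config path.
CONFIG_PATHS = {
    "R50-C4, 1x": "COCO-Detection/faster_rcnn_R_50_C4_1x.yaml",
    "R50-DC5, 1x": "COCO-Detection/faster_rcnn_R_50_DC5_1x.yaml",
    "R50-FPN, 1x": "COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml",
    "R50-C4, 3x": "COCO-Detection/faster_rcnn_R_50_C4_3x.yaml",
    "R50-DC5, 3x": "COCO-Detection/faster_rcnn_R_50_DC5_3x.yaml",
    "R50-FPN, 3x": "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml",
    "R101-C4, 3x": "COCO-Detection/faster_rcnn_R_101_C4_3x.yaml",
    "R101-DC5, 3x": "COCO-Detection/faster_rcnn_R_101_DC5_3x.yaml",
    "R101-FPN, 3x": "COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml",
    "X101-FPN, 3x": "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml",
}


def config_path_for(variant: str) -> str:
    """Return the assumed config path for a variant name such as "R50-FPN, 3x"."""
    try:
        return CONFIG_PATHS[variant]
    except KeyError:
        known = ", ".join(sorted(CONFIG_PATHS))
        raise ValueError(f"Unknown variant {variant!r}; expected one of: {known}") from None


if __name__ == "__main__":
    print(config_path_for("R50-FPN, 3x"))
```

The returned string can be passed as the first argument to `model_zoo.get(..., trained=True)` in place of the path shown above.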

How do I train this model?

You can follow the Getting Started guide on Colab to see how to train a model.

You can also read the official Detectron2 documentation.
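When budgeting a training run, note that the training times in the model summaries are consistent with multiplying Max Iter by train time (s/iter). The sketch below does that arithmetic with figures copied from this page. `estimate_training_hours` is an illustrative helper, and the estimate ignores evaluation, checkpointing, and any difference between your hardware and the 8 V100 GPUs listed.

```python
def estimate_training_hours(max_iter: int, sec_per_iter: float) -> float:
    """Rough wall-clock estimate: iterations times seconds per iteration, in hours."""
    return max_iter * sec_per_iter / 3600.0


if __name__ == "__main__":
    # (variant, Max Iter, train time in s/iter) taken from the model summaries.
    runs = [
        ("R50-FPN, 1x", 90000, 0.210),
        ("R50-C4, 1x", 90000, 0.551),
        ("R101-FPN, 3x", 270000, 0.286),
        ("R101-C4, 3x", 270000, 0.619),
    ]
    for name, max_iter, sec_per_iter in runs:
        hours = estimate_training_hours(max_iter, sec_per_iter)
        print(f"{name:<13} {hours:6.1f} h  ({hours / 24:.2f} days)")
```

For example, R101-C4 (3x) works out to roughly 1.93 days, in line with the figure listed for that model.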

Citation

@misc{wu2019detectron2,
  author =       {Yuxin Wu and Alexander Kirillov and Francisco Massa and
                  Wan-Yen Lo and Ross Girshick},
  title =        {Detectron2},
  howpublished = {\url{https://github.com/facebookresearch/detectron2}},
  year =         {2019}
}

Results

Object Detection on COCO minival

Model                          Box AP
Faster R-CNN (X101-FPN, 3x)    43.0
Faster R-CNN (R101-FPN, 3x)    42.0
Faster R-CNN (R101-C4, 3x)     41.1
Faster R-CNN (R101-DC5, 3x)    40.6
Faster R-CNN (R50-FPN, 3x)     40.2
Faster R-CNN (R50-DC5, 3x)     39.0
Faster R-CNN (R50-C4, 3x)      38.4
Faster R-CNN (R50-FPN, 1x)     37.9
Faster R-CNN (R50-DC5, 1x)     37.3
Faster R-CNN (R50-C4, 1x)      35.7
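To weigh accuracy against speed, the box AP values can be combined with the inference times from the model summaries. The sketch below does this for the 3x models, converting s/im to images per second and filtering out any model that another model beats on both counts. The numbers are copied from this page; `MODELS` and `pareto_front` are illustrative names, and timings will vary with hardware.

```python
# (backbone, box AP, inference time in s/im) for the 3x models on this page.
MODELS = [
    ("X101-FPN", 43.0, 0.098),
    ("R101-FPN", 42.0, 0.051),
    ("R101-C4", 41.1, 0.139),
    ("R101-DC5", 40.6, 0.086),
    ("R50-FPN", 40.2, 0.038),
    ("R50-DC5", 39.0, 0.070),
    ("R50-C4", 38.4, 0.104),
]


def pareto_front(models):
    """Keep models that no other model matches or beats on both AP and speed."""
    front = []
    for name, ap, sec in models:
        dominated = any(
            other_ap >= ap and other_sec <= sec and (other_ap > ap or other_sec < sec)
            for _, other_ap, other_sec in models
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    for name, ap, sec in MODELS:
        print(f"{name:<9} AP {ap:.1f}  {1.0 / sec:5.1f} im/s")
    print("Not dominated:", pareto_front(MODELS))
```

With the figures on this page, only the three FPN variants survive the filter: each C4 and DC5 model is both slower and less accurate than at least one FPN model.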