Faster R-CNN

facebookresearch / detectron2

Last updated on Feb 19, 2021

Parameters 53 Million

FLOPs 888 Billion

File Size 202.02 MB

Training Data MS COCO

Training Resources 8 NVIDIA V100 GPUs

Training Time 1.93 days

Architecture	Convolution, RoIPool, RPN, Softmax, ResNet
ID	138204752
Max Iter	270000
lr sched	3x
FLOPs Input No	100
Backbone Layers	101
train time (s/iter)	0.619
Training Memory (GB)	5.9
inference time (s/im)	0.139

Parameters 184 Million

FLOPs 411 Billion

File Size 704.76 MB

Training Data MS COCO

Training Resources 8 NVIDIA V100 GPUs

Training Time 1.41 days

Architecture	Convolution, RoIPool, RPN, Softmax, ResNet
ID	138204841
Max Iter	270000
lr sched	3x
FLOPs Input No	100
Backbone Layers	101
train time (s/iter)	0.452
Training Memory (GB)	6.1
inference time (s/im)	0.086

Parameters 61 Million

FLOPs 246 Billion

File Size 232.20 MB

Training Data MS COCO

Training Resources 8 NVIDIA V100 GPUs

Training Time 21 hours

Architecture	Convolution, RoIPool, RPN, Softmax, ResNet
ID	137851257
Max Iter	270000
lr sched	3x
FLOPs Input No	100
Backbone Layers	101
train time (s/iter)	0.286
Training Memory (GB)	4.1
inference time (s/im)	0.051

Parameters 34 Million

FLOPs 822 Billion

File Size 129.34 MB

Training Data MS COCO

Training Resources 8 NVIDIA V100 GPUs

Training Time 14 hours

Architecture	Convolution, RoIPool, RPN, Softmax, ResNet
ID	137257644
Max Iter	90000
lr sched	1x
FLOPs Input No	100
Backbone Layers	50
train time (s/iter)	0.551
Training Memory (GB)	4.8
inference time (s/im)	0.102

Parameters 34 Million

FLOPs 822 Billion

File Size 129.34 MB

Training Data MS COCO

Training Resources 8 NVIDIA V100 GPUs

Training Time 1.7 days

Architecture	Convolution, RoIPool, RPN, Softmax, ResNet
ID	137849393
Max Iter	270000
lr sched	3x
FLOPs Input No	100
Backbone Layers	50
train time (s/iter)	0.543
Training Memory (GB)	4.8
inference time (s/im)	0.104

Parameters 166 Million

FLOPs 344 Billion

File Size 632.08 MB

Training Data MS COCO

Training Resources 8 NVIDIA V100 GPUs

Training Time 10 hours

Architecture	Convolution, RoIPool, RPN, Softmax, ResNet
ID	137847829
Max Iter	90000
lr sched	1x
FLOPs Input No	100
Backbone Layers	50
train time (s/iter)	0.38
Training Memory (GB)	5.0
inference time (s/im)	0.068

Parameters 166 Million

FLOPs 344 Billion

File Size 632.08 MB

Training Data MS COCO

Training Resources 8 NVIDIA V100 GPUs

Training Time 1.18 days

Architecture	Convolution, RoIPool, RPN, Softmax, ResNet
ID	137849425
Max Iter	270000
lr sched	3x
FLOPs Input No	100
Backbone Layers	50
train time (s/iter)	0.378
Training Memory (GB)	5.0
inference time (s/im)	0.07

Parameters 42 Million

FLOPs 180 Billion

File Size 159.52 MB

Training Data MS COCO

Training Resources 8 NVIDIA V100 GPUs

Training Time 5 hours

Architecture	Convolution, RoIPool, RPN, Softmax, ResNet
ID	137257794
Max Iter	90000
lr sched	1x
FLOPs Input No	100
Backbone Layers	50
train time (s/iter)	0.21
Training Memory (GB)	3.0
inference time (s/im)	0.038

Parameters 42 Million

FLOPs 180 Billion

File Size 159.52 MB

Training Data MS COCO

Training Resources 8 NVIDIA V100 GPUs

Training Time 16 hours

Architecture	Convolution, RoIPool, RPN, Softmax, ResNet
ID	137849458
Max Iter	270000
lr sched	3x
FLOPs Input No	100
Backbone Layers	50
train time (s/iter)	0.209
Training Memory (GB)	3.0
inference time (s/im)	0.038

Parameters 105 Million

FLOPs 406 Billion

File Size 401.34 MB

Training Data MS COCO

Training Resources 8 NVIDIA V100 GPUs

Training Time 1.99 days

Architecture	Convolution, RoIPool, RPN, Softmax, ResNeXt
ID	139173657
Max Iter	270000
lr sched	3x
FLOPs Input No	100
Backbone Layers	101
train time (s/iter)	0.638
Training Memory (GB)	6.7
inference time (s/im)	0.098

Summary

Faster R-CNN is an object detection model that improves on Fast R-CNN by adding a region proposal network (RPN). The RPN shares full-image convolutional features with the detection network, which makes region proposals nearly cost-free. The RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. It is trained end-to-end to generate high-quality region proposals, which Fast R-CNN then uses for detection. The RPN and Fast R-CNN are merged into a single network by sharing their convolutional features, so the RPN component tells the unified network where to look.
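
To make "at each position" concrete: in the original Faster R-CNN design, every feature-map position carries a fixed set of reference boxes (anchors), and the RPN predicts an objectness score and a box refinement for each one. The sketch below only enumerates those anchors in plain Python. The function name `generate_anchors` is hypothetical, and the stride, scales, and aspect ratios are illustrative values assumed from the original paper, not settings read from the Detectron2 configs.

```python
from itertools import product


def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) boxes in image coordinates.

    Assumed conventions (illustrative only): `stride` is the number of
    image pixels per feature-map cell and `ratio` means height / width.
    """
    anchors = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Centre of this feature-map cell, expressed in image pixels.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            # Keep the box area at scale**2 while height / width == ratio.
            w = scale / ratio ** 0.5
            h = scale * ratio ** 0.5
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = generate_anchors(feat_h=2, feat_w=3)
print(len(anchors))  # 2 * 3 positions with 9 anchors each: 54
```

An RPN head would then emit one objectness score and four box offsets per anchor, which is why the proposal step adds little cost on top of the shared convolutional features.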

How do I load this model?

There are several Faster R-CNN models available in Detectron2, with different backbones and learning schedules.

To load from the Detectron2 model zoo:

from detectron2 import model_zoo
model = model_zoo.get("COCO-Detection/faster_rcnn_R_50_C4_1x.yaml", trained=True)

Replace the configuration path with the variant you want to use. You can find the paths in the model summaries at the top of this page.
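
As a convenience, the variants listed in the Results table can be mapped to configuration paths. This is a sketch, not part of Detectron2: `VARIANT_CONFIGS` and `load_variant` are hypothetical names, and the file names are assumed to follow the naming pattern of the path shown above, so check them against the Detectron2 `configs/COCO-Detection` directory before relying on them.

```python
# Labels follow the Results table on this page. The config file names are
# assumptions based on Detectron2's COCO-Detection naming pattern.
VARIANT_CONFIGS = {
    "R50-C4, 1x": "COCO-Detection/faster_rcnn_R_50_C4_1x.yaml",
    "R50-DC5, 1x": "COCO-Detection/faster_rcnn_R_50_DC5_1x.yaml",
    "R50-FPN, 1x": "COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml",
    "R50-C4, 3x": "COCO-Detection/faster_rcnn_R_50_C4_3x.yaml",
    "R50-DC5, 3x": "COCO-Detection/faster_rcnn_R_50_DC5_3x.yaml",
    "R50-FPN, 3x": "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml",
    "R101-C4, 3x": "COCO-Detection/faster_rcnn_R_101_C4_3x.yaml",
    "R101-DC5, 3x": "COCO-Detection/faster_rcnn_R_101_DC5_3x.yaml",
    "R101-FPN, 3x": "COCO-Detection/faster_rcnn_R_101_FPN_3x.yaml",
    "X101-FPN, 3x": "COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml",
}


def load_variant(label, trained=True):
    """Load one Faster R-CNN variant by its Results-table label."""
    # Imported lazily so the mapping is usable without Detectron2 installed.
    from detectron2 import model_zoo

    return model_zoo.get(VARIANT_CONFIGS[label], trained=trained)
```

With Detectron2 installed, `load_variant("R50-FPN, 3x")` would be equivalent to calling `model_zoo.get` with the corresponding path and `trained=True`.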

How do I train this model?

You can follow the Getting Started guide on Colab to see how to train a model.

You can also read the official Detectron2 documentation.
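
When budgeting a training run, note that the Training Time figures in the model summaries appear to equal Max Iter multiplied by the train time per iteration on the listed 8 V100 GPUs. The sketch below reproduces that arithmetic for two of the listed IDs. The helper name `estimated_training_days` is hypothetical, and the estimate presumably excludes evaluation and checkpointing overhead.

```python
SECONDS_PER_DAY = 86_400


def estimated_training_days(max_iter, seconds_per_iter):
    """Wall-clock estimate: iterations times seconds per iteration, in days."""
    return max_iter * seconds_per_iter / SECONDS_PER_DAY


# (Max Iter, train time in s/iter) copied from two of the model summaries.
runs = {
    "138204752": (270_000, 0.619),  # listed Training Time: 1.93 days
    "137257794": (90_000, 0.21),    # listed Training Time: 5 hours
}

for model_id, (max_iter, sec_per_iter) in runs.items():
    days = estimated_training_days(max_iter, sec_per_iter)
    print(f"{model_id}: {days:.2f} days ({days * 24:.1f} hours)")
```

On different hardware the seconds-per-iteration figure will change, so it is worth measuring over a few hundred iterations before extrapolating to a full schedule.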

Citation

@misc{wu2019detectron2,
  author =       {Yuxin Wu and Alexander Kirillov and Francisco Massa and
                  Wan-Yen Lo and Ross Girshick},
  title =        {Detectron2},
  howpublished = {\url{https://github.com/facebookresearch/detectron2}},
  year =         {2019}
}

Results

Object Detection on COCO minival

MODEL	BOX AP
Faster R-CNN (X101-FPN, 3x)	43.0
Faster R-CNN (R101-FPN, 3x)	42.0
Faster R-CNN (R101-C4, 3x)	41.1
Faster R-CNN (R101-DC5, 3x)	40.6
Faster R-CNN (R50-FPN, 3x)	40.2
Faster R-CNN (R50-DC5, 3x)	39.0
Faster R-CNN (R50-C4, 3x)	38.4
Faster R-CNN (R50-FPN, 1x)	37.9
Faster R-CNN (R50-DC5, 1x)	37.3
Faster R-CNN (R50-C4, 1x)	35.7