Cascade R-CNN: Delving into High Quality Object Detection

CVPR 2018 · Zhaowei Cai, Nuno Vasconcelos

In object detection, an intersection over union (IoU) threshold is required to define positives and negatives. An object detector trained with a low IoU threshold, e.g. 0.5, usually produces noisy detections. However, detection performance tends to degrade as the IoU threshold increases. Two main factors are responsible for this: 1) overfitting during training, due to exponentially vanishing positive samples, and 2) an inference-time mismatch between the IoUs for which the detector is optimal and those of the input hypotheses. A multi-stage object detection architecture, the Cascade R-CNN, is proposed to address these problems. It consists of a sequence of detectors trained with increasing IoU thresholds, so that each is more selective than the last against close false positives. The detectors are trained stage by stage, leveraging the observation that the output of a detector is a good distribution for training the next, higher-quality detector. The resampling of progressively improved hypotheses guarantees that all detectors have a positive set of examples of equivalent size, reducing the overfitting problem. The same cascade procedure is applied at inference, enabling a closer match between the hypotheses and the detector quality of each stage. A simple implementation of the Cascade R-CNN is shown to surpass all single-model object detectors on the challenging COCO dataset. Experiments also show that the Cascade R-CNN is widely applicable across detector architectures, achieving consistent gains independently of the baseline detector strength. The code will be made available at https://github.com/zhaoweicai/cascade-rcnn.
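The effect the abstract describes can be illustrated with a small toy example. The sketch below is not the authors' implementation: the boxes, the thresholds (0.5, 0.6, 0.7) and the helper `toy_refine` are hypothetical, and `toy_refine` simply moves a box halfway toward the ground truth as a stand-in for a learned box regressor (a real regressor has no access to the ground truth at inference). It compares how many proposals count as positives when a fixed proposal set is labelled at increasing IoU thresholds against the case where the boxes are refined between stages.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_positives(boxes: Sequence[Box], gt: Box, thr: float) -> int:
    """Number of boxes whose IoU with the ground truth reaches the threshold."""
    return sum(iou(b, gt) >= thr for b in boxes)


def toy_refine(box: Box, gt: Box, step: float = 0.5) -> Box:
    """Hypothetical stand-in for a learned regressor: move the box toward gt."""
    x1, y1, x2, y2 = (b + step * (g - b) for b, g in zip(box, gt))
    return (x1, y1, x2, y2)


def fixed_counts(boxes: Sequence[Box], gt: Box, thrs: Sequence[float]) -> List[int]:
    """Positives per threshold when every stage sees the same proposals."""
    return [count_positives(boxes, gt, t) for t in thrs]


def cascade_counts(boxes: Sequence[Box], gt: Box, thrs: Sequence[float]) -> List[int]:
    """Positives per threshold when each stage sees the previous stage's output."""
    counts, current = [], list(boxes)
    for t in thrs:
        counts.append(count_positives(current, gt, t))
        current = [toy_refine(b, gt) for b in current]  # resample for next stage
    return counts


# Made-up data: one ground-truth box and four horizontally shifted proposals.
GT: Box = (0.0, 0.0, 10.0, 10.0)
PROPOSALS: List[Box] = [
    (1.0, 0.0, 11.0, 10.0),
    (2.0, 0.0, 12.0, 10.0),
    (3.0, 0.0, 13.0, 10.0),
    (6.0, 0.0, 16.0, 10.0),
]
THRESHOLDS = (0.5, 0.6, 0.7)

if __name__ == "__main__":
    print("fixed proposals:  ", fixed_counts(PROPOSALS, GT, THRESHOLDS))
    print("cascaded proposals:", cascade_counts(PROPOSALS, GT, THRESHOLDS))
```

With this toy data the fixed proposal set loses positives as the threshold rises, while the cascaded set does not, which is the intuition behind training each stage on the previous stage's output. The halving step and the box coordinates were chosen only to make the trend visible.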


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Object Detection | AI-TOD | Cascade R-CNN (ResNet-50-FPN) | AP | 13.8 | #4 |
| | | | AP50 | 30.8 | #4 |
| | | | AP75 | 10.5 | #3 |
| | | | APvt | 0.0 | #3 |
| | | | APt | 10.6 | #4 |
| | | | APs | 25.5 | #3 |
| | | | APm | 26.6 | #3 |
| Object Detection | COCO minival | Cascade R-CNN (ResNet-50-FPN+) | box AP | 40.3 | #164 |
| | | | AP50 | 59.4 | #82 |
| | | | AP75 | 43.7 | #73 |
| | | | APS | 22.9 | #63 |
| | | | APM | 43.7 | #61 |
| | | | APL | 54.1 | #60 |
| Object Detection | COCO minival | Cascade R-CNN (ResNet-101-FPN+, cascade) | box AP | 42.7 | #136 |
| | | | AP50 | 61.6 | #65 |
| | | | AP75 | 46.6 | #50 |
| | | | APS | 23.8 | #57 |
| | | | APM | 46.2 | #42 |
| | | | APL | 57.4 | #47 |
| Object Detection | COCO test-dev | Cascade R-CNN (ResNet-101-FPN+, cascade) | box mAP | 42.8 | #162 |
| | | | AP50 | 62.1 | #114 |
| | | | AP75 | 46.3 | #112 |
| | | | APS | 23.7 | #107 |
| | | | APM | 45.5 | #105 |
| | | | APL | 55.2 | #99 |
| | | | Hardware Burden | None | #1 |
| | | | Operations per network pass | None | #1 |
| Object Detection | COCO test-dev | Cascade R-CNN (ResNet-50-FPN+, cascade) | box mAP | 40.6 | #183 |
| | | | AP50 | 59.9 | #132 |
| | | | AP75 | 44 | #132 |
| | | | APS | 22.6 | #118 |
| | | | APM | 42.7 | #126 |
| | | | APL | 52.1 | #123 |
| | | | Hardware Burden | 12G | #1 |
| | | | Operations per network pass | None | #1 |
| Object Detection | COCO test-dev | Cascade R-CNN (ResNet-50-FPN+) | box mAP | 36.5 | #215 |
| | | | AP50 | 59 | #141 |
| | | | AP75 | 39.2 | #151 |
| | | | APS | 20.3 | #134 |
| | | | APM | 38.8 | #141 |
| | | | APL | 46.4 | #143 |
| | | | Hardware Burden | 3G | #1 |
| | | | Operations per network pass | None | #1 |
| Object Detection | COCO test-dev | Cascade R-CNN (ResNet-101-FPN+) | box mAP | 38.8 | #203 |
| | | | AP50 | 61.1 | #121 |
| | | | AP75 | 41.9 | #142 |
| | | | APS | 21.3 | #130 |
| | | | APM | 41.8 | #133 |
| | | | APL | 49.8 | #140 |
| | | | Hardware Burden | 3G | #1 |
| | | | Operations per network pass | None | #1 |
| 2D Object Detection | SARDet-100K | Cascade R-CNN | box mAP | 51.1 | #4 |

Methods