Res2Net

Last updated on Feb 23, 2021

Cascade Mask R-CNN (R2-101-FPN, 20e, pytorch)

Memory (M) 9500.0
Backbone Layers 101
File Size 370.64 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Dense Connections, FPN, Res2Net, RoIAlign
lr sched 20e
Cascade R-CNN (R2-101-FPN, 20e, pytorch)

Memory (M) 7800.0
Backbone Layers 101
File Size 340.39 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture RPN, FPN, Res2Net, Cascade R-CNN, RoIAlign
lr sched 20e
Faster R-CNN (R2-101-FPN, 2x, pytorch)

Memory (M) 7400.0
Backbone Layers 101
File Size 234.94 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, FPN, Res2Net, RoIPool
lr sched 2x
HTC (R2-101-FPN, 20e, pytorch)

lr sched 20e
Backbone Layers 101
File Size 381.84 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture RPN, Convolution, FPN, Res2Net, 1x1 Convolution, HTC, RoIAlign
Mask R-CNN (R2-101-FPN, 2x, pytorch)

Memory (M) 7900.0
Backbone Layers 101
File Size 245.02 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Training Time

Architecture Softmax, RPN, Convolution, Dense Connections, FPN, Res2Net, RoIAlign
lr sched 2x
README.md

Res2Net for object detection and instance segmentation

Introduction

We propose Res2Net, a novel building block for CNNs that constructs hierarchical residual-like connections within a single residual block. Res2Net represents multi-scale features at a granular level and increases the range of receptive fields for each network layer.
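
To make the multi-scale claim concrete, the following sketch computes the receptive field of each feature split inside one Res2Net block. It assumes the split recurrence described in the Res2Net paper (y_1 = x_1, y_2 = K_2(x_2), y_i = K_i(x_i + y_{i-1}) for i > 2), with every K_i a stride-1 convolution. The function name and the choice of four splits in the usage line are illustrative and do not come from this README.

```python
from typing import List


def res2net_split_receptive_fields(scale: int, kernel_size: int = 3) -> List[int]:
    """Receptive field (pixels along one axis) of each split output y_1..y_s.

    Assumed recurrence:
        y_1 = x_1                  (identity, no convolution)
        y_2 = K_2(x_2)
        y_i = K_i(x_i + y_{i-1})   for 2 < i <= scale
    Each K_i is a stride-1 convolution with the given kernel size.
    """
    if scale < 1:
        raise ValueError("scale must be at least 1")
    fields = [1]  # y_1 passes x_1 through unchanged
    for _ in range(2, scale + 1):
        # K_i sees the previous split output, so its receptive field
        # grows by (kernel_size - 1) relative to y_{i-1}.
        fields.append(fields[-1] + kernel_size - 1)
    return fields


if __name__ == "__main__":
    print(res2net_split_receptive_fields(scale=4))
```

With four splits and 3x3 kernels the outputs cover 1, 3, 5, and 7 pixels along each axis, so a single block combines several receptive-field sizes before its outputs are fused.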

| Backbone | Params. | GFLOPs | top-1 err. (%) | top-5 err. (%) |
|---|---|---|---|---|
| ResNet-101 | 44.6M | 7.8 | 22.63 | 6.44 |
| ResNeXt-101-64x4d | 83.5M | 15.5 | 20.40 | - |
| HRNetV2p-W48 | 77.5M | 16.1 | 20.70 | 5.50 |
| Res2Net-101 | 45.2M | 8.3 | 18.77 | 4.64 |

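Compared with ResNeXt-101-64x4d and HRNetV2p-W48, Res2Net-101 requires far fewer parameters and FLOPs while achieving a lower top-1 error. Its cost is close to that of ResNet-101 (45.2M vs. 44.6M parameters, 8.3 vs. 7.8 GFLOPs), with a top-1 error 3.86 points lower.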
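
The comparison can be checked directly from the table. The sketch below copies the four rows and reports each backbone's parameter and GFLOPs count relative to Res2Net-101; the dictionary layout and the `relative_cost` helper are illustrative names, not part of any library.

```python
from typing import Dict, Tuple

# (parameters in millions, GFLOPs at 224x224, top-1 error in %),
# copied from the backbone comparison table.
BACKBONES: Dict[str, Tuple[float, float, float]] = {
    "ResNet-101": (44.6, 7.8, 22.63),
    "ResNeXt-101-64x4d": (83.5, 15.5, 20.40),
    "HRNetV2p-W48": (77.5, 16.1, 20.70),
    "Res2Net-101": (45.2, 8.3, 18.77),
}


def relative_cost(name: str, reference: str = "Res2Net-101") -> Tuple[float, float]:
    """Return (parameter ratio, GFLOPs ratio) of `name` relative to `reference`."""
    params, gflops, _ = BACKBONES[name]
    ref_params, ref_gflops, _ = BACKBONES[reference]
    return params / ref_params, gflops / ref_gflops


if __name__ == "__main__":
    for backbone in BACKBONES:
        p_ratio, f_ratio = relative_cost(backbone)
        print(f"{backbone:<20} params x{p_ratio:.2f}  GFLOPs x{f_ratio:.2f}")
```

Ratios above 1 mean the backbone is heavier than Res2Net-101. ResNet-101 comes out slightly below 1 on both counts, while ResNeXt-101-64x4d and HRNetV2p-W48 are well above it.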

Note:

  • GFLOPs for classification are calculated for an input image size of 224x224.

@article{gao2019res2net,
  title={Res2Net: A New Multi-scale Backbone Architecture},
  author={Gao, Shang-Hua and Cheng, Ming-Ming and Zhao, Kai and Zhang, Xin-Yu and Yang, Ming-Hsuan and Torr, Philip},
  journal={IEEE TPAMI},
  year={2020},
  doi={10.1109/TPAMI.2019.2938758},
}

Results and Models

Faster R-CNN

| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
|---|---|---|---|---|---|---|---|
| R2-101-FPN | pytorch | 2x | 7.4 | - | 43.0 | config | model, log |

Mask R-CNN

| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
|---|---|---|---|---|---|---|---|---|
| R2-101-FPN | pytorch | 2x | 7.9 | - | 43.6 | 38.7 | config | model, log |

Cascade R-CNN

| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
|---|---|---|---|---|---|---|---|
| R2-101-FPN | pytorch | 20e | 7.8 | - | 45.7 | config | model, log |

Cascade Mask R-CNN

| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
|---|---|---|---|---|---|---|---|---|
| R2-101-FPN | pytorch | 20e | 9.5 | - | 46.4 | 40.0 | config | model, log |

Hybrid Task Cascade (HTC)

| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
|---|---|---|---|---|---|---|---|---|
| R2-101-FPN | pytorch | 20e | - | - | 47.5 | 41.6 | config | model, log |

Results

Object Detection on COCO minival

| Model | box AP |
|---|---|
| HTC (R2-101-FPN, 20e, pytorch) | 47.5 |
| Cascade Mask R-CNN (R2-101-FPN, 20e, pytorch) | 46.4 |
| Cascade R-CNN (R2-101-FPN, 20e, pytorch) | 45.7 |
| Mask R-CNN (R2-101-FPN, 2x, pytorch) | 43.6 |
| Faster R-CNN (R2-101-FPN, 2x, pytorch) | 43.0 |

Instance Segmentation on COCO minival

| Model | mask AP |
|---|---|
| HTC (R2-101-FPN, 20e, pytorch) | 41.6 |
| Cascade Mask R-CNN (R2-101-FPN, 20e, pytorch) | 40.0 |
| Mask R-CNN (R2-101-FPN, 2x, pytorch) | 38.7 |
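
As a quick way to read the two leaderboards together, the sketch below stores the reported box AP and mask AP per detector and computes the box AP gain over the Faster R-CNN entry. The data structure and helper names are illustrative; the numbers are copied from the results listed here, with `None` marking detectors that report no mask AP.

```python
from typing import Dict, List, Optional, Tuple

# (box AP, mask AP) on COCO minival, all with the R2-101-FPN backbone.
RESULTS: Dict[str, Tuple[float, Optional[float]]] = {
    "HTC": (47.5, 41.6),
    "Cascade Mask R-CNN": (46.4, 40.0),
    "Cascade R-CNN": (45.7, None),
    "Mask R-CNN": (43.6, 38.7),
    "Faster R-CNN": (43.0, None),
}


def box_ap_gain(model: str, baseline: str = "Faster R-CNN") -> float:
    """Box AP difference between `model` and `baseline`, rounded to one decimal."""
    return round(RESULTS[model][0] - RESULTS[baseline][0], 1)


def rank_by_mask_ap() -> List[str]:
    """Detectors that report mask AP, best first."""
    with_masks = [(name, mask) for name, (_, mask) in RESULTS.items() if mask is not None]
    return [name for name, _ in sorted(with_masks, key=lambda item: item[1], reverse=True)]


if __name__ == "__main__":
    for name in RESULTS:
        print(f"{name:<20} +{box_ap_gain(name):.1f} box AP vs. Faster R-CNN")
    print(rank_by_mask_ap())
```

Note that the schedules differ between entries (2x versus 20e), so the gains mix the effect of the detector with the effect of the training schedule.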