HRNet

Last updated on Feb 23, 2021

Cascade Mask R-CNN (HRNetV2p-W18, 20e, pytorch)

Memory (M) 8500.0
inference time (s/im) 0.11765
File Size 240.98 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, Dense Connections, FPN, RoIAlign
lr sched 20e

Cascade Mask R-CNN (HRNetV2p-W32, 20e, pytorch)

lr sched 20e
inference time (s/im) 0.12048
File Size 316.60 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, Dense Connections, FPN, RoIAlign

Cascade Mask R-CNN (HRNetV2p-W40, 20e, pytorch)

Memory (M) 12500.0
File Size 378.62 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, Dense Connections, FPN, RoIAlign
lr sched 20e

Cascade R-CNN (HRNetV2p-W18, 20e, pytorch)

Memory (M) 7000.0
inference time (s/im) 0.09091
File Size 210.73 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture RoIAlign, RPN, HRNet, Cascade R-CNN
lr sched 20e

Cascade R-CNN (HRNetV2p-W32, 20e, pytorch)

Memory (M) 9400.0
inference time (s/im) 0.09091
File Size 286.34 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture RoIAlign, RPN, HRNet, Cascade R-CNN
lr sched 20e

Cascade R-CNN (HRNetV2p-W40, 20e, pytorch)

Memory (M) 10800.0
File Size 348.37 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture RoIAlign, RPN, HRNet, Cascade R-CNN
lr sched 20e

Faster R-CNN (HRNetV2p-W18, 1x, pytorch)

Memory (M) 6600.0
inference time (s/im) 0.07463
File Size 105.28 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, RoIPool
lr sched 1x

Faster R-CNN (HRNetV2p-W18, 2x, pytorch)

Memory (M) 6600.0
File Size 105.28 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, RoIPool
lr sched 2x

Faster R-CNN (HRNetV2p-W32, 1x, pytorch)

Memory (M) 9000.0
inference time (s/im) 0.08065
File Size 180.89 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, RoIPool
lr sched 1x

Faster R-CNN (HRNetV2p-W32, 2x, pytorch)

Memory (M) 9000.0
File Size 180.89 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, RoIPool
lr sched 2x

Faster R-CNN (HRNetV2p-W40, 1x, pytorch)

Memory (M) 10400.0
inference time (s/im) 0.09524
File Size 242.91 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, RoIPool
lr sched 1x

Faster R-CNN (HRNetV2p-W40, 2x, pytorch)

Memory (M) 10400.0
File Size 242.92 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, RoIPool
lr sched 2x

FCOS (HRNetV2p-W18, 1x, pytorch, GN=Y, MS train=N)

Memory (M) 13000.0
inference time (s/im) 0.07752
File Size 67.21 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Group Normalization, Non Maximum Suppression, FPN, HRNet
MS train N
lr sched 1x

FCOS (HRNetV2p-W18, 2x, pytorch, GN=Y, MS train=N)

Memory (M) 13000.0
File Size 67.21 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Group Normalization, Non Maximum Suppression, FPN, HRNet
MS train N
lr sched 2x

FCOS (HRNetV2p-W18, 2x, pytorch, GN=Y, MS train=Y)

Memory (M) 13000.0
inference time (s/im) 0.07752
File Size 67.21 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Group Normalization, Non Maximum Suppression, FPN, HRNet
MS train Y
lr sched 2x

FCOS (HRNetV2p-W32, 1x, pytorch, GN=Y, MS train=N)

Memory (M) 17500.0
inference time (s/im) 0.07752
File Size 142.82 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Group Normalization, Non Maximum Suppression, FPN, HRNet
MS train N
lr sched 1x

FCOS (HRNetV2p-W32, 2x, pytorch, GN=Y, MS train=N)

Memory (M) 17500.0
File Size 142.82 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Group Normalization, Non Maximum Suppression, FPN, HRNet
MS train N
lr sched 2x

FCOS (HRNetV2p-W32, 2x, pytorch, GN=Y, MS train=Y)

Memory (M) 17500.0
inference time (s/im) 0.08065
File Size 142.82 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Group Normalization, Non Maximum Suppression, FPN, HRNet
MS train Y
lr sched 2x

FCOS (HRNetV2p-W48, 2x, pytorch, GN=Y, MS train=Y)

Memory (M) 20300.0
inference time (s/im) 0.09259
File Size 204.84 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Group Normalization, Non Maximum Suppression, FPN, HRNet
MS train Y
lr sched 2x

HTC (HRNetV2p-W18, 20e, pytorch)

Memory (M) 10800.0
inference time (s/im) 0.21277
File Size 252.18 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture RPN, HRNet, Convolution, FPN, 1x1 Convolution, HTC, RoIAlign
lr sched 20e

HTC (HRNetV2p-W32, 20e, pytorch)

Memory (M) 13100.0
inference time (s/im) 0.20408
File Size 327.79 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture RPN, HRNet, Convolution, FPN, 1x1 Convolution, HTC, RoIAlign
lr sched 20e

HTC (HRNetV2p-W40, 20e, pytorch)

Memory (M) 14600.0
File Size 389.82 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture RPN, HRNet, Convolution, FPN, 1x1 Convolution, HTC, RoIAlign
lr sched 20e

Mask R-CNN (HRNetV2p-W18, 1x, pytorch)

Memory (M) 7000.0
inference time (s/im) 0.08547
File Size 115.36 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, Dense Connections, RoIAlign
lr sched 1x

Mask R-CNN (HRNetV2p-W18, 2x, pytorch)

Memory (M) 7000.0
File Size 115.36 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, Dense Connections, RoIAlign
lr sched 2x

Mask R-CNN (HRNetV2p-W32, 1x, pytorch)

Memory (M) 9400.0
inference time (s/im) 0.0885
File Size 190.97 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, Dense Connections, RoIAlign
lr sched 1x

Mask R-CNN (HRNetV2p-W32, 2x, pytorch)

Memory (M) 9400.0
File Size 190.97 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, Dense Connections, RoIAlign
lr sched 2x

Mask R-CNN (HRNetV2p-W40, 1x, pytorch)

Memory (M) 10900.0
File Size 253.01 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, Dense Connections, RoIAlign
lr sched 1x

Mask R-CNN (HRNetV2p-W40, 2x, pytorch)

Memory (M) 10900.0
File Size 253.01 MB
Training Data MS COCO
Training Resources 8x NVIDIA V100 GPUs
Architecture Softmax, RPN, HRNet, Convolution, Dense Connections, RoIAlign
lr sched 2x

README.md

High-resolution networks (HRNets) for object detection

Introduction

@inproceedings{SunXLW19,
  title={Deep High-Resolution Representation Learning for Human Pose Estimation},
  author={Ke Sun and Bin Xiao and Dong Liu and Jingdong Wang},
  booktitle={CVPR},
  year={2019}
}

@article{SunZJCXLMWLW19,
  title={High-Resolution Representations for Labeling Pixels and Regions},
  author={Ke Sun and Yang Zhao and Borui Jiang and Tianheng Cheng and Bin Xiao
  and Dong Liu and Yadong Mu and Xinggang Wang and Wenyu Liu and Jingdong Wang},
  journal={CoRR},
  volume={abs/1904.04514},
  year={2019}
}

Results and Models

Faster R-CNN

Backbone Style Lr schd Mem (GB) Inf time (fps) box AP Config Download
HRNetV2p-W18 pytorch 1x 6.6 13.4 36.9 config model | log
HRNetV2p-W18 pytorch 2x 6.6 - 38.9 config model | log
HRNetV2p-W32 pytorch 1x 9.0 12.4 40.2 config model | log
HRNetV2p-W32 pytorch 2x 9.0 - 41.4 config model | log
HRNetV2p-W40 pytorch 1x 10.4 10.5 41.2 config model | log
HRNetV2p-W40 pytorch 2x 10.4 - 42.1 config model | log
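
The model cards at the top of this page report latency in seconds per image and memory in M, while the tables here use fps and GB. The sketch below shows how the two sets of figures appear to relate. The function names are illustrative, and the decimal convention (1 GB = 1000 M) is an assumption that happens to match the numbers on this page.

```python
def seconds_per_image_to_fps(seconds_per_image: float) -> float:
    """Convert a per-image latency in seconds to frames per second."""
    if seconds_per_image <= 0:
        raise ValueError("latency must be positive")
    return 1.0 / seconds_per_image


def megabytes_to_gigabytes(memory_mb: float) -> float:
    """Convert a card's 'Memory (M)' value to the table's 'Mem (GB)' value.

    Assumption: decimal units, 1 GB = 1000 M.
    """
    return memory_mb / 1000.0


# Faster R-CNN, HRNetV2p-W18, 1x: the card lists 0.07463 s/im and 6600.0 M.
print(round(seconds_per_image_to_fps(0.07463), 1))  # 13.4
print(round(megabytes_to_gigabytes(6600.0), 1))     # 6.6
```

Rows without a per-image latency in the model cards are the same rows that have no fps entry in the tables.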

Mask R-CNN

Backbone Style Lr schd Mem (GB) Inf time (fps) box AP mask AP Config Download
HRNetV2p-W18 pytorch 1x 7.0 11.7 37.7 34.2 config model | log
HRNetV2p-W18 pytorch 2x 7.0 - 39.8 36.0 config model | log
HRNetV2p-W32 pytorch 1x 9.4 11.3 41.2 37.1 config model | log
HRNetV2p-W32 pytorch 2x 9.4 - 42.5 37.8 config model | log
HRNetV2p-W40 pytorch 1x 10.9 - 42.1 37.5 config model | log
HRNetV2p-W40 pytorch 2x 10.9 - 42.8 38.2 config model | log

Cascade R-CNN

Backbone Style Lr schd Mem (GB) Inf time (fps) box AP Config Download
HRNetV2p-W18 pytorch 20e 7.0 11.0 41.2 config model | log
HRNetV2p-W32 pytorch 20e 9.4 11.0 43.3 config model | log
HRNetV2p-W40 pytorch 20e 10.8 - 43.8 config model | log

Cascade Mask R-CNN

Backbone Style Lr schd Mem (GB) Inf time (fps) box AP mask AP Config Download
HRNetV2p-W18 pytorch 20e 8.5 8.5 41.6 36.4 config model | log
HRNetV2p-W32 pytorch 20e - 8.3 44.3 38.6 config model | log
HRNetV2p-W40 pytorch 20e 12.5 - 45.1 39.3 config model | log

Hybrid Task Cascade (HTC)

Backbone Style Lr schd Mem (GB) Inf time (fps) box AP mask AP Config Download
HRNetV2p-W18 pytorch 20e 10.8 4.7 42.8 37.9 config model | log
HRNetV2p-W32 pytorch 20e 13.1 4.9 45.4 39.9 config model | log
HRNetV2p-W40 pytorch 20e 14.6 - 46.4 40.8 config model | log

FCOS

Backbone Style GN MS train Lr schd Mem (GB) Inf time (fps) box AP Config Download
HRNetV2p-W18 pytorch Y N 1x 13.0 12.9 35.3 config model | log
HRNetV2p-W18 pytorch Y N 2x 13.0 - 38.2 config model | log
HRNetV2p-W32 pytorch Y N 1x 17.5 12.9 39.5 config model | log
HRNetV2p-W32 pytorch Y N 2x 17.5 - 40.8 config model | log
HRNetV2p-W18 pytorch Y Y 2x 13.0 12.9 38.3 config model | log
HRNetV2p-W32 pytorch Y Y 2x 17.5 12.4 41.9 config model | log
HRNetV2p-W48 pytorch Y Y 2x 20.3 10.8 42.7 config model | log

Note:

  • The 28e schedule in HTC decreases the learning rate at epochs 24 and 27, for a total of 28 epochs.
  • HRNetV2 ImageNet pretrained models are in HRNets for Image Classification.
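
The schedule note describes a step decay of the learning rate. A minimal sketch of that idea follows, assuming the rate is multiplied by a fixed factor once each milestone has been reached. The base rate of 0.02 and the factor of 0.1 are placeholder values chosen for illustration and are not stated on this page, as is the convention of counting milestones in completed epochs.

```python
from bisect import bisect_right
from typing import Sequence


def step_lr(epochs_completed: int, base_lr: float,
            milestones: Sequence[int], gamma: float = 0.1) -> float:
    """Learning rate to use after `epochs_completed` epochs of training.

    The rate is multiplied by `gamma` once for every milestone that has
    already been reached. `milestones` must be sorted in ascending order.
    """
    drops = bisect_right(list(milestones), epochs_completed)
    return base_lr * gamma ** drops


# 28e schedule from the note: drops at 24 and 27, 28 epochs in total.
# base_lr=0.02 and gamma=0.1 are hypothetical values.
for done in (0, 23, 24, 26, 27):
    print(done, f"{step_lr(done, 0.02, [24, 27]):.4f}")
```

With these placeholder values the rate stays at its base value through the first 24 epochs, drops by a factor of ten for the next three, and drops again for the final epoch.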

Results

Object Detection on COCO minival
MODEL BOX AP
HTC (HRNetV2p-W40, 20e, pytorch) 46.4
HTC (HRNetV2p-W32, 20e, pytorch) 45.4
Cascade Mask R-CNN (HRNetV2p-W40, 20e, pytorch) 45.1
Cascade Mask R-CNN (HRNetV2p-W32, 20e, pytorch) 44.3
Cascade R-CNN (HRNetV2p-W40, 20e, pytorch) 43.8
Cascade R-CNN (HRNetV2p-W32, 20e, pytorch) 43.3
Mask R-CNN (HRNetV2p-W40, 2x, pytorch) 42.8
HTC (HRNetV2p-W18, 20e, pytorch) 42.8
FCOS (HRNetV2p-W48, 2x, pytorch, GN=Y, MS train=Y) 42.7
Mask R-CNN (HRNetV2p-W32, 2x, pytorch) 42.5
Faster R-CNN (HRNetV2p-W40, 2x, pytorch) 42.1
Mask R-CNN (HRNetV2p-W40, 1x, pytorch) 42.1
FCOS (HRNetV2p-W32, 2x, pytorch, GN=Y, MS train=Y) 41.9
Cascade Mask R-CNN (HRNetV2p-W18, 20e, pytorch) 41.6
Faster R-CNN (HRNetV2p-W32, 2x, pytorch) 41.4
Cascade R-CNN (HRNetV2p-W18, 20e, pytorch) 41.2
Faster R-CNN (HRNetV2p-W40, 1x, pytorch) 41.2
Mask R-CNN (HRNetV2p-W32, 1x, pytorch) 41.2
FCOS (HRNetV2p-W32, 2x, pytorch, GN=Y, MS train=N) 40.8
Faster R-CNN (HRNetV2p-W32, 1x, pytorch) 40.2
Mask R-CNN (HRNetV2p-W18, 2x, pytorch) 39.8
FCOS (HRNetV2p-W32, 1x, pytorch, GN=Y, MS train=N) 39.5
Faster R-CNN (HRNetV2p-W18, 2x, pytorch) 38.9
FCOS (HRNetV2p-W18, 2x, pytorch, GN=Y, MS train=Y) 38.3
FCOS (HRNetV2p-W18, 2x, pytorch, GN=Y, MS train=N) 38.2
Mask R-CNN (HRNetV2p-W18, 1x, pytorch) 37.7
Faster R-CNN (HRNetV2p-W18, 1x, pytorch) 36.9
FCOS (HRNetV2p-W18, 1x, pytorch, GN=Y, MS train=N) 35.3

Instance Segmentation on COCO minival
MODEL MASK AP
HTC (HRNetV2p-W40, 20e, pytorch) 40.8
HTC (HRNetV2p-W32, 20e, pytorch) 39.9
Cascade Mask R-CNN (HRNetV2p-W40, 20e, pytorch) 39.3
Cascade Mask R-CNN (HRNetV2p-W32, 20e, pytorch) 38.6
Mask R-CNN (HRNetV2p-W40, 2x, pytorch) 38.2
HTC (HRNetV2p-W18, 20e, pytorch) 37.9
Mask R-CNN (HRNetV2p-W32, 2x, pytorch) 37.8
Mask R-CNN (HRNetV2p-W40, 1x, pytorch) 37.5
Mask R-CNN (HRNetV2p-W32, 1x, pytorch) 37.1
Cascade Mask R-CNN (HRNetV2p-W18, 20e, pytorch) 36.4
Mask R-CNN (HRNetV2p-W18, 2x, pytorch) 36.0
Mask R-CNN (HRNetV2p-W18, 1x, pytorch) 34.2
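
One way to read the leaderboards together with the memory figures is to pick the highest box AP that fits a given memory budget. The sketch below does this for the six Faster R-CNN entries, with values copied from the Results and Models tables. The `Entry` type and `best_under_budget` helper are hypothetical names introduced for this example only.

```python
from typing import NamedTuple, Optional, Sequence


class Entry(NamedTuple):
    name: str
    mem_gb: float   # "Mem (GB)" column
    box_ap: float   # "box AP" column


# Faster R-CNN rows from the Results and Models section.
FASTER_RCNN = [
    Entry("HRNetV2p-W18, 1x", 6.6, 36.9),
    Entry("HRNetV2p-W18, 2x", 6.6, 38.9),
    Entry("HRNetV2p-W32, 1x", 9.0, 40.2),
    Entry("HRNetV2p-W32, 2x", 9.0, 41.4),
    Entry("HRNetV2p-W40, 1x", 10.4, 41.2),
    Entry("HRNetV2p-W40, 2x", 10.4, 42.1),
]


def best_under_budget(entries: Sequence[Entry],
                      max_mem_gb: float) -> Optional[Entry]:
    """Return the entry with the highest box AP whose memory fits the budget,
    or None when no entry fits."""
    candidates = [e for e in entries if e.mem_gb <= max_mem_gb]
    return max(candidates, key=lambda e: e.box_ap) if candidates else None


print(best_under_budget(FASTER_RCNN, 9.5))
# Entry(name='HRNetV2p-W32, 2x', mem_gb=9.0, box_ap=41.4)
```

The same helper could be applied to the other detector families by swapping in their rows; entries with a missing memory value would need to be excluded or handled separately.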