Deep High-Resolution Representation Learning for Visual Recognition

High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork that is formed by connecting high-to-low resolution convolutions in series (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation...
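
To make the resolution loss of the serial design concrete, here is a minimal sketch of the arithmetic. It assumes that each stage halves the spatial size (stride 2 with "same"-style padding) and that there are five such stages, which gives the overall stride of 32 commonly seen in ResNet- and VGG-style classifiers. The function name `serial_resolutions` and the 224-pixel input are illustrative assumptions and do not come from the paper.

```python
def serial_resolutions(input_size, num_stages, stride=2):
    """Return the spatial size after each serial downsampling stage.

    Assumes "same"-style padding, so each stage maps a side length n
    to ceil(n / stride). This is an illustrative simplification.
    """
    sizes = []
    size = input_size
    for _ in range(num_stages):
        size = -(-size // stride)  # ceiling division
        sizes.append(size)
    return sizes


# Hypothetical 224-pixel input passed through five stride-2 stages.
sizes = serial_resolutions(224, 5)
print(sizes)              # [112, 56, 28, 14, 7]
print(224 // sizes[-1])   # overall downsampling factor: 32
```

Under these assumptions the final feature map is 32 times smaller per side than the input, which is why such frameworks need a separate step to recover a high-resolution representation for position-sensitive tasks.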

Results from the Paper


| Task | Dataset | Model | Metric name | Metric value | Global rank |
| --- | --- | --- | --- | --- | --- |
| Semantic Segmentation | CamVid | HRNetV2 (HRNetV2-W48) | Mean IoU | 78.47% | #2 |
| Semantic Segmentation | Cityscapes val | HRNetV2 (HRNetV2-W48) | mIoU | 81.1% | #3 |
| Semantic Segmentation | Cityscapes val | HRNetV2 (HRNetV2-W40) | mIoU | 80.2% | #7 |
| Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W48) | box AP | 44.6 | #22 |
| | | | AP50 | 62.7 | #20 |
| | | | AP75 | 48.7 | #13 |
| | | | APS | 26.3 | #16 |
| | | | APM | 48.1 | #11 |
| | | | APL | 58.5 | #14 |
| Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W32) | box AP | 40.9 | #42 |
| | | | AP50 | 61.8 | #24 |
| | | | AP75 | 44.8 | #28 |
| | | | APS | 24.4 | #25 |
| | | | APM | 43.7 | #30 |
| | | | APL | 53.3 | #28 |
| Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W18) | box AP | 38.0 | #61 |
| | | | AP50 | 58.9 | #39 |
| | | | AP75 | 41.5 | #42 |
| | | | APS | 22.6 | #32 |
| | | | APM | 40.8 | #39 |
| | | | APL | 49.6 | #41 |
| Object Detection | COCO minival | HTC (HRNetV2p-W48) | box AP | 47.0 | #12 |
| | | | APS | 28.8 | #5 |
| | | | APM | 50.3 | #5 |
| | | | APL | 62.2 | #5 |
| Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W48, cascade) | box AP | 46.0 | #16 |
| | | | APS | 27.5 | #10 |
| | | | APM | 48.9 | #7 |
| | | | APL | 60.1 | #8 |
| Object Detection | COCO minival | HTC (HRNetV2p-W32) | box AP | 45.3 | #18 |
| | | | APS | 27 | #12 |
| | | | APM | 48.4 | #9 |
| | | | APL | 59.5 | #10 |
| Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W32, cascade) | box AP | 44.5 | #23 |
| | | | APS | 26.1 | #17 |
| | | | APM | 47.9 | #12 |
| | | | APL | 58.5 | #14 |
| Object Detection | COCO minival | HTC (HRNetV2p-W18) | box AP | 43.1 | #30 |
| | | | APS | 26.6 | #15 |
| | | | APM | 46 | #20 |
| | | | APL | 56.9 | #20 |
| Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W32) | box AP | 42.3 | #34 |
| | | | APS | 25.0 | #23 |
| | | | APM | 45.4 | #22 |
| | | | APL | 54.9 | #25 |
| Instance Segmentation | COCO minival | HTC (HRNetV2p-W48) | mask AP | 41.0 | #9 |
| Object Detection | COCO minival | Mask R-CNN (HRNetV2p-W18) | box AP | 39.2 | #54 |
| | | | APS | 23.7 | #28 |
| | | | APM | 41.7 | #36 |
| | | | APL | 51.0 | #37 |
| Object Detection | COCO minival | Faster R-CNN (HRNetV2p-W48) | box AP | 41.8 | #36 |
| | | | AP50 | 62.8 | #19 |
| | | | AP75 | 45.9 | #23 |
| | | | APS | 25.0 | #23 |
| | | | APM | 44.7 | #23 |
| | | | APL | 54.6 | #26 |
| Object Detection | COCO minival | Cascade R-CNN (HRNetV2p-W18) | box AP | 41.3 | #40 |
| | | | AP50 | 59.2 | #38 |
| | | | AP75 | 44.9 | #27 |
| | | | APS | 23.7 | #28 |
| | | | APM | 44.2 | #27 |
| | | | APL | 54.1 | #27 |
| Object Detection | COCO test-dev | HTC (HRNetV2p-W48) | box AP | 47.3 | #31 |
| | | | AP50 | 65.9 | #37 |
| | | | AP75 | 51.2 | #37 |
| | | | APS | 28.0 | #40 |
| | | | APM | 49.7 | #40 |
| | | | APL | 59.8 | #32 |
| Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W32 + cascade) | box AP | 44.7 | #48 |
| | | | AP50 | 62.5 | #60 |
| | | | AP75 | 48.6 | #51 |
| | | | APS | 25.8 | #54 |
| | | | APM | 47.1 | #56 |
| | | | APL | 56.3 | #50 |
| Object Detection | COCO test-dev | CenterNet (HRNetV2-W48) | box AP | 43.5 | #54 |
| | | | AP50 | 62.1 | #63 |
| | | | AP75 | 46.5 | #63 |
| | | | APS | 22.2 | #74 |
| | | | APM | 46.5 | #61 |
| | | | APL | 57.8 | #40 |
| Object Detection | COCO test-dev | Cascade R-CNN (HRNetV2p-W48) | box AP | 44.8 | #47 |
| | | | AP50 | 63.1 | #56 |
| | | | AP75 | 48.6 | #51 |
| | | | APS | 26.0 | #52 |
| | | | APM | 47.3 | #54 |
| | | | APL | 56.3 | #50 |
| Object Detection | COCO test-dev | FCOS (HRNetV2p-W48) | box AP | 40.5 | #72 |
| | | | AP50 | 59.3 | #76 |
| | | | AP75 | 43.3 | #82 |
| | | | APS | 23.4 | #68 |
| | | | APM | 42.6 | #78 |
| | | | APL | 51.0 | #76 |
| Object Detection | COCO test-dev | Mask R-CNN (HRNetV2p-W48 + cascade) | box AP | 46.1 | #39 |
| | | | AP50 | 64.0 | #50 |
| | | | AP75 | 50.3 | #44 |
| | | | APS | 27.1 | #44 |
| | | | APM | 48.6 | #46 |
| | | | APL | 58.3 | #37 |
| Object Detection | COCO test-dev | Faster R-CNN (HRNetV2p-W48) | box AP | 42.4 | #61 |
| | | | AP50 | 63.6 | #53 |
| | | | AP75 | 46.4 | #64 |
| | | | APS | 24.9 | #59 |
| | | | APM | 44.6 | #70 |
| | | | APL | 53.0 | #66 |
| Semantic Segmentation | PASCAL Context | CFNet (ResNet-101) | mIoU | 54.0 | #9 |
| Semantic Segmentation | PASCAL Context | HRNetV2 (HRNetV2-W48) | mIoU | 54.0 | #9 |

Methods used in the Paper