TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Instance Segmentation	BDD100K val	Mask R-CNN	AP	20.5	# 3
Nuclear Segmentation	Cell17	Mask R-CNN	F1-score	0.8004	# 2
Nuclear Segmentation	Cell17	Mask R-CNN	Dice	0.707	# 2
Nuclear Segmentation	Cell17	Mask R-CNN	Hausdorff	12.6723	# 2
Panoptic Segmentation	Cityscapes val	Mask R-CNN+COCO	PQth	54.0	# 19
Object Detection	COCO minival	Mask R-CNN (ResNet-50-FPN)	box AP	37.7	# 188
Object Detection	COCO minival	Mask R-CNN (ResNet-101-FPN)	box AP	40.0	# 170
Object Detection	COCO minival	Mask R-CNN (ResNeXt-101-FPN)	box AP	36.7	# 190
Object Detection	COCO minival	Mask R-CNN (ResNeXt-101-FPN)	AP50	59.5	# 80
Object Detection	COCO minival	Mask R-CNN (ResNeXt-101-FPN)	AP75	38.9	# 91
Object Detection	COCO-O	Mask R-CNN (ResNet-50)	Average mAP	17.1	# 35
Object Detection	COCO-O	Mask R-CNN (ResNet-50)	Effective Robustness	-0.11	# 37
Keypoint Detection	COCO test-challenge	Mask R-CNN*	AR	75.4	# 5
Keypoint Detection	COCO test-challenge	Mask R-CNN*	ARM	70.2	# 5
Keypoint Detection	COCO test-challenge	Mask R-CNN*	AP	68.9	# 6
Keypoint Detection	COCO test-challenge	Mask R-CNN*	AP50	89.2	# 5
Keypoint Detection	COCO test-challenge	Mask R-CNN*	AP75	75.2	# 5
Keypoint Detection	COCO test-challenge	Mask R-CNN*	APL	82.6	# 4
Keypoint Detection	COCO test-challenge	Mask R-CNN*	AR50	93.2	# 5
Keypoint Detection	COCO test-challenge	Mask R-CNN*	AR75	81.2	# 5
Keypoint Detection	COCO test-challenge	Mask R-CNN*	ARL	76.8	# 5
Object Detection	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	box mAP	39.8	# 182
Object Detection	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	AP50	62.3	# 102
Object Detection	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	AP75	43.4	# 128
Object Detection	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	APS	22.1	# 116
Object Detection	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	APM	43.2	# 116
Object Detection	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	APL	51.2	# 122
Object Detection	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	Hardware Burden	9G	# 1
Object Detection	COCO test-dev	Mask R-CNN (ResNet-101-FPN)	box mAP	38.2	# 195
Object Detection	COCO test-dev	Mask R-CNN (ResNet-101-FPN)	AP50	60.3	# 121
Object Detection	COCO test-dev	Mask R-CNN (ResNet-101-FPN)	AP75	41.7	# 136
Object Detection	COCO test-dev	Mask R-CNN (ResNet-101-FPN)	APS	20.1	# 130
Object Detection	COCO test-dev	Mask R-CNN (ResNet-101-FPN)	APM	41.1	# 128
Object Detection	COCO test-dev	Mask R-CNN (ResNet-101-FPN)	APL	50.2	# 131
Object Detection	COCO test-dev	Mask R-CNN (ResNet-101-FPN)	Hardware Burden	9G	# 1
Instance Segmentation	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	mask AP	37.1	# 91
Instance Segmentation	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	AP50	60.0	# 30
Instance Segmentation	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	AP75	39.4	# 29
Instance Segmentation	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	APS	16.9	# 34
Instance Segmentation	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	APM	39.9	# 31
Instance Segmentation	COCO test-dev	Mask R-CNN (ResNeXt-101-FPN)	APL	53.5	# 22
Keypoint Detection	COCO test-dev	Mask R-CNN	APL	71.4	# 13
Keypoint Detection	COCO test-dev	Mask R-CNN	APM	57.8	# 14
Keypoint Detection	COCO test-dev	Mask R-CNN	AP50	87.3	# 9
Keypoint Detection	COCO test-dev	Mask R-CNN	AP75	68.7	# 11
Pose Estimation	COCO test-dev	Mask-RCNN	AP	63.1	# 42
Pose Estimation	COCO test-dev	Mask-RCNN	AP50	87.3	# 35
Pose Estimation	COCO test-dev	Mask-RCNN	AP75	68.7	# 39
Pose Estimation	COCO test-dev	Mask-RCNN	APL	71.4	# 33
Multi-Person Pose Estimation	CrowdPose	Mask R-CNN	mAP @0.5:0.95	57.2	# 21
Multi-Person Pose Estimation	CrowdPose	Mask R-CNN	AP Easy	69.4	# 17
Multi-Person Pose Estimation	CrowdPose	Mask R-CNN	AP Medium	57.9	# 19
Multi-Person Pose Estimation	CrowdPose	Mask R-CNN	AP Hard	45.8	# 17
Keypoint Estimation	GRIT	Mask R-CNN	Keypoint (ablation)	70.8	# 1
Keypoint Estimation	GRIT	Mask R-CNN	Keypoint (test)	70.6	# 1
Object Localization	GRIT	Mask R-CNN	Localization (ablation)	44.7	# 3
Object Localization	GRIT	Mask R-CNN	Localization (test)	45.1	# 3
Object Segmentation	GRIT	Mask R-CNN	Segmentation (ablation)	26.2	# 2
Object Segmentation	GRIT	Mask R-CNN	Segmentation (test)	26.2	# 2
Object Detection	iSAID	Mask-RCNN+	Average Precision	37.18	# 4
Object Detection	iSAID	Mask-RCNN	Average Precision	36.50	# 5
Instance Segmentation	iSAID	Mask-RCNN	Average Precision	36.50	# 4
Instance Segmentation	iSAID	Mask-RCNN+	Average Precision	37.18	# 3
Multi-tissue Nucleus Segmentation	Kumar	Mask R-CNN (e)	Dice	0.760	# 17
Multi-tissue Nucleus Segmentation	Kumar	Mask R-CNN (e)	Hausdorff Distance (mm)	50.9	# 12
Multi-Human Parsing	MHP v1.0	Mask R-CNN	AP 0.5	52.68%	# 2
Multi-Human Parsing	MHP v2.0	Mask R-CNN	AP 0.5	14.9%	# 5
Real-Time Object Detection	MS COCO	Mask R-CNN X-152-32x8d	box AP	45.2	# 40
Keypoint Detection	MS COCO	Mask R-CNN	Validation AP	69.2	# 12
Keypoint Detection	MS COCO	Mask R-CNN	Test AP	63.1	# 15
Multi-Person Pose Estimation	OCHuman	Mask R-CNN	Validation AP	20.2	# 7
Multi-Person Pose Estimation	OCHuman	Mask R-CNN	AP50	33.2	# 8
Multi-Person Pose Estimation	OCHuman	Mask R-CNN	AP75	24.5	# 8
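Several metrics in the table above are overlap scores. The following is a minimal sketch, not the official evaluation code, of the two most common ones. Intersection over union (IoU) is the quantity thresholded at 0.5 and 0.75 for the box and mask AP50/AP75 rows in COCO-style evaluation (the keypoint rows threshold object keypoint similarity instead). Dice is the overlap score reported for Cell17 and Kumar. The function names `mask_iou` and `dice` are hypothetical, and masks are assumed to be equal-sized nested lists of 0/1.

```python
def mask_iou(a, b):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 0.0


def dice(a, b):
    """Dice coefficient: 2 * |A ∩ B| / (|A| + |B|)."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    total = sum(sum(r) for r in a) + sum(sum(r) for r in b)
    return 2 * inter / total if total else 0.0


# Toy example: two 2x2 squares that overlap in a single column.
pred = [[1, 1, 0],
        [1, 1, 0],
        [0, 0, 0]]
gt = [[0, 1, 1],
      [0, 1, 1],
      [0, 0, 0]]

iou = mask_iou(pred, gt)   # 2 shared pixels / 6 covered pixels
print(iou >= 0.5)          # False: this prediction would not count as a match for AP50
print(dice(pred, gt))      # 0.5
```

Dice is always at least as large as IoU for the same pair of masks, which is one reason the two are not directly comparable across rows.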

Mask R-CNN

ICCV 2017 · Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick

We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition. Code has been made available at: https://github.com/facebookresearch/Detectron
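To make the "mask branch in parallel with the box branch" idea concrete, here is a minimal sketch of a per-RoI multi-task loss. It assumes the formulation described in the Mask R-CNN paper, L = L_cls + L_box + L_mask, where L_mask is the average per-pixel binary cross-entropy computed only on the mask predicted for the ground-truth class. All function and parameter names (`roi_loss`, `masks_per_class`, and so on) are hypothetical, and plain Python lists stand in for tensors. This is not the Detectron implementation.

```python
import math


def cross_entropy(class_probs, gt_class):
    """Classification loss: negative log-probability of the true class."""
    return -math.log(class_probs[gt_class])


def smooth_l1(pred, target):
    """Box regression loss: smooth L1 summed over the box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def mask_bce(mask_logits, gt_mask):
    """Average per-pixel sigmoid cross-entropy, in a numerically stable form."""
    losses = []
    for row_z, row_t in zip(mask_logits, gt_mask):
        for z, t in zip(row_z, row_t):
            losses.append(max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z))))
    return sum(losses) / len(losses)


def roi_loss(class_probs, box_pred, box_target, masks_per_class, gt_class, gt_mask):
    """Sum of the three branch losses for one region of interest."""
    l_cls = cross_entropy(class_probs, gt_class)
    l_box = smooth_l1(box_pred, box_target)
    # Only the mask for the ground-truth class contributes, so the masks of
    # different classes do not compete with each other.
    l_mask = mask_bce(masks_per_class[gt_class], gt_mask)
    return l_cls + l_box + l_mask


# Toy RoI: 2 classes, a perfect box, and an uninformative 1x2 mask (logits = 0).
loss = roi_loss(
    class_probs=[0.5, 0.5],
    box_pred=[0.0, 0.0, 0.0, 0.0],
    box_target=[0.0, 0.0, 0.0, 0.0],
    masks_per_class=[[[5.0, 5.0]], [[0.0, 0.0]]],
    gt_class=1,
    gt_mask=[[1, 0]],
)
print(loss)
```

Because the mask for class 0 is ignored when the ground-truth class is 1, its large logits have no effect on the result.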

Code

REPOSITORY	STARS
tensorflow/models	76,582
facebookresearch/detectron2	28,641
facebookresearch/detectron	26,140
PaddlePaddle/PaddleDetection	12,034
facebookresearch/maskrcnn-benchmark	9,244

See all 172 implementations

Tasks

3D Instance Segmentation • Human Part Segmentation • Instance Segmentation • Keypoint Detection • Keypoint Estimation • Multi-Human Parsing • Multi-Person Pose Estimation • Multi-tissue Nucleus Segmentation • Nuclear Segmentation • Object • Object Detection • Object Localization • Object Segmentation • Panoptic Segmentation • Pose Estimation • Real-Time Object Detection • Segmentation • Semantic Segmentation

Datasets

Introduced in the Paper: PRID2011

Used in the Paper: MS COCO • Cityscapes • ScanNet • BDD100K • CrowdPose • iSAID • OCHuman • COCO-O • Kumar • MHP • GRIT • Cell

Results from the Paper

Ranked #1 on Keypoint Estimation on GRIT

Methods

1x1 Convolution • Average Pooling • Batch Normalization • Bottleneck Residual Block • Convolution • Global Average Pooling • Grouped Convolution • Kaiming Initialization • Mask R-CNN • Max Pooling • ReLU • Residual Block • Residual Connection • ResNet • ResNeXt • ResNeXt Block • RoIAlign • RPN • Softmax
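Of the methods listed above, RoIAlign is the one specific to Mask R-CNN. It avoids rounding a region's coordinates to the feature grid and instead reads the feature map at fractional positions using bilinear interpolation. The sketch below is a simplified, single-channel illustration under stated assumptions: the region is given directly in feature-map coordinates as `(x1, y1, x2, y2)`, each output bin averages `samples` x `samples` regularly spaced points, and integer coordinates coincide with pixel positions. The names `bilinear` and `roi_align` are hypothetical, and real implementations differ in coordinate conventions.

```python
import math


def bilinear(feature, y, x):
    """Bilinearly interpolate a 2D feature map at a fractional (y, x) position."""
    h, w = len(feature), len(feature[0])
    # Clamp so that sample points on the border stay inside the map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * feature[y0][x0] + wx * feature[y0][x1]
    bottom = (1 - wx) * feature[y1][x0] + wx * feature[y1][x1]
    return (1 - wy) * top + wy * bottom


def roi_align(feature, roi, out_h, out_w, samples=2):
    """Pool a region into an out_h x out_w grid without quantizing coordinates."""
    x1, y1, x2, y2 = roi
    bin_h = (y2 - y1) / out_h
    bin_w = (x2 - x1) / out_w
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    # Regularly spaced sample points inside bin (i, j).
                    y = y1 + (i + (sy + 0.5) / samples) * bin_h
                    x = x1 + (j + (sx + 0.5) / samples) * bin_w
                    acc += bilinear(feature, y, x)
            row.append(acc / (samples * samples))
        out.append(row)
    return out


# Feature map whose value equals the column index, so results are easy to check.
ramp = [[float(c) for c in range(4)] for _ in range(4)]
pooled = roi_align(ramp, roi=(0.0, 0.0, 2.0, 2.0), out_h=2, out_w=2)
print(pooled)  # each bin holds the x-coordinate of its centre: 0.5 and 1.5
```

Since bilinear interpolation reproduces linear functions exactly, each pooled value on the ramp equals the x-coordinate of the bin centre, which makes the absence of rounding visible.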
