NMS Strikes Back

12 Dec 2022 · Jeffrey Ouyang-Zhang, Jang Hyun Cho, Xingyi Zhou, Philipp Krähenbühl

Detection Transformer (DETR) directly transforms queries into unique objects via one-to-one bipartite matching during training, enabling end-to-end object detection. Recently, these models have surpassed traditional detectors on COCO with undeniable elegance. However, they differ from traditional detectors in multiple designs, including model architecture and training schedules, and so the effectiveness of one-to-one matching is not fully understood. In this work, we conduct a strict comparison between the one-to-one Hungarian matching in DETRs and the one-to-many label assignments in traditional detectors with non-maximum suppression (NMS). Surprisingly, we observe that one-to-many assignment with NMS consistently outperforms standard one-to-one matching under the same setting, with a significant gain of up to 2.5 mAP. Our detector, which trains Deformable-DETR with a traditional IoU-based label assignment, achieves 50.2 COCO mAP within 12 epochs (the 1x schedule) with a ResNet50 backbone, outperforming all existing traditional and transformer-based detectors in this setting. Across multiple datasets, schedules, and architectures, we consistently show that bipartite matching is unnecessary for performant detection transformers. Furthermore, we attribute the success of detection transformers to their expressive transformer architecture. Code is available at https://github.com/jozhang97/DETA.
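
As background for the comparison, the sketch below contrasts the two assignment schemes in PyTorch. It is a minimal illustration under stated assumptions, not the DETA implementation: the function names are hypothetical, the Hungarian matching cost is reduced to negated IoU (DETR's actual cost also includes classification and L1 box terms), and the thresholds are arbitrary.

```python
import torch
from scipy.optimize import linear_sum_assignment
from torchvision.ops import box_iou, nms

# All boxes are (x1, y1, x2, y2) tensors of shape (N, 4).

def one_to_one_hungarian(pred_boxes: torch.Tensor, gt_boxes: torch.Tensor):
    """DETR-style one-to-one matching: each ground-truth box is assigned
    to exactly one prediction by solving a bipartite matching problem.
    Simplified cost: negated IoU only (real DETR also uses
    classification and L1 box costs)."""
    cost = -box_iou(pred_boxes, gt_boxes)          # (P, G), lower is better
    pred_idx, gt_idx = linear_sum_assignment(cost.cpu().numpy())
    return torch.as_tensor(pred_idx), torch.as_tensor(gt_idx)

def one_to_many_iou_assign(pred_boxes: torch.Tensor, gt_boxes: torch.Tensor,
                           iou_thresh: float = 0.6):
    """Traditional one-to-many assignment: every prediction whose best
    IoU clears the threshold becomes a positive, so a single object can
    supervise many queries during training."""
    iou = box_iou(pred_boxes, gt_boxes)            # (P, G)
    best_iou, matched_gt = iou.max(dim=1)          # best gt per prediction
    positive = best_iou >= iou_thresh              # one-to-many positives
    return positive, matched_gt

def dedup_with_nms(boxes: torch.Tensor, scores: torch.Tensor,
                   iou_thresh: float = 0.7):
    """One-to-many training yields duplicate detections per object;
    non-maximum suppression removes them at inference time."""
    keep = nms(boxes, scores, iou_threshold=iou_thresh)
    return boxes[keep], scores[keep]
```

The structural difference is the crux of the paper's comparison: Hungarian matching supervises exactly one query per object, while the IoU rule lets many queries share an object and defers de-duplication to NMS at inference time.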

Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank | Uses Extra Training Data |
|------|---------|-------|--------|-------|-------------|--------------------------|
| Object Detection | COCO-O | DETA (Swin-L) | Average mAP | 48.5 | #2 | Yes |
| Object Detection | COCO-O | DETA (Swin-L) | Effective Robustness | 20.15 | #3 | Yes |
| Object Detection | COCO test-dev | DETA (Swin-L) | box mAP | 63.5 | #14 | |
| Object Detection | COCO test-dev | DETA (Swin-L) | AP50 | 80.4 | #3 | |
| Object Detection | COCO test-dev | DETA (Swin-L) | AP75 | 70.2 | #3 | |
| Object Detection | COCO test-dev | DETA (Swin-L) | APS | 46.1 | #3 | |
| Object Detection | COCO test-dev | DETA (Swin-L) | APM | 66.9 | #3 | |
| Object Detection | COCO test-dev | DETA (Swin-L) | APL | 76.9 | #3 | |

Methods