AutoFocusFormer: Image Segmentation off the Grid

Real-world images often have highly imbalanced content density. Some areas are very uniform, e.g., large patches of blue sky, while other areas are scattered with many small objects. Yet, the commonly used successive grid downsampling strategy in deep convolutional networks treats all areas equally. As a result, small objects are represented at very few spatial locations, hurting performance on tasks such as segmentation. Intuitively, retaining more of the pixels representing small objects during downsampling helps preserve important information. To achieve this, we propose AutoFocusFormer (AFF), a local-attention transformer image recognition backbone that performs adaptive downsampling by learning to retain the pixels most important for the task. Since adaptive downsampling generates a set of pixels irregularly distributed on the image plane, we abandon the classic grid structure. Instead, we develop a novel point-based local attention block, facilitated by a balanced clustering module and a learnable neighborhood merging module, which yields representations for our point-based versions of state-of-the-art segmentation heads. Experiments show that AFF improves significantly over baseline models of similar sizes.
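To make the core idea concrete, the following is a minimal sketch of adaptive downsampling: each pixel is scored by a learned importance head, and only the top-scoring fraction is retained, yielding an irregular point set (features plus 2-D coordinates) rather than a coarser grid. The linear scoring head and the `adaptive_downsample` function are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def adaptive_downsample(features, score_weights, keep_ratio=0.25):
    """Score every pixel with a (hypothetical) learned linear head and
    keep the highest-scoring ones, producing an irregular point set.

    features:      (H, W, C) feature map
    score_weights: (C,) learned scoring vector (stand-in for the real head)
    returns:       (kept_features (k, C), kept_coords (k, 2))
    """
    H, W, C = features.shape
    flat = features.reshape(-1, C)          # (H*W, C) flatten the grid
    scores = flat @ score_weights           # (H*W,) importance per pixel
    k = max(1, int(keep_ratio * H * W))     # number of pixels to retain
    keep = np.argsort(scores)[-k:]          # indices of the top-k pixels
    ys, xs = np.divmod(keep, W)             # recover 2-D coordinates
    coords = np.stack([ys, xs], axis=1)     # (k, 2) irregular point set
    return flat[keep], coords

# Toy usage: an 8x8 map with 16 channels, keeping 25% of the pixels
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 8, 16))
w = rng.standard_normal(16)
kept, coords = adaptive_downsample(feats, w, keep_ratio=0.25)
print(kept.shape, coords.shape)  # (16, 16) (16, 2)
```

Because the retained pixels no longer form a grid, subsequent attention layers must define neighborhoods over the coordinates themselves, which is what the balanced clustering and neighborhood merging modules described above provide.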

Published at CVPR 2023.
Benchmark results (Cityscapes val):

AFF-Base (single-scale, point-based Mask2Former)
  Instance Segmentation: mask AP 46.2 (rank #4), AP50 74.2 (rank #1)
  Panoptic Segmentation: PQ 67.7 (rank #8), PQst 71.5 (rank #2), PQth 62.5 (rank #3), mIoU 83.0 (rank #10), AP 46.2 (rank #5)

AFF-Small (single-scale, point-based Mask2Former)
  Instance Segmentation: mask AP 44.0 (rank #8), AP50 72.8 (rank #2)
  Panoptic Segmentation: PQ 66.9 (rank #13), PQst 70.8 (rank #3), PQth 61.5 (rank #4), mIoU 82.2 (rank #14), AP 44.2 (rank #9)
