BAEFormer: Bi-Directional and Early Interaction Transformers for Bird's Eye View Semantic Segmentation

Bird's Eye View (BEV) semantic segmentation is a critical task in autonomous driving. However, existing Transformer-based methods struggle to transform Perspective View (PV) features to BEV because their interaction mechanisms are unidirectional and occur late (posterior) in the pipeline. To address this issue, we propose a novel Bi-directional and Early Interaction Transformers framework named BAEFormer, consisting of (i) an early-interaction PV-BEV pipeline and (ii) a bi-directional cross-attention mechanism. Moreover, we find that the resolution of the image feature maps used in the cross-attention module has only a limited effect on final performance. Based on this observation, we propose to enlarge the input images and downsample the multi-view image features for cross-interaction, further improving accuracy while keeping the amount of computation controllable. Our proposed method for BEV semantic segmentation achieves state-of-the-art performance at real-time inference speed on the nuScenes dataset, i.e., 38.9 mIoU at 45 FPS on a single A100 GPU.
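
The page carries no reference implementation, but the bi-directional cross-attention idea can be sketched in a few lines. The PyTorch snippet below is a hypothetical illustration under stated assumptions, not the authors' code: the module name, tensor shapes, and the choice of `nn.MultiheadAttention` are all assumptions.

```python
import torch
import torch.nn as nn

class BiDirectionalCrossAttention(nn.Module):
    """Minimal sketch: BEV queries and PV image tokens attend to each other."""

    def __init__(self, dim: int = 128, num_heads: int = 4):
        super().__init__()
        # BEV queries attend to the (possibly downsampled) multi-view image tokens ...
        self.bev_from_pv = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # ... and the image tokens attend back to the BEV queries.
        self.pv_from_bev = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, bev: torch.Tensor, pv: torch.Tensor):
        # bev: (B, H*W, C) BEV query tokens; pv: (B, N, C) flattened PV tokens.
        bev_upd, _ = self.bev_from_pv(query=bev, key=pv, value=pv)
        pv_upd, _ = self.pv_from_bev(query=pv, key=bev, value=bev)
        # Residual connections keep both streams' original content.
        return bev + bev_upd, pv + pv_upd

# Hypothetical usage: a 50x50 BEV grid and 6 camera views whose feature maps
# were downsampled before cross-interaction, per the abstract's observation
# that feature-map resolution in cross-attention barely affects accuracy.
layer = BiDirectionalCrossAttention(dim=128, num_heads=4)
bev = torch.randn(2, 50 * 50, 128)
pv = torch.randn(2, 6 * 14 * 25, 128)
bev, pv = layer(bev, pv)
```

Updating both streams in each block is what distinguishes this from the usual unidirectional PV-to-BEV lifting; the early-interaction aspect would correspond to applying such blocks from the first stages of the backbone rather than only after full PV encoding.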

Datasets

nuScenes

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| Bird's-Eye View Semantic Segmentation | nuScenes | BAEFormer | IoU veh - 224x480 - No vis filter - 100x100 at 0.5 | 36.0 | #5 |
| Bird's-Eye View Semantic Segmentation | nuScenes | BAEFormer | IoU veh - 448x800 - No vis filter - 100x100 at 0.5 | 37.8 | #5 |
| Bird's-Eye View Semantic Segmentation | nuScenes | BAEFormer | IoU veh - 224x480 - Vis filter. - 100x100 at 0.5 | 38.9 | #6 |
| Bird's-Eye View Semantic Segmentation | nuScenes | BAEFormer | IoU veh - 448x800 - Vis filter. - 100x100 at 0.5 | 41.0 | #5 |
