Task	Dataset	Model	Metric Name	Metric Value	Global Rank
3D Object Detection	DAIR-V2X-I	CoBEV	AP|R40 (moderate)	69.6	#2
3D Object Detection	DAIR-V2X-I	CoBEV	AP|R40 (easy)	82.0	#2
3D Object Detection	DAIR-V2X-I	CoBEV	AP|R40 (hard)	69.7	#2
3D Object Detection	Rope3D	CoBEV	AP@0.7	52.72	#2

CoBEV: Elevating Roadside 3D Object Detection with Depth and Height Complementarity

4 Oct 2023 · Hao Shi, Chengshan Pang, Jiaming Zhang, Kailun Yang, Yuhao Wu, Huajian Ni, Yining Lin, Rainer Stiefelhagen, Kaiwei Wang

Roadside camera-driven 3D object detection is a crucial task in intelligent transportation systems: it extends the perception range beyond the limitations of vision-centric vehicles and enhances road safety. Whereas previous studies rely on either depth or height information alone, we find that both matter and that they are in fact complementary. The depth feature encompasses precise geometric cues, whereas the height feature primarily distinguishes between categories of height intervals, essentially providing semantic context. This insight motivates Complementary-BEV (CoBEV), a novel end-to-end monocular 3D object detection framework that integrates depth and height to construct robust BEV representations. In essence, CoBEV estimates each pixel's depth and height distributions and lifts the camera features into 3D space for lateral fusion using the newly proposed two-stage complementary feature selection (CFS) module. A BEV feature distillation framework is also integrated to further improve detection accuracy using prior knowledge from the fusion-modal CoBEV teacher. We conduct extensive experiments on the public roadside camera-based 3D detection benchmarks DAIR-V2X-I and Rope3D, as well as on the private Supremind-Road dataset. The results demonstrate that CoBEV not only achieves new state-of-the-art accuracy, but also significantly improves on the robustness of previous methods in challenging long-distance scenarios and under noisy camera disturbance, and enhances generalization by a large margin in heterologous settings with drastic changes in scene and camera parameters. For the first time, the vehicle AP score of a camera model reaches 80% on DAIR-V2X-I in easy mode. The source code will be made publicly available at https://github.com/MasterHow/CoBEV.
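
The lifting step mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation. It assumes the common outer-product formulation, in which a pixel's feature vector is weighted by a categorical distribution over discrete bins, and applies it once with depth bins and once with height bins. All names, bin counts, and input values are hypothetical. Projection into the BEV grid and the CFS fusion module are omitted.

```python
import math


def softmax(logits):
    """Turn raw scores into a categorical distribution that sums to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def lift(feature, bin_probs):
    """Outer product: one probability-weighted copy of the feature per bin."""
    return [[p * f for f in feature] for p in bin_probs]


# Hypothetical toy inputs for a single image pixel.
pixel_feature = [0.5, -1.0, 2.0]       # C = 3 channels
depth_logits = [0.1, 2.0, 0.3, -1.0]   # D = 4 depth bins
height_logits = [1.5, 0.2, -0.5]       # H = 3 height bins

depth_lifted = lift(pixel_feature, softmax(depth_logits))    # shape D x C
height_lifted = lift(pixel_feature, softmax(height_logits))  # shape H x C
```

Because each distribution sums to 1, summing the lifted copies over the bins recovers the original pixel feature. In this formulation the distribution only decides where along the depth or height axis the feature is placed.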

Code

MasterHow/CoBEV official

Tasks

3D Object Detection

feature selection

Monocular 3D Object Detection

Object Detection

Datasets

DAIR-V2X Rope3D

Results from the Paper

Ranked #2 on 3D Object Detection on Rope3D
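
For readers unfamiliar with the AP|R40 metric reported for DAIR-V2X-I, the following is a minimal sketch. It assumes the KITTI-style definition, in which interpolated precision is averaged over 40 equally spaced recall levels (1/40, 2/40, ..., 1). The function name and inputs are hypothetical. The benchmark's own matching rules, such as IoU thresholds and difficulty filtering, are not stated on this page and are not modeled here.

```python
def ap_r40(recalls, precisions):
    """40-point interpolated AP over recall levels 1/40, 2/40, ..., 1."""
    total = 0.0
    for k in range(1, 41):
        level = k / 40
        # Interpolated precision: best precision at any recall >= level.
        candidates = [p for r, p in zip(recalls, precisions) if r >= level]
        total += max(candidates) if candidates else 0.0
    return total / 40


# A detector that reaches only 50% recall, at perfect precision,
# is credited for the 20 recall levels up to and including 0.5.
print(ap_r40([0.5], [1.0]))  # 0.5
```

The function returns a fraction in [0, 1]. The leaderboard values above are percentages, so a result would be multiplied by 100 for comparison.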

Methods

Feature Selection
