TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Monocular 3D Object Detection	KITTI Cars Easy	CIE	AP Easy	31.55	# 1
Monocular 3D Object Detection	KITTI Cars Hard	CIE	AP Hard	17.83	# 1
Monocular 3D Object Detection	KITTI Cars Moderate	CIE	AP Medium	20.95	# 1

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/consistency-of-implicit-and-explicit-features/monocular-3d-object-detection-on-kitti-cars-2)](https://paperswithcode.com/sota/monocular-3d-object-detection-on-kitti-cars-2?p=consistency-of-implicit-and-explicit-features)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/consistency-of-implicit-and-explicit-features/monocular-3d-object-detection-on-kitti-cars-1)](https://paperswithcode.com/sota/monocular-3d-object-detection-on-kitti-cars-1?p=consistency-of-implicit-and-explicit-features)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/consistency-of-implicit-and-explicit-features/monocular-3d-object-detection-on-kitti-cars)](https://paperswithcode.com/sota/monocular-3d-object-detection-on-kitti-cars?p=consistency-of-implicit-and-explicit-features)`

Consistency of Implicit and Explicit Features Matters for Monocular 3D Object Detection

16 Jul 2022 · Qian Ye, Ling Jiang, Wang Zhen, Yuyang Du

Low-cost autonomous agents, including autonomous driving vehicles, chiefly adopt monocular 3D object detection to perceive the surrounding environment. This paper studies 3D intermediate representation methods, which generate intermediate 3D features for subsequent tasks. For example, the 3D features can serve as input not only for detection but also for end-to-end prediction and/or planning that require a bird's-eye-view (BEV) feature representation. In this study, we found that, when generating the 3D representation, previous methods do not maintain consistency between the objects' implicit poses in the latent space, especially orientations, and the explicitly observed poses in Euclidean space, which can substantially hurt model performance. To tackle this problem, we first present a novel monocular detection method, the first to be pose-aware so as to purposefully guarantee that the poses are consistent between the implicit and explicit features. Second, we introduce a local ray attention mechanism to efficiently transform image features to voxels at accurate 3D locations. Third, we propose a handcrafted Gaussian positional encoding function, which outperforms the sinusoidal encoding function while retaining the benefit of being continuous. Results show that our method improves on the state-of-the-art 3D intermediate representation method by 3.15%. We rank 1st among all reported monocular methods on both the 3D and BEV detection benchmarks on the KITTI leaderboard as of the result's submission time.
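
To make the positional-encoding comparison in the abstract concrete, the sketch below contrasts a standard sinusoidal encoding with one possible Gaussian encoding of a scalar coordinate. The abstract does not state the form of the paper's handcrafted function, so the radial-basis formulation here is an assumption for illustration only; the names `sinusoidal_encoding`, `gaussian_encoding`, `centers`, and `sigma` are hypothetical and do not come from the paper.

```python
import math


def sinusoidal_encoding(x, num_freqs=4):
    """Standard sinusoidal encoding: [sin(2^k x), cos(2^k x)] for k < num_freqs."""
    out = []
    for k in range(num_freqs):
        out.append(math.sin((2.0 ** k) * x))
        out.append(math.cos((2.0 ** k) * x))
    return out


def gaussian_encoding(x, centers, sigma=0.25):
    """Assumed Gaussian (radial-basis) encoding, NOT the paper's function.

    Each output channel is the response of x to a Gaussian bump placed at
    one of the fixed centers, so the encoding varies smoothly with x.
    """
    return [math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)) for c in centers]


# Eight evenly spaced centers on [0, 1] (an arbitrary illustrative choice).
centers = [i / 7 for i in range(8)]

sin_code = sinusoidal_encoding(0.5)
gauss_code = gaussian_encoding(0.5, centers)
```

Both encodings are continuous in `x`. In this assumed form, each Gaussian channel peaks at its own center and decays with distance, whereas the sinusoidal channels are periodic.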

Code

No code implementations yet.

Tasks

3D Object Detection

Autonomous Driving

Monocular 3D Object Detection

Object Detection

Datasets

KITTI

Results from the Paper

Ranked #1 on Monocular 3D Object Detection on KITTI Cars Moderate (using extra training data)

Methods

AWARE
