Multi-Level Fusion Based 3D Object Detection From Monocular Images

CVPR 2018  ·  Bin Xu, Zhenzhong Chen

In this paper, we present an end-to-end deep learning based framework for 3D object detection from a single monocular image. A deep convolutional neural network is introduced for simultaneous 2D and 3D object detection. First, 2D region proposals are generated through a region proposal network. Then shared features are learned within the proposals to predict the class probability, 2D bounding box, orientation, dimension, and 3D location. We adopt a stand-alone module to predict the disparity and extract features from the computed point cloud, so that features from the original image and the point cloud are fused at different levels for accurate 3D localization. The estimated disparity is also used for front-view feature encoding to enhance the input image, which is regarded as an input-fusion process. The proposed algorithm directly outputs both 2D and 3D object detection results in an end-to-end fashion with only a single RGB image as input. Experimental results on the challenging KITTI benchmark demonstrate that our algorithm significantly outperforms state-of-the-art methods that use only monocular images.
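
To make the fusion scheme concrete, the following is a minimal PyTorch-style sketch of the pipeline described in the abstract: a stand-alone disparity module, input fusion of RGB with the disparity-based front-view encoding, a point-cloud branch built by back-projecting the disparity, and per-RoI heads for class, 2D box, orientation, dimension, and 3D location. All module names, channel sizes, intrinsics, and the use of `roi_align` are illustrative assumptions, not the authors' released implementation.

```python
# Sketch of a multi-level (input- and feature-level) fusion detector.
# Hypothetical architecture; shapes and layers are placeholders.
import torch
import torch.nn as nn
from torchvision.ops import roi_align


class MultiLevelFusion3D(nn.Module):
    def __init__(self, feat_ch=64, roi_size=7, num_classes=3):
        super().__init__()
        # Stand-alone module that predicts a dense disparity map from RGB.
        self.disparity_net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )
        # Image branch: consumes RGB concatenated with the disparity encoding
        # (input fusion).
        self.rgb_backbone = nn.Sequential(
            nn.Conv2d(3 + 1, feat_ch, 3, stride=4, padding=1), nn.ReLU(),
        )
        # Point-cloud branch: simplified here to a conv over an XYZ front-view
        # map back-projected from the disparity.
        self.xyz_backbone = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, stride=4, padding=1), nn.ReLU(),
        )
        # Per-RoI heads over the fused features (feature-level fusion).
        fused = 2 * feat_ch * roi_size * roi_size
        self.cls_head = nn.Linear(fused, num_classes)
        self.box2d_head = nn.Linear(fused, 4)
        self.orient_head = nn.Linear(fused, 2)   # sin/cos of observation angle
        self.dim_head = nn.Linear(fused, 3)      # height, width, length
        self.loc_head = nn.Linear(fused, 3)      # 3D object center
        self.roi_size = roi_size

    def disparity_to_xyz(self, disp, fx=721.0, fy=721.0, cx=609.0, cy=172.0, baseline=0.54):
        # Back-project disparity to a camera-frame XYZ map (pinhole model).
        # Intrinsics are rough KITTI-like values, used only for illustration.
        b, _, h, w = disp.shape
        z = fx * baseline / disp.clamp(min=1e-3)
        u = torch.arange(w, device=disp.device).view(1, 1, 1, w).expand(b, 1, h, w)
        v = torch.arange(h, device=disp.device).view(1, 1, h, 1).expand(b, 1, h, w)
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        return torch.cat([x, y, z], dim=1)

    def forward(self, image, proposals):
        # proposals: list (per image) of [N, 4] boxes in pixel coordinates,
        # e.g. produced by a region proposal network (omitted in this sketch).
        disp = self.disparity_net(image).clamp(min=1e-3)
        xyz = self.disparity_to_xyz(disp)

        # Input fusion: concatenate the disparity encoding with the RGB image.
        rgb_feat = self.rgb_backbone(torch.cat([image, disp], dim=1))
        xyz_feat = self.xyz_backbone(xyz)

        # Feature fusion: pool RoI features from both branches and concatenate.
        scale = 1.0 / 4.0  # backbone stride
        rgb_roi = roi_align(rgb_feat, proposals, self.roi_size, spatial_scale=scale)
        xyz_roi = roi_align(xyz_feat, proposals, self.roi_size, spatial_scale=scale)
        fused = torch.cat([rgb_roi, xyz_roi], dim=1).flatten(1)

        return {
            "cls": self.cls_head(fused),
            "box2d": self.box2d_head(fused),
            "orientation": self.orient_head(fused),
            "dimensions": self.dim_head(fused),
            "location": self.loc_head(fused),
        }


if __name__ == "__main__":
    model = MultiLevelFusion3D()
    img = torch.randn(1, 3, 384, 1280)
    boxes = [torch.tensor([[100.0, 150.0, 300.0, 320.0]])]
    out = model(img, boxes)
    print({k: v.shape for k, v in out.items()})
```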

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| Vehicle Pose Estimation | KITTI Cars Hard | ML-Fusion | Average Orientation Similarity | 76.37 | #12 |
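
For reference, Average Orientation Similarity (AOS) is the joint detection-and-orientation metric defined for the KITTI benchmark (Geiger et al., 2012); the formulation below follows that definition with the original 11 recall levels and is included only as background.

$$\mathrm{AOS} = \frac{1}{11} \sum_{r \in \{0, 0.1, \dots, 1\}} \max_{\tilde{r}:\,\tilde{r} \ge r} s(\tilde{r}), \qquad s(r) = \frac{1}{|D(r)|} \sum_{i \in D(r)} \frac{1 + \cos \Delta_\theta^{(i)}}{2}\,\delta_i$$

where $D(r)$ is the set of detections at recall $r$, $\Delta_\theta^{(i)}$ is the angular difference between the estimated and ground-truth orientation of detection $i$, and $\delta_i = 1$ if detection $i$ is matched to a ground-truth box ($\delta_i = 0$ otherwise), so unmatched detections are penalized.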

Methods

No methods listed for this paper.