TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
3D Object Detection	ScanNetV2	V-DETR	mAP@0.25	77.8	# 2
3D Object Detection	ScanNetV2	V-DETR	mAP@0.5	65.9	# 1
3D Object Detection	SUN-RGBD val	V-DETR	mAP@0.25	68.0	# 3
3D Object Detection	SUN-RGBD val	V-DETR	mAP@0.5	51.1	# 4

V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection

8 Aug 2023 · Yichao Shen, Zigang Geng, Yuhui Yuan, Yutong Lin, Ze Liu, Chunyu Wang, Han Hu, Nanning Zheng, Baining Guo

We introduce a highly performant 3D object detector for point clouds using the DETR framework. Prior attempts all yield suboptimal results because they fail to learn accurate inductive biases from the limited scale of training data. In particular, the queries often attend to points that are far away from the target objects, violating the locality principle in object detection. To address this limitation, we introduce a novel 3D Vertex Relative Position Encoding (3DV-RPE) method, which computes a position encoding for each point based on its position relative to the 3D boxes predicted by the queries in each decoder layer. This gives the model clear information that guides it to focus on points near the objects, in accordance with the principle of locality. In addition, we systematically improve the pipeline in several respects, such as data normalization, based on our understanding of the task. We show exceptional results on the challenging ScanNetV2 benchmark, improving over the previous 3DETR in $\rm{AP}_{25}$/$\rm{AP}_{50}$ from 65.0\%/47.0\% to 77.8\%/66.0\%, respectively. Our method also sets a new record on the ScanNetV2 and SUN RGB-D datasets. Code will be released at http://github.com/yichaoshen-MS/V-DETR.
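The idea of encoding each point relative to the vertices of a predicted box can be illustrated with a minimal NumPy sketch. This is not the released implementation: it assumes axis-aligned boxes, a signed-log squashing of the offsets, and a single linear map per vertex as a stand-in for whatever learned network the method uses. The names `box_vertices`, `signed_log`, and `vertex_relative_bias` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def box_vertices(centers, sizes):
    """Corners of axis-aligned boxes (assumed box format).

    centers, sizes: (Q, 3) arrays. Returns a (Q, 8, 3) array of vertices.
    """
    signs = np.array(
        [[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
        dtype=np.float64,
    )  # (8, 3): one sign pattern per corner
    return centers[:, None, :] + 0.5 * signs[None, :, :] * sizes[:, None, :]


def signed_log(x):
    """Assumed squashing: keeps the sign, compresses large offsets."""
    return np.sign(x) * np.log1p(np.abs(x))


def vertex_relative_bias(points, centers, sizes, weights):
    """Per-query, per-point, per-head bias from vertex-relative offsets.

    points:  (N, 3) point coordinates.
    centers: (Q, 3) predicted box centers, one box per query.
    sizes:   (Q, 3) predicted box sizes.
    weights: (8, 3, H) one linear map per vertex (stand-in for a learned net).
    Returns a (Q, N, H) array that could be added to cross-attention logits.
    """
    verts = box_vertices(centers, sizes)                        # (Q, 8, 3)
    offsets = points[None, None, :, :] - verts[:, :, None, :]   # (Q, 8, N, 3)
    feats = signed_log(offsets)
    # Apply each vertex's linear map, then sum the 8 vertex contributions.
    return np.einsum("qvnc,vch->qnh", feats, weights)
```

Because the offsets are recomputed from the current box predictions, the bias would change from one decoder layer to the next as the boxes are refined, which is what lets it steer attention toward points near each predicted object.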

Code

yichaoshen-ms/v-detr official

Tasks

3D Object Detection

object-detection

Object Detection

Position

Datasets

ScanNet

SUN RGB-D

Results from the Paper

Ranked #2 on 3D Object Detection on ScanNetV2

Methods

Absolute Position Encodings • Adam • BPE • Convolution • Dense Connections • Detr • Dropout • fail • Feedforward Network • Focus • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
