Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image

Reconstructing accurate 3D scenes from images is a long-standing vision task. Due to the ill-posedness of the single-image reconstruction problem, most well-established methods are built upon multi-view geometry. State-of-the-art (SOTA) monocular metric depth estimation methods can only handle a single camera model and are unable to perform mixed-data training due to the metric ambiguity. Meanwhile, SOTA monocular methods trained on large mixed datasets achieve zero-shot generalization by learning affine-invariant depths, which cannot recover real-world metrics. In this work, we show that the key to a zero-shot single-view metric depth model lies in the combination of large-scale data training and resolving the metric ambiguity from various camera models. We propose a canonical camera space transformation module, which explicitly addresses the ambiguity problems and can be effortlessly plugged into existing monocular models. Equipped with our module, monocular models can be stably trained with over 8 million images with thousands of camera models, resulting in zero-shot generalization to in-the-wild images with unseen camera settings. Experiments demonstrate SOTA performance of our method on 7 zero-shot benchmarks. Notably, our method won the championship in the 2nd Monocular Depth Estimation Challenge. Our method enables the accurate recovery of metric 3D structures on randomly collected internet images, paving the way for plausible single-image metrology. The potential benefits extend to downstream tasks, which can be significantly improved by simply plugging in our model. For example, our model relieves the scale drift issues of monocular-SLAM (Fig. 1), leading to high-quality metric scale dense mapping. The code is available at https://github.com/YvanYin/Metric3D.
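The canonical camera space transformation can be illustrated with a short sketch. The snippet below is a minimal illustration of the label-side idea, not the authors' implementation: the canonical focal length value, function names, and the toy example are assumptions. Ground-truth metric depths are rescaled by the ratio of a canonical focal length to the actual focal length so that labels from different cameras become consistent for training, and predictions are mapped back with the inverse ratio at inference.

```python
import numpy as np

CANONICAL_FOCAL = 1000.0  # assumed canonical focal length in pixels (placeholder value)

def to_canonical_depth(gt_depth: np.ndarray, focal: float) -> np.ndarray:
    """Rescale metric ground-truth depth into the canonical camera space.

    Scaling depth by f_canonical / f makes training labels consistent
    across cameras with different intrinsics.
    """
    return gt_depth * (CANONICAL_FOCAL / focal)

def from_canonical_depth(pred_depth: np.ndarray, focal: float) -> np.ndarray:
    """Map a depth prediction made in canonical space back to the metric
    space of the camera that actually captured the image."""
    return pred_depth * (focal / CANONICAL_FOCAL)

# Toy example: the same 10 m scene seen through two different focal lengths
# gets consistent canonical labels, and each prediction is de-canonicalized
# with its own camera's focal length to recover metric depth.
if __name__ == "__main__":
    gt = np.full((4, 4), 10.0)  # ground-truth depth of 10 m everywhere
    for f in (500.0, 1500.0):
        canon = to_canonical_depth(gt, f)
        recovered = from_canonical_depth(canon, f)
        print(f, canon[0, 0], recovered[0, 0])
```

The key design point the abstract emphasizes is that this transformation is model-agnostic: it acts on the data (labels or images) rather than the network, which is why it can be plugged into existing monocular depth models and enables stable training on mixed datasets with thousands of camera models.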

ICCV 2023

Results from the Paper


Ranked #19 in Monocular Depth Estimation on NYU-Depth V2 (using extra training data)

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Monocular Depth Estimation | KITTI Eigen split | Metric3D (zero-shot) | absolute relative error | 0.058 | #29 |
| | | | RMSE | 2.77 | #35 |
| | | | Delta < 1.25 | 0.967 | #26 |
| | | | Delta < 1.25^2 | 0.995 | #28 |
| | | | Delta < 1.25^3 | 0.999 | #11 |
| Monocular Depth Estimation | NYU-Depth V2 | Metric3D (ConvNeXt-Large, zero-shot testing) | RMSE | 0.310 | #19 |
| | | | absolute relative error | 0.083 | #18 |
| | | | Delta < 1.25 | 0.944 | #19 |
| | | | Delta < 1.25^2 | 0.986 | #34 |
| | | | Delta < 1.25^3 | 0.995 | #41 |
| | | | log10 | 0.035 | #18 |
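For context, the metric names in the table follow the standard monocular depth evaluation protocol. The sketch below is a minimal NumPy illustration of how these quantities are conventionally computed over pixels with valid ground truth; it is not the benchmarks' official evaluation code.

```python
import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
    """Standard monocular depth metrics over pixels with valid ground truth."""
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]

    abs_rel = np.mean(np.abs(pred - gt) / gt)              # absolute relative error
    rmse = np.sqrt(np.mean((pred - gt) ** 2))              # root mean squared error
    log10 = np.mean(np.abs(np.log10(pred) - np.log10(gt))) # mean log10 error

    ratio = np.maximum(pred / gt, gt / pred)               # per-pixel max ratio
    delta1 = np.mean(ratio < 1.25)                         # Delta < 1.25
    delta2 = np.mean(ratio < 1.25 ** 2)                    # Delta < 1.25^2
    delta3 = np.mean(ratio < 1.25 ** 3)                    # Delta < 1.25^3

    return {"abs_rel": abs_rel, "rmse": rmse, "log10": log10,
            "delta1": delta1, "delta2": delta2, "delta3": delta3}
```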

Methods

No methods listed for this paper.