Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Monocular Depth Estimation	NYU-Depth V2	ZoeD-M12-N	RMSE	0.270	#13
Monocular Depth Estimation	NYU-Depth V2	ZoeD-M12-N	absolute relative error	0.075	#15
Monocular Depth Estimation	NYU-Depth V2	ZoeD-M12-N	Delta < 1.25	0.955	#13
Monocular Depth Estimation	NYU-Depth V2	ZoeD-M12-N	Delta < 1.25^2	0.995	#11
Monocular Depth Estimation	NYU-Depth V2	ZoeD-M12-N	Delta < 1.25^3	0.999	#4
Monocular Depth Estimation	NYU-Depth V2	ZoeD-M12-N	log 10	0.032	#14
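
The metric names in the table above follow the usual definitions for monocular depth evaluation. As a rough illustration, the sketch below computes them for paired lists of predicted and ground-truth depths using only the Python standard library. The function name `depth_metrics` and the dictionary keys are hypothetical and chosen for this example. Real benchmark protocols also mask invalid pixels and restrict the evaluated depth range, which this sketch leaves out.

```python
import math


def depth_metrics(pred, gt):
    """Compute common depth-estimation metrics for paired positive depths.

    `pred` and `gt` are equal-length sequences of depths in the same unit.
    This is an illustrative helper, not the benchmark's evaluation code.
    """
    if len(pred) != len(gt) or len(gt) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")

    n = len(gt)
    sq_err = 0.0       # accumulates squared error for RMSE
    abs_rel = 0.0      # accumulates |pred - gt| / gt
    log10_err = 0.0    # accumulates |log10(pred) - log10(gt)|
    hits = [0, 0, 0]   # pixel counts under thresholds 1.25, 1.25^2, 1.25^3

    for p, g in zip(pred, gt):
        if p <= 0 or g <= 0:
            raise ValueError("depths must be strictly positive")
        sq_err += (p - g) ** 2
        abs_rel += abs(p - g) / g
        log10_err += abs(math.log10(p) - math.log10(g))
        ratio = max(p / g, g / p)
        for k in range(3):
            if ratio < 1.25 ** (k + 1):
                hits[k] += 1

    return {
        "rmse": math.sqrt(sq_err / n),
        "abs_rel": abs_rel / n,
        "log10": log10_err / n,
        "delta1": hits[0] / n,
        "delta2": hits[1] / n,
        "delta3": hits[2] / n,
    }
```

Lower is better for RMSE, absolute relative error, and log 10. Higher is better for the three Delta accuracies, which report the fraction of pixels whose prediction-to-ground-truth ratio stays under the threshold.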

ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

23 Feb 2023 · Shariq Farooq Bhat, Reiner Birkl, Diana Wofk, Peter Wonka, Matthias Müller

This paper tackles the problem of depth estimation from a single image. Existing work either focuses on generalization performance disregarding metric scale, i.e. relative depth estimation, or state-of-the-art results on specific datasets, i.e. metric depth estimation. We propose the first approach that combines both worlds, leading to a model with excellent generalization performance while maintaining metric scale. Our flagship model, ZoeD-M12-NK, is pre-trained on 12 datasets using relative depth and fine-tuned on two datasets using metric depth. We use a lightweight head with a novel bin adjustment design called metric bins module for each domain. During inference, each input image is automatically routed to the appropriate head using a latent classifier. Our framework admits multiple configurations depending on the datasets used for relative depth pre-training and metric fine-tuning. Without pre-training, we can already significantly improve the state of the art (SOTA) on the NYU Depth v2 indoor dataset. Pre-training on twelve datasets and fine-tuning on the NYU Depth v2 indoor dataset, we can further improve SOTA for a total of 21% in terms of relative absolute error (REL). Finally, ZoeD-M12-NK is the first model that can jointly train on multiple datasets (NYU Depth v2 and KITTI) without a significant drop in performance and achieve unprecedented zero-shot generalization performance to eight unseen datasets from both indoor and outdoor domains. The code and pre-trained models are publicly available at https://github.com/isl-org/ZoeDepth.
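
The routing step described in the abstract (a classifier over a latent representation selects which domain-specific head handles the image) can be pictured with a toy sketch. Everything below is an assumption made for illustration: the names `softmax`, `route`, and `heads`, the two domain labels, and the stand-in heads that merely scale features by an illustrative maximum depth. It does not reproduce the ZoeDepth implementation or its metric bins module.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(domain_logits, heads):
    """Return (domain, head) for the most probable domain.

    `domain_logits` maps a domain name to a classifier logit, and `heads`
    maps the same names to callables. Both are hypothetical stand-ins.
    """
    names = list(heads)
    probs = softmax([domain_logits[name] for name in names])
    best = max(range(len(names)), key=probs.__getitem__)
    return names[best], heads[names[best]]


# Stand-in heads: scale normalized features by an illustrative depth range.
heads = {
    "indoor": lambda feats: [f * 10.0 for f in feats],
    "outdoor": lambda feats: [f * 80.0 for f in feats],
}

domain, head = route({"indoor": 2.0, "outdoor": -1.0}, heads)
depths = head([0.5, 0.25])
```

In this sketch only the selected head is evaluated, which matches the idea of sending each image to a single head at inference time.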

Code

Repository	Stars	Quickstart
isl-org/ZoeDepth (official)	1,964	Colab, Spaces
isl-org/MiDaS	4,099	Spaces
intel-isl/MiDaS	4,098	Spaces

Tasks

Depth Estimation

Monocular Depth Estimation

Zero-shot Generalization

Datasets

KITTI

NYUv2

SUN RGB-D

Hypersim

DIODE

DDAD

Virtual KITTI 2

IBims-1

DIML/CVl RGB-D Dataset

Results from the Paper

Ranked #13 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)

Methods

Absolute Position Encodings • Adam • AdaptiveBins • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
