Monocular Depth Estimation

338 papers with code • 18 benchmarks • 26 datasets

Monocular Depth Estimation is the task of estimating the depth value (distance relative to the camera) of each pixel given a single (monocular) RGB image. This challenging task is a key prerequisite for scene understanding in applications such as 3D scene reconstruction, autonomous driving, and augmented reality (AR). State-of-the-art methods usually fall into one of two categories: those that design a complex network powerful enough to regress the depth map directly, and those that split the input into bins or windows to reduce computational complexity. The most popular benchmarks are the KITTI and NYUv2 datasets. Models are typically evaluated using root mean squared error (RMSE) or absolute relative error.

Source: Defocus Deblurring Using Dual-Pixel Data
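
The two evaluation metrics named in the description can be sketched as follows. This is a minimal illustration, not the evaluation code of any particular benchmark: the function name `depth_metrics`, the `min_depth` threshold, and the convention that ground-truth pixels at or below that threshold are invalid and skipped are all assumptions made for the example.

```python
import numpy as np


def depth_metrics(pred, gt, min_depth=1e-3):
    """Return RMSE and absolute relative error over valid pixels.

    Assumption: a ground-truth pixel is valid only if its depth exceeds
    `min_depth`; sparse depth maps often mark missing pixels with 0.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")

    valid = gt > min_depth
    if not valid.any():
        raise ValueError("no valid ground-truth pixels")

    diff = pred[valid] - gt[valid]
    # RMSE: square root of the mean squared depth error.
    rmse = float(np.sqrt(np.mean(diff ** 2)))
    # Absolute relative error: mean of |pred - gt| / gt.
    abs_rel = float(np.mean(np.abs(diff) / gt[valid]))
    return {"rmse": rmse, "abs_rel": abs_rel}


# Toy 2x2 depth maps in metres; the pixel with gt == 0 is treated as missing.
gt = np.array([[2.0, 4.0], [0.0, 10.0]])
pred = np.array([[2.2, 3.6], [5.0, 10.0]])
print(depth_metrics(pred, gt))
```

RMSE is expressed in the same units as the depth map and weights large errors heavily, while absolute relative error normalises each error by the true depth, so a 0.2 m error at 2 m counts the same as a 1 m error at 10 m.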

Benchmarks

These leaderboards are used to track progress in Monocular Depth Estimation.

Dataset	Best Model
NYU-Depth V2	UniDepth (Zero-shot)
KITTI Eigen split	LightedDepth (Video Method)
KITTI Eigen split unsupervised	SQLdepth (ConvNeXt-L)
NYU-Depth V2 self-supervised	IndoorDepth
Mid-Air Dataset	M4Depth+U
Make3D	GCNDepth
IBims-1	LeReS
DDAD	AFNet
VA (Virtual Apartment)	DistDepth
Middlebury 2014	Miangoleh et al. (MiDaS)
KITTI	MonoViT
SUN-RGBD	RPSF
Cityscapes	SwinMTL
UASOL	FCRN-DepthPrediction from Iro Laina et al. (2016)
KITTI Object Tracking Evaluation 2012	PackNet-SfM
Matterport3D	NeWCRFs
Cityscapes 3D	TaskPrompter

Libraries

Use these libraries to find Monocular Depth Estimation models and implementations.

huggingface/transformers • 3 papers • 124,593
SeokjuLee/Insta-DM • 3 papers • 220
ShuweiShao/NDDepth • 3 papers

Latest papers

Virtually Enriched NYU Depth V2 Dataset for Monocular Depth Estimation: Do We Need Artificial Augmentation?

abrain-one/anyu • 15 Apr 2024

We present ANYU, a new virtually augmented version of the NYU depth v2 dataset, designed for monocular depth estimation.

On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation

agneet42/robustness_depth_lang • 12 Apr 2024

Recent advances in monocular depth estimation have been made by incorporating natural language as additional guidance.

RoadBEV: Road Surface Reconstruction in Bird's Eye View

ztsrxh/roadbev • 9 Apr 2024

This paper uniformly proposes two simple yet effective models for road elevation reconstruction in BEV named RoadBEV-mono and RoadBEV-stereo, which estimate road elevation with monocular and stereo images, respectively.

WorDepth: Variational Language Prior for Monocular Depth Estimation

adonis-galaxy/wordepth • 4 Apr 2024

To test this, we focus on monocular depth estimation, the problem of predicting a dense depth map from a single image, but with an additional text caption describing the scene.

VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection

skmhrk1209/VSRD • 2 Apr 2024

In the auto-labeling stage, we represent the surface of each instance as a signed distance field (SDF) and render its silhouette as an instance mask through our proposed instance-aware volumetric silhouette rendering.

UniDepth: Universal Monocular Metric Depth Estimation

lpiccinelli-eth/unidepth • 296 • 27 Mar 2024

However, the remarkable accuracy of recent MMDE methods is confined to their training domains.

ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation

aradhye2002/ecodepth • 27 Mar 2024

We argue that the embedding vector from a ViT model, pre-trained on a large dataset, captures greater relevant information for SIDE than the usual route of generating pseudo image captions, followed by CLIP based text embeddings.

Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving

gandolfczjh/3d2fool • 26 Mar 2024

Deep learning-based monocular depth estimation (MDE), extensively applied in autonomous driving, is known to be vulnerable to adversarial attacks.

SwinMTL: A Shared Architecture for Simultaneous Depth Estimation and Semantic Segmentation from Monocular Camera Images

pardistaghavi/swinmtl • 15 Mar 2024

This research paper presents an innovative multi-task learning framework that allows concurrent depth estimation and semantic segmentation using a single camera.

METER: a mobile vision transformer architecture for monocular depth estimation

lorenzopapa5/meter • 13 Mar 2024

State of the art MDE models typically rely on vision transformers (ViT) architectures that are highly deep and complex, making them unsuitable for fast inference on devices with hardware constraints.
