Monocular Depth Estimation

339 papers with code • 17 benchmarks • 27 datasets

Monocular Depth Estimation is the task of estimating the depth value (distance relative to the camera) of each pixel given a single (monocular) RGB image. This challenging task is a key prerequisite for scene understanding in applications such as 3D scene reconstruction, autonomous driving, and AR. State-of-the-art methods usually fall into one of two categories: designing a complex network powerful enough to directly regress the depth map, or splitting the input into bins or windows to reduce computational complexity. The most popular benchmarks are the KITTI and NYUv2 datasets. Models are typically evaluated using RMSE or absolute relative error (AbsRel); a minimal metric sketch follows below.

Source: Defocus Deblurring Using Dual-Pixel Data
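As a concrete reference for the evaluation protocol above, here is a minimal sketch of the two standard metrics, RMSE and AbsRel, computed over valid ground-truth pixels. The zero-means-invalid mask convention and the array shapes are assumptions for illustration, not the evaluation code of any particular benchmark.

```python
import numpy as np

def depth_metrics(pred, gt, eps=1e-6):
    """RMSE and absolute relative error over valid ground-truth pixels.

    pred, gt: depth maps in metres, same shape.
    Assumption: gt <= eps marks pixels without ground truth.
    """
    valid = gt > eps
    pred, gt = pred[valid], gt[valid]
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    return rmse, abs_rel

# Demo on synthetic data standing in for a prediction and a sensor depth map.
gt = np.random.uniform(0.5, 10.0, size=(480, 640))
pred = gt + np.random.normal(0.0, 0.1, size=gt.shape)
rmse, abs_rel = depth_metrics(pred, gt)
print(f"RMSE: {rmse:.3f} m, AbsRel: {abs_rel:.3f}")
```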

Latest papers with no code

FutureDepth: Learning to Predict the Future Improves Video Depth Estimation

no code yet • 19 Mar 2024

In this paper, we propose a novel video depth estimation approach, FutureDepth, which enables the model to implicitly leverage multi-frame and motion cues to improve depth estimation by learning to predict the future during training.

SSAP: A Shape-Sensitive Adversarial Patch for Comprehensive Disruption of Monocular Depth Estimation in Autonomous Navigation Applications

no code yet • 18 Mar 2024

In this paper, we introduce SSAP (Shape-Sensitive Adversarial Patch), a novel approach designed to comprehensively disrupt monocular depth estimation (MDE) in autonomous navigation applications.

Touch-GS: Visual-Tactile Supervised 3D Gaussian Splatting

no code yet • 14 Mar 2024

Optical tactile sensors are now widely used in robotics for manipulation and object representation; however, raw optical tactile sensor data is unsuitable for directly supervising a 3D Gaussian Splatting (3DGS) scene.

DD-VNB: A Depth-based Dual-Loop Framework for Real-time Visually Navigated Bronchoscopy

no code yet • 4 Mar 2024

Specifically, the relative pose changes are fed into the registration process as the initial guess to boost its accuracy and speed.

Pyramid Feature Attention Network for Monocular Depth Prediction

no code yet • 3 Mar 2024

Deep convolutional neural networks (DCNNs) have achieved great success in monocular depth estimation (MDE).
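To make the direct-regression recipe from the task description concrete, the sketch below is a deliberately tiny PyTorch encoder-decoder that regresses a dense depth map from an RGB image. The layer sizes, the sigmoid-times-max-depth output head, and the input resolution are illustrative assumptions, not the network from this or any other listed paper; real MDE models use pretrained backbones, skip connections, and multi-scale supervision.

```python
import torch
import torch.nn as nn

class TinyDepthNet(nn.Module):
    """Minimal encoder-decoder regressing a dense depth map (illustrative only)."""

    def __init__(self, max_depth=10.0):
        super().__init__()
        self.max_depth = max_depth
        self.encoder = nn.Sequential(  # two stride-2 convs: 1/4 resolution features
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(  # upsample back to full resolution
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, x):
        # Sigmoid bounds the output to (0, 1); scaling maps it to metric depth.
        return torch.sigmoid(self.decoder(self.encoder(x))) * self.max_depth

model = TinyDepthNet()
depth = model(torch.randn(1, 3, 192, 640))  # -> (1, 1, 192, 640) depth map
```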

PCDepth: Pattern-based Complementary Learning for Monocular Depth Estimation by Best of Both Worlds

no code yet • 29 Feb 2024

Existing complementary learning approaches for MDE therefore fuse intensity information from images with scene details from event data for better scene understanding.

Zero-BEV: Zero-shot Projection of Any First-Person Modality to BEV Maps

no code yet • 21 Feb 2024

Bird's-eye view (BEV) maps are an important geometrically structured representation widely used in robotics, in particular self-driving vehicles and terrestrial robots.
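For context, the classical geometric route from a first-person depth map to a BEV map is to unproject pixels into 3D using the camera intrinsics and bin the resulting points into a top-down grid; the paper above argues for a zero-shot alternative. Below is a minimal sketch of that geometric baseline, with the intrinsics, grid resolution, and camera axis convention (x right, y down, z forward) all assumed for illustration.

```python
import numpy as np

def depth_to_bev(depth, K, grid_res=0.1, grid_size=100):
    """Project a per-pixel metric depth map into a top-down BEV occupancy grid.

    depth: (H, W) depth in metres; K: 3x3 camera intrinsics.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    z = depth                              # forward distance
    x = (u - K[0, 2]) * z / K[0, 0]        # lateral offset in metres
    # Bin each point into a grid_size x grid_size cell grid in front of the camera.
    col = np.floor(x / grid_res).astype(int) + grid_size // 2
    row = np.floor(z / grid_res).astype(int)
    keep = (row >= 0) & (row < grid_size) & (col >= 0) & (col < grid_size) & (z > 0)
    bev = np.zeros((grid_size, grid_size), dtype=np.uint8)
    bev[row[keep], col[keep]] = 1          # mark occupied cells
    return bev

K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
bev = depth_to_bev(np.random.uniform(1.0, 9.0, (480, 640)), K)
```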

An Endoscopic Chisel: Intraoperative Imaging Carves 3D Anatomical Models

no code yet • 19 Feb 2024

Purpose: Preoperative imaging plays a pivotal role in sinus surgery, where CTs offer patient-specific insights into complex anatomy, enabling real-time intraoperative navigation to complement endoscopic imaging.

Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging Scenarios

no code yet • 19 Feb 2024

Monocular depth estimation from RGB images plays a pivotal role in 3D vision.

MAL: Motion-Aware Loss with Temporal and Distillation Hints for Self-Supervised Depth Estimation

no code yet • 18 Feb 2024

To address this issue, we present Motion-Aware Loss, which leverages the temporal relation among consecutive input frames and a novel distillation scheme between teacher and student networks in multi-frame self-supervised depth estimation methods.
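For background on the loss family such methods build on, here is a minimal sketch of the per-pixel minimum reprojection (photometric) loss common in multi-frame self-supervised depth training. The tensor shapes are assumptions, the depth-and-pose-based warping of source frames is omitted, and this is the generic baseline, not the proposed Motion-Aware Loss.

```python
import torch

def min_reprojection_loss(target, warped_sources):
    """Per-pixel minimum L1 photometric error.

    target: (B, 3, H, W) current frame.
    warped_sources: list of (B, 3, H, W) neighbouring frames already warped
    into the target view using predicted depth and pose (warping omitted here).
    """
    # L1 error per source frame, averaged over colour channels -> (S, B, H, W).
    errors = torch.stack(
        [(target - w).abs().mean(dim=1) for w in warped_sources], dim=0
    )
    # The per-pixel minimum partially discounts occlusions and moving objects,
    # the failure mode that motion-aware losses target explicitly.
    return errors.min(dim=0).values.mean()

t = torch.rand(2, 3, 192, 640)
loss = min_reprojection_loss(t, [torch.rand_like(t), torch.rand_like(t)])
```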