Monocular Depth Estimation

339 papers with code • 17 benchmarks • 26 datasets

Monocular Depth Estimation is the task of estimating the depth value (distance relative to the camera) of each pixel given a single (monocular) RGB image. This challenging task is a key prerequisite for scene understanding in applications such as 3D scene reconstruction, autonomous driving, and AR. State-of-the-art methods usually fall into one of two categories: designing a complex network powerful enough to directly regress the depth map, or discretizing the depth range into bins or processing the input in windows to reduce computational complexity. The most popular benchmarks are the KITTI and NYUv2 datasets. Models are typically evaluated using RMSE or absolute relative error (a minimal sketch of both metrics is shown below).

Source: Defocus Deblurring Using Dual-Pixel Data
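
Both headline metrics compare predicted and ground-truth depth per pixel. The NumPy sketch below is a minimal illustration, assuming depth maps of the same shape and units and that zero marks pixels without ground truth; benchmark-specific details such as depth capping (e.g. 80 m on KITTI) or median scaling for self-supervised models are omitted.

import numpy as np

def abs_rel(pred, gt):
    # Mean absolute relative error: mean(|pred - gt| / gt) over valid pixels.
    mask = gt > 0  # assume zero marks pixels without ground truth
    return float(np.mean(np.abs(pred[mask] - gt[mask]) / gt[mask]))

def rmse(pred, gt):
    # Root mean squared error, in the depth units of the dataset (e.g. metres).
    mask = gt > 0
    return float(np.sqrt(np.mean((pred[mask] - gt[mask]) ** 2)))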

METER: a mobile vision transformer architecture for monocular depth estimation

lorenzopapa5/meter 13 Mar 2024

State-of-the-art MDE models typically rely on vision transformer (ViT) architectures that are deep and complex, making them unsuitable for fast inference on devices with hardware constraints.

3 stars

Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving

junda24/afnet 12 Mar 2024

In this work, we propose a new robustness benchmark to evaluate the depth estimation system under various noisy pose settings.

61 stars

D4D: An RGBD diffusion model to boost monocular depth estimation

lorenzopapa5/diffusion4d 12 Mar 2024

Ground-truth RGBD data are fundamental for a wide range of computer vision applications; however, such labeled samples are difficult to collect and time-consuming to produce.

0 stars

Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation

hitcslj/ssd 8 Mar 2024

This paper introduces a novel approach named Stealing Stable Diffusion (SSD) prior for robust monocular depth estimation.

20 stars

Scalable Vision-Based 3D Object Detection and Monocular Depth Estimation for Autonomous Driving

owen-liuyuxuan/visionfactory 4 Mar 2024

Collectively, these contributions lay a robust foundation for the widespread adoption of vision-based 3D perception technologies in autonomous driving applications.

26 stars

Kick Back & Relax++: Scaling Beyond Ground-Truth Depth with SlowTV & CribsTV

jspenmar/slowtv_monodepth 3 Mar 2024

Self-supervised learning is the key to unlocking generic computer vision systems.

81 stars

TIE-KD: Teacher-Independent and Explainable Knowledge Distillation for Monocular Depth Estimation

hpc-lab-koreatech/tie-kd 22 Feb 2024

Monocular depth estimation (MDE) is essential for numerous applications yet is impeded by the substantial computational demands of accurate deep learning models.

3 stars

Endo-4DGS: Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting

hustvl/4DGaussians 29 Jan 2024

In the realm of robot-assisted minimally invasive surgery, dynamic scene reconstruction can significantly enhance downstream tasks and improve surgical outcomes.

1,665 stars

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

LiheYoung/Depth-Anything 19 Jan 2024

To this end, we scale up the dataset by designing a data engine to collect and automatically annotate large-scale unlabeled data (~62M images), which significantly enlarges the data coverage and thus reduces the generalization error.

5,668 stars
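
As a usage illustration, Depth Anything checkpoints can be run through the Hugging Face Transformers "depth-estimation" pipeline; the checkpoint id and input filename below are assumptions made for this sketch, so check the repository for the officially released weights.

from PIL import Image
from transformers import pipeline

# Checkpoint id is an assumption; see LiheYoung/Depth-Anything for released weights.
pipe = pipeline("depth-estimation", model="LiheYoung/depth-anything-small-hf")

result = pipe(Image.open("example.jpg"))
result["depth"].save("example_depth.png")  # PIL image of the predicted depth map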

A Study on Self-Supervised Pretraining for Vision Problems in Gastrointestinal Endoscopy

esandml/ssl4gie 11 Jan 2024

In this work, we study the fine-tuned performance of models with ResNet50 and ViT-B backbones pretrained in self-supervised and supervised manners with ImageNet-1k and Hyperkvasir-unlabelled (self-supervised only) in a range of GIE vision tasks.

1 star