Monocular Depth Estimation

338 papers with code • 18 benchmarks • 26 datasets

Monocular Depth Estimation is the task of estimating the depth value (distance from the camera) of each pixel given a single (monocular) RGB image. This challenging task is a key prerequisite for scene understanding in applications such as 3D scene reconstruction, autonomous driving, and augmented reality. State-of-the-art methods usually fall into one of two categories: designing a network powerful enough to regress the depth map directly, or discretizing depth into bins or splitting the input into windows to make the prediction problem more tractable. The most popular benchmarks are the KITTI and NYUv2 datasets. Models are typically evaluated using root mean squared error (RMSE) or absolute relative error.

Source: Defocus Deblurring Using Dual-Pixel Data
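
The description above names RMSE and absolute relative error as the standard evaluation metrics. The sketch below shows how they are typically computed, masking out invalid ground-truth pixels as is common practice on KITTI and NYUv2; the function name, the default depth range, and the inclusion of the widely reported delta threshold accuracy are illustrative choices, not taken from any particular benchmark toolkit.

```python
import numpy as np

def depth_metrics(pred, gt, min_depth=1e-3, max_depth=80.0):
    """Standard monocular depth metrics on valid ground-truth pixels.

    pred, gt: depth maps in metres with identical shapes. Ground-truth
    pixels outside [min_depth, max_depth] are treated as invalid; the
    80 m cap is a typical KITTI choice and is an assumption here.
    """
    valid = (gt > min_depth) & (gt < max_depth)
    pred, gt = pred[valid], gt[valid]

    rmse = np.sqrt(np.mean((pred - gt) ** 2))   # root mean squared error
    abs_rel = np.mean(np.abs(pred - gt) / gt)   # absolute relative error

    # delta accuracy: fraction of pixels whose ratio to ground truth is below 1.25
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)

    return {"rmse": rmse, "abs_rel": abs_rel, "delta1": delta1}
```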

Virtually Enriched NYU Depth V2 Dataset for Monocular Depth Estimation: Do We Need Artificial Augmentation?

abrain-one/anyu 15 Apr 2024

We present ANYU, a new virtually augmented version of the NYU depth v2 dataset, designed for monocular depth estimation.

On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation

agneet42/robustness_depth_lang 12 Apr 2024

Recent advances in monocular depth estimation have been made by incorporating natural language as additional guidance.

RoadBEV: Road Surface Reconstruction in Bird's Eye View

ztsrxh/roadbev 9 Apr 2024

This paper proposes two simple yet effective models for road elevation reconstruction in BEV, named RoadBEV-mono and RoadBEV-stereo, which estimate road elevation from monocular and stereo images, respectively.

WorDepth: Variational Language Prior for Monocular Depth Estimation

adonis-galaxy/wordepth 4 Apr 2024

To test this, we focus on monocular depth estimation, the problem of predicting a dense depth map from a single image, but with an additional text caption describing the scene.
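
As a rough illustration of what "depth from a single image plus a text caption" can look like in code, here is a toy network that modulates image features with a pre-computed caption embedding via a FiLM-style scale and shift. This is a hypothetical sketch under my own assumptions, not WorDepth's variational language prior; the class name, dimensions, and fusion scheme are all illustrative.

```python
import torch
import torch.nn as nn

class TextConditionedDepth(nn.Module):
    """Toy depth network conditioned on a pre-computed caption embedding."""

    def __init__(self, text_dim=512, feat_dim=64):
        super().__init__()
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # project the caption embedding to per-channel scale and shift
        self.film = nn.Linear(text_dim, 2 * feat_dim)
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(feat_dim, 1, 3, padding=1), nn.Softplus(),  # positive depth
        )

    def forward(self, image, text_emb):
        feats = self.image_encoder(image)                     # (B, C, H/4, W/4)
        scale, shift = self.film(text_emb).chunk(2, dim=-1)   # (B, C) each
        feats = feats * (1 + scale[..., None, None]) + shift[..., None, None]
        return self.decoder(feats)                            # (B, 1, H, W)

# text_emb would come from a frozen text encoder such as CLIP in practice
depth = TextConditionedDepth()(torch.rand(1, 3, 64, 64), torch.rand(1, 512))
```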

VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection

skmhrk1209/VSRD 2 Apr 2024

In the auto-labeling stage, we represent the surface of each instance as a signed distance field (SDF) and render its silhouette as an instance mask through our proposed instance-aware volumetric silhouette rendering.
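
To make the idea of rendering a silhouette from a signed distance field concrete, the toy sketch below samples an SDF along camera rays and marks a pixel as inside the silhouette when the minimum SDF along its ray dips below zero, relaxed with a sigmoid so the mask is differentiable. This is a generic illustration under simplified assumptions (orthographic camera, analytic sphere SDF), not VSRD's instance-aware volumetric renderer.

```python
import torch

def sphere_sdf(points, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return torch.linalg.norm(points - center, dim=-1) - radius

def render_silhouette(sdf_fn, H=64, W=64, n_samples=64, tau=0.02):
    """Soft silhouette mask from an SDF via ray sampling (orthographic toy camera)."""
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    zs = torch.linspace(-1, 1, n_samples)
    # sample points along each pixel's ray: (H, W, n_samples, 3)
    pts = torch.stack(
        [xs[..., None].expand(H, W, n_samples),
         ys[..., None].expand(H, W, n_samples),
         zs.expand(H, W, n_samples)], dim=-1)
    sdf = sdf_fn(pts)                     # (H, W, n_samples)
    min_sdf = sdf.min(dim=-1).values      # closest approach of each ray to the surface
    return torch.sigmoid(-min_sdf / tau)  # ~1 inside the silhouette, ~0 outside

mask = render_silhouette(lambda p: sphere_sdf(p, torch.tensor([0.0, 0.0, 0.0]), 0.5))
```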

UniDepth: Universal Monocular Metric Depth Estimation

lpiccinelli-eth/unidepth 27 Mar 2024

However, the remarkable accuracy of recent monocular metric depth estimation (MMDE) methods is confined to their training domains.

ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation

aradhye2002/ecodepth 27 Mar 2024

We argue that the embedding vector from a ViT model pre-trained on a large dataset captures more relevant information for single-image depth estimation (SIDE) than the usual route of generating pseudo image captions followed by CLIP-based text embeddings.
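
The sentence above contrasts conditioning on a pre-trained ViT's embedding with conditioning on CLIP text embeddings of pseudo captions. The sketch below only shows the first half of that idea, extracting a global image embedding from a torchvision ViT and projecting it to a context vector that a depth decoder or diffusion denoiser could attend to; the projection dimension and how the context is consumed downstream are assumptions, not ECoDepth's implementation.

```python
import torch
import torch.nn as nn
from torchvision.models import vit_b_16  # ViT_B_16_Weights for pretrained weights

# In practice you would load ImageNet-pretrained weights, e.g.
#   vit = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
# weights=None keeps this sketch download-free.
vit = vit_b_16(weights=None)
vit.heads = nn.Identity()        # forward() now returns the 768-d class-token embedding
vit.eval()

image = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    embedding = vit(image)       # (1, 768) global image embedding

# Hypothetical projection to the context dimension a conditional depth
# decoder / denoiser might expect (e.g. as a cross-attention context token).
to_context = nn.Linear(768, 1024)
context = to_context(embedding).unsqueeze(1)   # (1, 1, 1024)
```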

Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving

gandolfczjh/3d2fool 26 Mar 2024

Deep learning-based monocular depth estimation (MDE), extensively applied in autonomous driving, is known to be vulnerable to adversarial attacks.

SwinMTL: A Shared Architecture for Simultaneous Depth Estimation and Semantic Segmentation from Monocular Camera Images

pardistaghavi/swinmtl 15 Mar 2024

This research paper presents an innovative multi-task learning framework that allows concurrent depth estimation and semantic segmentation using a single camera.
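
The core pattern here, one shared encoder feeding separate depth and segmentation decoders trained jointly, can be sketched in a few lines. The toy model below uses a small CNN backbone to stay self-contained (SwinMTL itself builds on a Swin transformer); the head designs, class count, and loss weighting are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SharedEncoderMTL(nn.Module):
    """Toy multi-task network: one shared backbone, one head per task."""

    def __init__(self, num_classes=19, feat=64):
        super().__init__()
        self.encoder = nn.Sequential(          # shared feature extractor
            nn.Conv2d(3, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
        )

        def head(out_channels):                # simple upsampling decoder
            return nn.Sequential(
                nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
                nn.Conv2d(feat, out_channels, 3, padding=1),
            )

        self.depth_head = nn.Sequential(head(1), nn.Softplus())  # positive depth
        self.seg_head = head(num_classes)                         # class logits

    def forward(self, x):
        f = self.encoder(x)                    # backbone features computed once
        return self.depth_head(f), self.seg_head(f)

depth, seg = SharedEncoderMTL()(torch.rand(2, 3, 128, 128))
# Training would minimise a weighted sum of a depth regression loss and a
# segmentation cross-entropy loss, e.g. L = L1(depth) + lambda_seg * CE(seg).
```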

METER: a mobile vision transformer architecture for monocular depth estimation

lorenzopapa5/meter 13 Mar 2024

State-of-the-art MDE models typically rely on vision transformer (ViT) architectures that are deep and computationally complex, making them unsuitable for fast inference on hardware-constrained devices.
