Visual Navigation

106 papers with code • 6 benchmarks • 16 datasets

Visual Navigation is the problem of navigating an agent, e.g. a mobile robot, in an environment using camera input only. The agent is given a target image (an image it will see from the target position), and its goal is to move from its current position to the target by applying a sequence of actions, based only on its camera observations.

Source: Vision-based Navigation Using Deep Reinforcement Learning
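In gym-style terms, image-goal navigation is an episodic loop: the agent receives the target image once at reset, then repeatedly maps its current camera frame to a discrete action until it decides to stop. A minimal sketch of that loop (the environment and policy interfaces here are hypothetical stand-ins, not any particular benchmark's API):

```python
ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

def run_episode(env, policy, max_steps=500):
    """Run one image-goal navigation episode.

    `env` and `policy` are hypothetical stand-ins: `env.reset()` returns the
    current camera frame plus the target image, and the agent acts from
    camera observations only -- no pose, map, or GPS is exposed.
    """
    obs, target_image = env.reset()          # RGB frames, e.g. HxWx3 arrays
    for _ in range(max_steps):
        action = policy(obs, target_image)   # pick one of ACTIONS
        if action == "stop":                 # agent declares it has arrived
            break
        obs = env.step(action)               # only a new camera frame comes back
    return env.reached_goal()                # hypothetical success check
```

Variants of the task replace the target image with coordinates (PointGoal), an object category (ObjectGoal), or a natural-language instruction (Vision-and-Language Navigation), the setting of several of the papers below.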

VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation

yanyuanqiao/vln-petl • ICCV 2023 • 20 Aug 2023 • ★ 5

Performance on Vision-and-Language Navigation (VLN) tasks has progressed rapidly in recent years, thanks to the use of large pre-trained vision-and-language models.

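Parameter-efficient transfer learning keeps the large pre-trained model frozen and trains only small injected modules. The sketch below shows a generic bottleneck adapter to illustrate the technique family; it is not VLN-PETL's actual architecture:

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project -> ReLU -> up-project, plus residual."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

# Stand-in for a pre-trained vision-and-language backbone, frozen entirely.
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True),
    num_layers=4,
)
for p in backbone.parameters():
    p.requires_grad = False

adapter = Adapter(dim=768)                    # only ~100k trainable parameters
x = torch.randn(2, 16, 768)                   # batch of token sequences
out = adapter(backbone(x))                    # gradients reach the adapter alone
```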

Language-enhanced RNR-Map: Querying Renderable Neural Radiance Field maps with natural language

intelligolabs/Le-RNR-Map • 17 Aug 2023 • ★ 14

We present Le-RNR-Map, a Language-enhanced Renderable Neural Radiance map for Visual Navigation with natural language query prompts.

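The query mechanism behind such a map can be pictured as matching a language embedding against per-cell embeddings of a 2D latent grid. A toy illustration with random stand-in features (in the real system both sides would come from a CLIP-style encoder; nothing here is Le-RNR-Map's actual representation or API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: a 64x64 map whose cells hold 512-d language-aligned features,
# and one 512-d embedding of a text query such as "sofa".
map_features = rng.normal(size=(64, 64, 512))
query = rng.normal(size=512)

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

sim = normalize(map_features) @ normalize(query)   # cosine similarity per cell
y, x = np.unravel_index(np.argmax(sim), sim.shape)
print(f"best match for the query at map cell ({y}, {x})")
```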

Scaling Data Generation in Vision-and-Language Navigation

wz0919/scalevln • ICCV 2023 • 28 Jul 2023 • ★ 136

Recent research in language-guided visual navigation has shown that training generalizable agents demands both diverse traversable environments and large quantities of supervision.

Learning Navigational Visual Representations with Semantic Map Supervision

yiconghong/ego2map-navit • ICCV 2023 • 23 Jul 2023 • ★ 24

Being able to perceive the semantics and the spatial structure of the environment is essential for visual navigation of a household robot.

Online Self-Supervised Thermal Water Segmentation for Aerial Vehicles

connorlee77/uav-thermal-water-segmentation • 18 Jul 2023 • ★ 12

We present a new method to adapt an RGB-trained water segmentation network to target-domain aerial thermal imagery via online self-supervision, leveraging texture and motion cues as supervisory signals.

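The general pattern of online self-supervised adaptation is to derive pseudo-labels from cues available at test time (here, texture and motion) and take gradient steps on the deployed network as frames stream in. A schematic sketch in which the cue function and network are placeholders, not the paper's implementation:

```python
import torch
import torch.nn.functional as F

def adapt_online(net, frames, cue_fn, lr=1e-4):
    """Fine-tune a segmentation net on streaming thermal frames.

    `cue_fn` is a hypothetical stand-in for the texture/motion heuristics:
    it returns a pseudo-label mask (1 = water) and a confidence mask.
    """
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for frame in frames:                          # frame: 1xCxHxW tensor
        pseudo, conf = cue_fn(frame)              # weak labels from texture/motion
        logits = net(frame)
        loss = (F.binary_cross_entropy_with_logits(
                    logits, pseudo, reduction="none") * conf).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()                                # the network adapts in flight
    return net
```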

The Drunkard's Odometry: Estimating Camera Motion in Deforming Scenes

UZ-SLAMLab/DrunkardsOdometry • 29 Jun 2023 • ★ 45

Estimating camera motion in deformable scenes poses a complex and open research challenge.

HabiCrowd: A High Performance Simulator for Crowd-Aware Visual Navigation

habicrowd/HabiCrowd • 20 Jun 2023 • ★ 20

Visual navigation, a foundational aspect of Embodied AI (E-AI), has been studied extensively in the past few years.

Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear

stanfordvl/sonicverse • 1 Jun 2023 • ★ 16

We introduce Sonicverse, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear.

NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models

gengzezhou/navgpt • 26 May 2023 • ★ 81

Trained on data of unprecedented scale, large language models (LLMs) such as ChatGPT and GPT-4 exhibit reasoning abilities that emerge with model scale.

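The premise can be illustrated by packing the instruction, action history, and textual descriptions of candidate viewpoints into a prompt and asking the LLM to reason before committing to a move. A schematic sketch around a hypothetical text-in/text-out `llm` callable, not NavGPT's actual prompts or interface:

```python
def choose_action(llm, instruction, history, candidates):
    """Ask an LLM to reason step by step, then commit to one navigation move."""
    options = "\n".join(f"{i}: {desc}" for i, desc in enumerate(candidates))
    prompt = (
        "You are navigating a building.\n"
        f"Instruction: {instruction}\n"
        f"Actions so far: {'; '.join(history) or 'none'}\n"
        f"Candidate viewpoints:\n{options}\n"
        "Think step by step, then answer 'FINAL: <index>' to move, "
        "or 'FINAL: stop' if the goal is reached."
    )
    reply = llm(prompt)                      # hypothetical LLM call
    return reply.rsplit("FINAL:", 1)[-1].strip()
```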

POPGym: Benchmarking Partially Observable Reinforcement Learning

proroklab/popgym • 3 Mar 2023 • ★ 145

Real world applications of Reinforcement Learning (RL) are often partially observable, thus requiring memory.

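The "requiring memory" point is concrete: under partial observability a single observation is not a sufficient state, so the policy must carry a recurrent hidden state across steps. A minimal sketch of such a policy in a generic gym-style setting (not POPGym-specific code):

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """GRU policy: the hidden state summarizes the observation history."""
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.gru = nn.GRUCell(obs_dim, hidden)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs, h):
        h = self.gru(obs, h)                  # fold the new observation into memory
        return self.head(h), h                # action logits + updated memory

policy = RecurrentPolicy(obs_dim=8, n_actions=4)
h = torch.zeros(1, 128)                       # reset memory at episode start
obs = torch.zeros(1, 8)
logits, h = policy(obs, h)                    # carry h forward to the next step
```

POPGym packages a suite of such memory-demanding environments, along with recurrent baselines, behind the standard gym interface.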