Visual Navigation

105 papers with code • 6 benchmarks • 16 datasets

Visual Navigation is the problem of navigating an agent, e.g. a mobile robot, in an environment using camera input only. The agent is given a target image (an image it will see from the target position), and its goal is to move from its current position to the target by applying a sequence of actions, based only on its camera observations.

Source: Vision-based Navigation Using Deep Reinforcement Learning
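As a rough illustration, the task reduces to a closed-loop control problem: observe a camera frame, compare it to the goal image, act, repeat. The sketch below is a minimal, hypothetical rendering of that loop; the `env` interface and `GoalReachingPolicy` are illustrative stand-ins, not any particular library's API.

```python
import numpy as np

ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

class GoalReachingPolicy:
    """Picks actions by comparing the current view to the goal image."""

    def act(self, observation: np.ndarray, goal_image: np.ndarray) -> str:
        # A learned policy would embed both images and score each action;
        # this placeholder stops once the views are nearly identical and
        # explores randomly otherwise.
        diff = np.mean(np.abs(observation.astype(float) - goal_image.astype(float)))
        if diff < 5.0:
            return "stop"
        return str(np.random.choice(ACTIONS[:-1]))

def navigate(env, policy, goal_image, max_steps=500):
    """Run the closed loop: observe a camera frame, act, repeat until 'stop'."""
    observation = env.reset()
    for _ in range(max_steps):
        action = policy.act(observation, goal_image)
        if action == "stop":
            break
        observation = env.step(action)  # camera frame only: no GPS, no odometry
    return observation
```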


Most implemented papers

End-to-End Egospheric Spatial Memory

ivy-dl/memory 15 Feb 2021

Spatial memory, or the ability to remember and recall specific locations and objects, is central to autonomous agents' ability to carry out tasks in real environments.

Teaching Agents how to Map: Spatial Reasoning for Multi-Object Navigation

PierreMarza/teaching_agents_how_to_map 13 Jul 2021

In the context of visual navigation, the capacity to map a novel environment is necessary for an agent to exploit its observation history in the considered place and efficiently reach known goals.
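A toy sketch of the mapping primitive this line refers to: accumulating observed obstacle points into a top-down occupancy grid the agent can reuse later. Grid size, resolution, and the source of the points are illustrative choices, not the paper's method.

```python
import numpy as np

GRID, RES = 200, 0.05                      # 200x200 cells, 5 cm per cell
occupancy = np.zeros((GRID, GRID), dtype=np.uint8)

def update_map(points_xy: np.ndarray, agent_xy: np.ndarray) -> None:
    """Mark world-frame obstacle points (N, 2) in a grid centred on the agent."""
    cells = ((points_xy - agent_xy) / RES + GRID // 2).astype(int)
    inside = (cells >= 0).all(axis=1) & (cells < GRID).all(axis=1)
    occupancy[cells[inside, 1], cells[inside, 0]] = 1  # row = y, col = x
```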

SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning

facebookresearch/sound-spaces 16 Jun 2022

We introduce SoundSpaces 2.0, a platform for on-the-fly geometry-based audio rendering for 3D environments.

Towards Learning a Generalist Model for Embodied Navigation

zd11024/NaviLLM 4 Dec 2023

We conduct extensive experiments to evaluate the performance and generalizability of our model.

On the Performance of ConvNet Features for Place Recognition

aghagol/loop-detection 17 Jan 2015

Computer vision datasets are very different in character from robotic camera data; real-time performance is essential, and performance priorities can be different.
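The underlying recipe is to describe each place image with an off-the-shelf CNN embedding and match places by descriptor similarity. A hedged sketch with a recent torchvision follows; ResNet-18 stands in for the networks studied in the paper, and the preprocessing values are the standard ImageNet ones.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
extractor = torch.nn.Sequential(*list(backbone.children())[:-1]).eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def describe(image):
    """PIL image -> L2-normalized 512-d place descriptor."""
    feat = extractor(preprocess(image).unsqueeze(0))
    return torch.nn.functional.normalize(feat.flatten(1), dim=1)

def best_match(query_desc, database_descs):
    # Descriptors are L2-normalized, so the dot product is cosine similarity.
    sims = torch.cat(database_descs) @ query_desc.T
    return int(sims.argmax()), float(sims.max())
```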

3D Visual Perception for Self-Driving Cars using a Multi-Camera System: Calibration, Mapping, Localization, and Obstacle Detection

hengli/camodocal 31 Aug 2017

To minimize the number of cameras needed for surround perception, we utilize fisheye cameras.
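A fisheye lens covers a much wider field of view per camera, and a common first step before mapping or detection is undistorting each frame. A short OpenCV sketch under assumed intrinsics: `K` and `D` below are placeholder values that would normally come from the calibration stage.

```python
import numpy as np
import cv2

# Placeholder intrinsics/distortion for one fisheye camera (from calibration).
K = np.array([[285.0, 0.0, 320.0],
              [0.0, 285.0, 240.0],
              [0.0, 0.0, 1.0]])
D = np.array([0.1, -0.05, 0.01, 0.0]).reshape(4, 1)  # fisheye model: k1..k4

def undistort(frame):
    """Rectify one fisheye frame to a pinhole-like image."""
    h, w = frame.shape[:2]
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(
        K, D, np.eye(3), K, (w, h), cv2.CV_16SC2)
    return cv2.remap(frame, map1, map2, interpolation=cv2.INTER_LINEAR)
```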

The Regretful Navigation Agent for Vision-and-Language Navigation

chihyaoma/regretful-agent CVPR 2019 (Oral)

As deep learning continues to make progress for challenging perception tasks, there is increased interest in combining vision, language, and decision-making.

Drone Path-Following in GPS-Denied Environments using Convolutional Networks

cvankir2/nsf_auburn 5 May 2019

This paper presents a simple approach for drone navigation that follows a predetermined path using only visual input, without reliance on a Global Positioning System (GPS).
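In the behavior-cloning style this suggests, a small CNN maps a single camera frame to a discrete steering command. The architecture below is illustrative, not the paper's exact network.

```python
import torch
import torch.nn as nn

class SteeringNet(nn.Module):
    """Camera frame in, steering command (left / straight / right) out."""

    def __init__(self, num_commands=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, num_commands)

    def forward(self, frame):              # frame: (B, 3, H, W)
        return self.head(self.features(frame).flatten(1))

model = SteeringNet()
logits = model(torch.randn(1, 3, 120, 160))     # one fake camera frame
command = ["left", "straight", "right"][logits.argmax(dim=1).item()]
```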

SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation

facebookresearch/splitnet ICCV 2019

We propose SplitNet, a method for decoupling visual perception and policy learning.
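The decoupling idea can be illustrated (this is not SplitNet's actual code) as a shared visual encoder feeding interchangeable policy heads, so perception can be frozen and transferred while only a small policy is retrained for a new task or simulator.

```python
import torch
import torch.nn as nn

class VisualEncoder(nn.Module):
    """Shared perception module, trainable independently of any one policy."""

    def __init__(self, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, obs):
        return self.net(obs)

class PolicyHead(nn.Module):
    """Lightweight, task-specific head on top of the shared features."""

    def __init__(self, feat_dim=128, num_actions=4):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_actions)

    def forward(self, features):
        return self.fc(features)

encoder = VisualEncoder()
for p in encoder.parameters():        # freeze perception for transfer...
    p.requires_grad = False
new_task_policy = PolicyHead()        # ...and train only a fresh policy head
logits = new_task_policy(encoder(torch.randn(1, 3, 84, 84)))
```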

Air Learning: A Deep Reinforcement Learning Gym for Autonomous Aerial Robot Visual Navigation

harvard-edge/airlearning 2 Jun 2019

We find that the trajectories on an embedded Raspberry Pi are vastly different from those predicted on a high-end desktop system, resulting in up to 40% longer trajectories in one of the environments.