Visual Navigation

105 papers with code • 6 benchmarks • 16 datasets

Visual Navigation is the problem of navigating an agent, e.g. a mobile robot, in an environment using camera input only. The agent is given a target image (an image it will see from the target position), and its goal is to move from its current position to the target by applying a sequence of actions, based only on its camera observations.

Source: Vision-based Navigation Using Deep Reinforcement Learning
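As a rough illustration, the task reduces to a closed-loop control problem: observe a camera frame, compare it to the goal image, act, repeat. The sketch below is a minimal, hypothetical rendering of that loop; the `env` interface and `GoalReachingPolicy` are illustrative stand-ins, not any particular library's API.

```python
import numpy as np

ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

class GoalReachingPolicy:
    """Picks actions by comparing the current view to the goal image."""

    def act(self, observation: np.ndarray, goal_image: np.ndarray) -> str:
        # A learned policy would embed both images and score each action;
        # this placeholder stops once the views are nearly identical and
        # explores randomly otherwise.
        diff = np.mean(np.abs(observation.astype(float) - goal_image.astype(float)))
        if diff < 5.0:
            return "stop"
        return str(np.random.choice(ACTIONS[:-1]))

def navigate(env, policy, goal_image, max_steps=500):
    """Run the closed loop: observe a camera frame, act, repeat until 'stop'."""
    observation = env.reset()
    for _ in range(max_steps):
        action = policy.act(observation, goal_image)
        if action == "stop":
            break
        observation = env.step(action)  # camera frame only: no GPS, no odometry
    return observation
```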


Most implemented papers

End-to-End Egospheric Spatial Memory

ivy-dl/memory 15 Feb 2021

Spatial memory, or the ability to remember and recall specific locations and objects, is central to autonomous agents' ability to carry out tasks in real environments.

Teaching Agents how to Map: Spatial Reasoning for Multi-Object Navigation

PierreMarza/teaching_agents_how_to_map 13 Jul 2021

In the context of visual navigation, the capacity to map a novel environment is necessary for an agent to exploit its observation history in the considered place and efficiently reach known goals.
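A toy sketch of the mapping primitive this line refers to: accumulating observed obstacle points into a top-down occupancy grid the agent can reuse later. Grid size, resolution, and the source of the points are illustrative choices, not the paper's method.

```python
import numpy as np

GRID, RES = 200, 0.05                      # 200x200 cells, 5 cm per cell
occupancy = np.zeros((GRID, GRID), dtype=np.uint8)

def update_map(points_xy: np.ndarray, agent_xy: np.ndarray) -> None:
    """Mark world-frame obstacle points (N, 2) in a grid centred on the agent."""
    cells = ((points_xy - agent_xy) / RES + GRID // 2).astype(int)
    inside = (cells >= 0).all(axis=1) & (cells < GRID).all(axis=1)
    occupancy[cells[inside, 1], cells[inside, 0]] = 1  # row = y, col = x
```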

SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning

facebookresearch/sound-spaces 16 Jun 2022

We introduce SoundSpaces 2.0, a platform for on-the-fly geometry-based audio rendering for 3D environments.

Towards Learning a Generalist Model for Embodied Navigation

zd11024/NaviLLM 4 Dec 2023

We conduct extensive experiments to evaluate the performance and generalizability of our model.

On the Performance of ConvNet Features for Place Recognition

aghagol/loop-detection 17 Jan 2015

Computer vision datasets are very different in character from robotic camera data; real-time performance is essential, and performance priorities can be different.
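The underlying recipe is to describe each place image with an off-the-shelf CNN embedding and match places by descriptor similarity. A hedged sketch with a recent torchvision follows; ResNet-18 stands in for the networks studied in the paper, and the preprocessing values are the standard ImageNet ones.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
extractor = torch.nn.Sequential(*list(backbone.children())[:-1]).eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def describe(image):
    """PIL image -> L2-normalized 512-d place descriptor."""
    feat = extractor(preprocess(image).unsqueeze(0))
    return torch.nn.functional.normalize(feat.flatten(1), dim=1)

def best_match(query_desc, database_descs):
    # Descriptors are L2-normalized, so the dot product is cosine similarity.
    sims = torch.cat(database_descs) @ query_desc.T
    return int(sims.argmax()), float(sims.max())
```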

3D Visual Perception for Self-Driving Cars using a Multi-Camera System: Calibration, Mapping, Localization, and Obstacle Detection

hengli/camodocal 31 Aug 2017

To minimize the number of cameras needed for surround perception, we utilize fisheye cameras.
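A fisheye lens covers a much wider field of view per camera, and a common first step before mapping or detection is undistorting each frame. A short OpenCV sketch under assumed intrinsics: `K` and `D` below are placeholder values that would normally come from the calibration stage.

```python
import numpy as np
import cv2

# Placeholder intrinsics/distortion for one fisheye camera (from calibration).
K = np.array([[285.0, 0.0, 320.0],
              [0.0, 285.0, 240.0],
              [0.0, 0.0, 1.0]])
D = np.array([0.1, -0.05, 0.01, 0.0]).reshape(4, 1)  # fisheye model: k1..k4

def undistort(frame):
    """Rectify one fisheye frame to a pinhole-like image."""
    h, w = frame.shape[:2]
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(
        K, D, np.eye(3), K, (w, h), cv2.CV_16SC2)
    return cv2.remap(frame, map1, map2, interpolation=cv2.INTER_LINEAR)
```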

The Regretful Navigation Agent for Vision-and-Language Navigation

chihyaoma/regretful-agent CVPR 2019 (Oral)

As deep learning continues to make progress for challenging perception tasks, there is increased interest in combining vision, language, and decision-making.

Drone Path-Following in GPS-Denied Environments using Convolutional Networks

cvankir2/nsf_auburn 5 May 2019

This paper presents a simple approach for drone navigation that follows a predetermined path using only visual input, without reliance on a Global Positioning System (GPS).
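In the behavior-cloning style this suggests, a small CNN maps a single camera frame to a discrete steering command. The architecture below is illustrative, not the paper's exact network.

```python
import torch
import torch.nn as nn

class SteeringNet(nn.Module):
    """Camera frame in, steering command (left / straight / right) out."""

    def __init__(self, num_commands=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, num_commands)

    def forward(self, frame):              # frame: (B, 3, H, W)
        return self.head(self.features(frame).flatten(1))

model = SteeringNet()
logits = model(torch.randn(1, 3, 120, 160))     # one fake camera frame
command = ["left", "straight", "right"][logits.argmax(dim=1).item()]
```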

SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation

facebookresearch/splitnet ICCV 2019

We propose SplitNet, a method for decoupling visual perception and policy learning.
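The decoupling idea can be illustrated (this is not SplitNet's actual code) as a shared visual encoder feeding interchangeable policy heads, so perception can be frozen and transferred while only a small policy is retrained for a new task or simulator.

```python
import torch
import torch.nn as nn

class VisualEncoder(nn.Module):
    """Shared perception module, trainable independently of any one policy."""

    def __init__(self, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, obs):
        return self.net(obs)

class PolicyHead(nn.Module):
    """Lightweight, task-specific head on top of the shared features."""

    def __init__(self, feat_dim=128, num_actions=4):
        super().__init__()
        self.fc = nn.Linear(feat_dim, num_actions)

    def forward(self, features):
        return self.fc(features)

encoder = VisualEncoder()
for p in encoder.parameters():        # freeze perception for transfer...
    p.requires_grad = False
new_task_policy = PolicyHead()        # ...and train only a fresh policy head
logits = new_task_policy(encoder(torch.randn(1, 3, 84, 84)))
```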

Air Learning: A Deep Reinforcement Learning Gym for Autonomous Aerial Robot Visual Navigation

harvard-edge/airlearning 2 Jun 2019

We find that the trajectories on an embedded Raspberry Pi are vastly different from those predicted on a high-end desktop system, resulting in up to 40% longer trajectories in one of the environments.