Visual Navigation

105 papers with code • 6 benchmarks • 16 datasets

Visual Navigation is the problem of navigating an agent, e.g. a mobile robot, in an environment using camera input only. The agent is given a target image (an image it will see from the target position), and its goal is to move from its current position to the target by applying a sequence of actions, based on the camera observations only.

Source: Vision-based Navigation Using Deep Reinforcement Learning
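
The task definition above can be sketched as a simple observe-act loop. The example below is only a toy illustration, not the method of any paper listed here: the `Corridor` world, the `SweepPolicy`, and the `navigate` helper are hypothetical names, and each "image" is a small tuple standing in for a camera frame. The agent never reads its own position; it only compares what it sees with the target image.

```python
from typing import List, Optional, Tuple

Image = Tuple[int, ...]  # stand-in for a camera frame (assumption)


class Corridor:
    """Hypothetical 1-D world in which every cell shows a distinct image."""

    def __init__(self, length: int, start: int) -> None:
        self.length = length
        self.position = start

    def render(self, cell: int) -> Image:
        # Toy camera: a unique 3-value pattern per cell.
        return (cell, cell * 2, cell * 3)

    def observe(self) -> Image:
        return self.render(self.position)

    def step(self, action: str) -> Image:
        delta = {"left": -1, "right": 1}[action]
        # Moving into a wall leaves the agent, and its view, unchanged.
        self.position = min(max(self.position + delta, 0), self.length - 1)
        return self.observe()


class SweepPolicy:
    """Walks one way and reverses when the view stops changing (a wall)."""

    def __init__(self) -> None:
        self.direction = "right"
        self.previous: Optional[Image] = None

    def act(self, observation: Image) -> str:
        if observation == self.previous:
            self.direction = "left" if self.direction == "right" else "right"
        self.previous = observation
        return self.direction


def navigate(env: Corridor, policy: SweepPolicy, target: Image,
             max_steps: int = 100) -> List[str]:
    """Apply actions until the current view matches the target image."""
    actions: List[str] = []
    observation = env.observe()
    while observation != target and len(actions) < max_steps:
        action = policy.act(observation)
        actions.append(action)
        observation = env.step(action)
    return actions


if __name__ == "__main__":
    env = Corridor(length=5, start=2)
    target_image = env.render(0)  # the view from the target position
    print(navigate(env, SweepPolicy(), target_image))
```

In practice the hand-written policy would be replaced by a learned one, for example a network trained with reinforcement learning, and the tuple comparison by a learned similarity between the current frame and the target image.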

Benchmarks

These leaderboards are used to track progress in Visual Navigation.

Dataset	Best Model
R2R	Meta-Explore
Cooperative Vision-and-Dialogue Navigation	NaviLLM
SOON Test	AutoVLN
AI2-THOR	MVV-IN
Dmlab-30	PopArt-IMPALA
Help, Anna! (HANNA)	Prevalent

Libraries

Use these libraries to find Visual Navigation models and implementations.

mchancan/citylearn

2 papers

Most implemented papers

Self-Monitoring Navigation Agent via Auxiliary Progress Estimation

chihyaoma/selfmonitoring-agent • ICLR 2019

The Vision-and-Language Navigation (VLN) task entails an agent following navigational instructions in photo-realistic unknown environments.

Learning Exploration Policies for Navigation

taochenshh/exp4nav • ICLR 2019

Numerous past works have tackled the problem of task-driven navigation.

Scaling and Benchmarking Self-Supervised Visual Representation Learning

facebookresearch/fair_self_supervision_benchmark • ICCV 2019

Self-supervised learning aims to learn representations from the data itself without explicit manual supervision.

An Open Source and Open Hardware Deep Learning-powered Visual Navigation Engine for Autonomous Nano-UAVs

pulp-platform/pulp-dronet • 10 May 2019

Nano-size unmanned aerial vehicles (UAVs), with a diameter of a few centimeters and a total power budget below 10 W, have so far been considered incapable of running sophisticated vision-based autonomous navigation software without external aid from base stations, ad-hoc local positioning infrastructure, and powerful external computation servers.

Vision-and-Dialog Navigation

mmurray/cvdn • 10 Jul 2019

To train agents that search an environment for a goal location, we define the Navigation from Dialog History task.

VUSFA: Variational Universal Successor Features Approximator to Improve Transfer DRL for Target Driven Visual Navigation

shamanez/VUSFA-Variational-Universal-Successor-Features-Approximator • 18 Aug 2019

In this paper, we show how novel transfer reinforcement learning techniques can be applied to the complex task of target-driven navigation using the photorealistic AI2-THOR simulator.

SoundSpaces: Audio-Visual Navigation in 3D Environments

facebookresearch/sound-spaces • ECCV 2020

Moving around in the world is naturally a multisensory experience, but today's embodied agents are deaf, restricted solely to their visual perception of the environment.

Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks

jozhang97/side-tuning • ECCV 2020

When training a neural network for a desired task, one may prefer to adapt a pre-trained network rather than starting from randomly initialized weights.
