Visual Navigation

106 papers with code • 6 benchmarks • 16 datasets

Visual Navigation is the problem of navigating an agent, e.g. a mobile robot, in an environment using camera input only. The agent is given a target image (an image it will see from the target position), and its goal is to move from its current position to the target by applying a sequence of actions, based only on its camera observations.

Source: Vision-based Navigation Using Deep Reinforcement Learning
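In gym-style terms, image-goal navigation is an episodic loop: the agent receives the target image once at reset, then repeatedly maps its current camera frame to a discrete action until it decides to stop. A minimal sketch of that loop (the environment and policy interfaces here are hypothetical stand-ins, not any particular benchmark's API):

```python
ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

def run_episode(env, policy, max_steps=500):
    """Run one image-goal navigation episode.

    `env` and `policy` are hypothetical stand-ins: `env.reset()` returns the
    current camera frame plus the target image, and the agent acts from
    camera observations only -- no pose, map, or GPS is exposed.
    """
    obs, target_image = env.reset()          # RGB frames, e.g. HxWx3 arrays
    for _ in range(max_steps):
        action = policy(obs, target_image)   # pick one of ACTIONS
        if action == "stop":                 # agent declares it has arrived
            break
        obs = env.step(action)               # only a new camera frame comes back
    return env.reached_goal()                # hypothetical success check
```

Variants of the task replace the target image with coordinates (PointGoal), an object category (ObjectGoal), or a natural-language instruction (Vision-and-Language Navigation), the setting of several of the papers below.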

VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation

yanyuanqiao/vln-petl • ICCV 2023 • 20 Aug 2023 • ★ 5

Performance on Vision-and-Language Navigation (VLN) tasks has progressed rapidly in recent years, thanks to the use of large pre-trained vision-and-language models.

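Parameter-efficient transfer learning keeps the large pre-trained model frozen and trains only small injected modules. The sketch below shows a generic bottleneck adapter to illustrate the technique family; it is not VLN-PETL's actual architecture:

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project -> ReLU -> up-project, plus residual."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

# Stand-in for a pre-trained vision-and-language backbone, frozen entirely.
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True),
    num_layers=4,
)
for p in backbone.parameters():
    p.requires_grad = False

adapter = Adapter(dim=768)                    # only ~100k trainable parameters
x = torch.randn(2, 16, 768)                   # batch of token sequences
out = adapter(backbone(x))                    # gradients reach the adapter alone
```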

Language-enhanced RNR-Map: Querying Renderable Neural Radiance Field maps with natural language

intelligolabs/Le-RNR-Map • 17 Aug 2023 • ★ 14

We present Le-RNR-Map, a Language-enhanced Renderable Neural Radiance map for Visual Navigation with natural language query prompts.

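The query mechanism behind such a map can be pictured as matching a language embedding against per-cell embeddings of a 2D latent grid. A toy illustration with random stand-in features (in the real system both sides would come from a CLIP-style encoder; nothing here is Le-RNR-Map's actual representation or API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins: a 64x64 map whose cells hold 512-d language-aligned features,
# and one 512-d embedding of a text query such as "sofa".
map_features = rng.normal(size=(64, 64, 512))
query = rng.normal(size=512)

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

sim = normalize(map_features) @ normalize(query)   # cosine similarity per cell
y, x = np.unravel_index(np.argmax(sim), sim.shape)
print(f"best match for the query at map cell ({y}, {x})")
```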

Scaling Data Generation in Vision-and-Language Navigation

wz0919/scalevln • ICCV 2023 • 28 Jul 2023 • ★ 136

Recent research in language-guided visual navigation has shown that training generalizable agents demands both diverse traversable environments and large quantities of supervision.

Learning Navigational Visual Representations with Semantic Map Supervision

yiconghong/ego2map-navit • ICCV 2023 • 23 Jul 2023 • ★ 24

Being able to perceive the semantics and the spatial structure of the environment is essential for visual navigation of a household robot.

Online Self-Supervised Thermal Water Segmentation for Aerial Vehicles

connorlee77/uav-thermal-water-segmentation • 18 Jul 2023 • ★ 12

We present a new method to adapt an RGB-trained water segmentation network to target-domain aerial thermal imagery via online self-supervision, leveraging texture and motion cues as supervisory signals.

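The general pattern of online self-supervised adaptation is to derive pseudo-labels from cues available at test time (here, texture and motion) and take gradient steps on the deployed network as frames stream in. A schematic sketch in which the cue function and network are placeholders, not the paper's implementation:

```python
import torch
import torch.nn.functional as F

def adapt_online(net, frames, cue_fn, lr=1e-4):
    """Fine-tune a segmentation net on streaming thermal frames.

    `cue_fn` is a hypothetical stand-in for the texture/motion heuristics:
    it returns a pseudo-label mask (1 = water) and a confidence mask.
    """
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for frame in frames:                          # frame: 1xCxHxW tensor
        pseudo, conf = cue_fn(frame)              # weak labels from texture/motion
        logits = net(frame)
        loss = (F.binary_cross_entropy_with_logits(
                    logits, pseudo, reduction="none") * conf).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()                                # the network adapts in flight
    return net
```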

The Drunkard's Odometry: Estimating Camera Motion in Deforming Scenes

UZ-SLAMLab/DrunkardsOdometry • 29 Jun 2023 • ★ 45

Estimating camera motion in deformable scenes poses a complex and open research challenge.

HabiCrowd: A High Performance Simulator for Crowd-Aware Visual Navigation

habicrowd/HabiCrowd • 20 Jun 2023 • ★ 20

Visual navigation, a foundational aspect of Embodied AI (E-AI), has been studied extensively in the past few years.

Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear

stanfordvl/sonicverse • 1 Jun 2023 • ★ 16

We introduce Sonicverse, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear.

NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models

gengzezhou/navgpt • 26 May 2023 • ★ 81

Trained on data of unprecedented scale, large language models (LLMs) such as ChatGPT and GPT-4 exhibit reasoning abilities that emerge with model scale.

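The premise can be illustrated by packing the instruction, action history, and textual descriptions of candidate viewpoints into a prompt and asking the LLM to reason before committing to a move. A schematic sketch around a hypothetical text-in/text-out `llm` callable, not NavGPT's actual prompts or interface:

```python
def choose_action(llm, instruction, history, candidates):
    """Ask an LLM to reason step by step, then commit to one navigation move."""
    options = "\n".join(f"{i}: {desc}" for i, desc in enumerate(candidates))
    prompt = (
        "You are navigating a building.\n"
        f"Instruction: {instruction}\n"
        f"Actions so far: {'; '.join(history) or 'none'}\n"
        f"Candidate viewpoints:\n{options}\n"
        "Think step by step, then answer 'FINAL: <index>' to move, "
        "or 'FINAL: stop' if the goal is reached."
    )
    reply = llm(prompt)                      # hypothetical LLM call
    return reply.rsplit("FINAL:", 1)[-1].strip()
```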

POPGym: Benchmarking Partially Observable Reinforcement Learning

proroklab/popgym • 3 Mar 2023 • ★ 145

Real world applications of Reinforcement Learning (RL) are often partially observable, thus requiring memory.

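The "requiring memory" point is concrete: under partial observability a single observation is not a sufficient state, so the policy must carry a recurrent hidden state across steps. A minimal sketch of such a policy in a generic gym-style setting (not POPGym-specific code):

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """GRU policy: the hidden state summarizes the observation history."""
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.gru = nn.GRUCell(obs_dim, hidden)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs, h):
        h = self.gru(obs, h)                  # fold the new observation into memory
        return self.head(h), h                # action logits + updated memory

policy = RecurrentPolicy(obs_dim=8, n_actions=4)
h = torch.zeros(1, 128)                       # reset memory at episode start
obs = torch.zeros(1, 8)
logits, h = policy(obs, h)                    # carry h forward to the next step
```

POPGym packages a suite of such memory-demanding environments, along with recurrent baselines, behind the standard gym interface.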