Visual Navigation
101 papers with code • 6 benchmarks • 16 datasets
Visual Navigation is the problem of navigating an agent, e.g. a mobile robot, through an environment using camera input alone. The agent is given a target image (the view it would see from the target position) and must reach that position by applying a sequence of actions, conditioned only on its camera observations.
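This task setup can be sketched as a generic observe-act loop. The environment interface, action names, and the pixel-distance heuristic below are illustrative assumptions for the sketch, not the API of any particular benchmark:

```python
import numpy as np

# Hypothetical discrete action space for an image-goal navigation agent.
ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

def observation_distance(obs: np.ndarray, goal: np.ndarray) -> float:
    """Mean absolute pixel difference between the current and goal views."""
    return float(np.mean(np.abs(obs.astype(np.float32) - goal.astype(np.float32))))

def navigate(env, goal_image: np.ndarray, policy, max_steps: int = 500):
    """Image-goal navigation loop: observe, pick an action, repeat until 'stop'.

    `env` is assumed to expose reset() -> image and step(action) -> image;
    `policy` maps (observation, goal_image) to one of ACTIONS.
    """
    obs = env.reset()
    trajectory = []
    for _ in range(max_steps):
        action = policy(obs, goal_image)  # decided from camera input only
        trajectory.append(action)
        if action == "stop":
            break
        obs = env.step(action)
    return trajectory
```

In practice the hand-written `policy` is replaced by a learned model (e.g. trained with reinforcement learning, as in several of the papers listed below), but the interaction loop has this shape.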
Source: Vision-based Navigation Using Deep Reinforcement Learning
Latest papers
Learning Navigational Visual Representations with Semantic Map Supervision
Being able to perceive the semantics and the spatial structure of the environment is essential for visual navigation of a household robot.
Online Self-Supervised Thermal Water Segmentation for Aerial Vehicles
We present a new method to adapt an RGB-trained water segmentation network to target-domain aerial thermal imagery using online self-supervision by leveraging texture and motion cues as supervisory signals.
The Drunkard's Odometry: Estimating Camera Motion in Deforming Scenes
Estimating camera motion in deformable scenes poses a complex and open research challenge.
HabiCrowd: A High Performance Simulator for Crowd-Aware Visual Navigation
Visual navigation, a foundational aspect of Embodied AI (E-AI), has been studied extensively in recent years.
Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear
We introduce Sonicverse, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear.
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
Trained with an unprecedented scale of data, large language models (LLMs) like ChatGPT and GPT-4 exhibit the emergence of significant reasoning abilities from model scaling.
POPGym: Benchmarking Partially Observable Reinforcement Learning
Real world applications of Reinforcement Learning (RL) are often partially observable, thus requiring memory.
Learning by Asking for Embodied Visual Navigation and Task Completion
The research community has shown increasing interest in designing intelligent embodied agents that can assist humans in accomplishing tasks.
Offline Reinforcement Learning for Visual Navigation
Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass.
BEVBert: Multimodal Map Pre-training for Language-guided Navigation
Concretely, we build a local metric map to explicitly aggregate incomplete observations and remove duplicates, while modeling navigation dependency in a global topological map.
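To illustrate the global topological map idea mentioned above, here is a minimal sketch of a map that merges near-duplicate observations into a single node and records connectivity between visited places. The embedding representation and merge threshold are assumptions for this sketch, not BEVBert's actual method:

```python
import numpy as np

class TopologicalMap:
    """Toy topological map: nodes are place embeddings, edges record traversal."""

    def __init__(self, merge_threshold: float = 1.0):
        self.nodes = []        # one embedding vector per distinct place
        self.edges = set()     # undirected edges between consecutive places
        self.threshold = merge_threshold
        self._last = None      # index of the most recently visited node

    def add_observation(self, embedding: np.ndarray) -> int:
        """Merge with an existing node if the observation is a near-duplicate,
        otherwise create a new node; either way, link it to the previous node."""
        for i, node in enumerate(self.nodes):
            if np.linalg.norm(node - embedding) < self.threshold:
                self._connect(i)
                return i
        self.nodes.append(embedding)
        idx = len(self.nodes) - 1
        self._connect(idx)
        return idx

    def _connect(self, idx: int):
        # Add an undirected edge from the previously visited node.
        if self._last is not None and self._last != idx:
            self.edges.add(tuple(sorted((self._last, idx))))
        self._last = idx
```

Revisiting a place thus collapses onto an existing node rather than duplicating it, which is the "remove duplicates" behavior the abstract describes at the level of the global map.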