Visual Navigation is the problem of navigating an agent, such as a mobile robot, in an environment using camera input only. The agent is given a target image (an image it will see from the target position), and its goal is to move from its current position to the target by applying a sequence of actions based solely on its camera observations.
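The interaction loop this implies can be sketched as follows. Everything here is a hypothetical stand-in: DummyEnv replaces a real simulator, and random_policy replaces the learned model that would map (observation, goal image) to an action.

```python
# Minimal sketch of the image-goal navigation loop: observe, act, repeat
# until the policy declares the goal reached. All names are illustrative.
import numpy as np

ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

class DummyEnv:
    """Stand-in environment that returns random 64x64 RGB camera frames."""
    def reset(self):
        return np.random.rand(64, 64, 3)
    def step(self, action):
        return np.random.rand(64, 64, 3)

def random_policy(observation, goal_image):
    """Placeholder: a learned policy would map (observation, goal) to an action."""
    return np.random.choice(ACTIONS)

def navigate(env, goal_image, max_steps=500):
    observation = env.reset()                        # current camera frame
    for _ in range(max_steps):
        action = random_policy(observation, goal_image)
        if action == "stop":                         # agent declares goal reached
            break
        observation = env.step(action)               # frame after the move
    return observation

final_view = navigate(DummyEnv(), goal_image=np.random.rand(64, 64, 3))
```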
Reinforcement Learning (RL), among other learning-based methods, is a powerful tool for solving complex robotic tasks (e.g., actuation, manipulation, and navigation).
Aiming to improve these two components, this paper proposes three complementary techniques: an object relation graph (ORG), trial-driven imitation learning (IL), and a memory-augmented tentative policy network (TPN).
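As a rough illustration of the first of these ideas, the sketch below renders an object relation graph as a generic learned-adjacency graph layer over per-category detection features. The class name, dimensions, and update rule are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative object-relation-graph layer: detected object categories are
# nodes, and a learned, normalized adjacency propagates features between
# them. A generic graph-convolution rendering, not the paper's architecture.
import torch
import torch.nn as nn

class ObjectRelationGraph(nn.Module):
    def __init__(self, num_categories: int, feat_dim: int):
        super().__init__()
        # Learnable pairwise relations between categories (normalized below).
        self.adjacency = nn.Parameter(torch.randn(num_categories, num_categories))
        self.project = nn.Linear(feat_dim, feat_dim)

    def forward(self, node_feats: torch.Tensor) -> torch.Tensor:
        # node_feats: (num_categories, feat_dim), e.g. detector confidences
        # and bounding-box encodings per category.
        A = torch.softmax(self.adjacency, dim=1)      # row-normalized relations
        return torch.relu(A @ self.project(node_feats))

org = ObjectRelationGraph(num_categories=22, feat_dim=64)
out = org(torch.randn(22, 64))  # graph-informed object features
```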
Monocular visual navigation methods have seen significant advances in the last decade, recently producing several real-time solutions for autonomously navigating small unmanned aircraft systems without relying on GPS.
This paper learns and leverages such semantic cues for navigating to objects of interest in novel environments by simply watching YouTube videos.
Despite the absence of absolute scale and depth range, the relative depth maps can be corrected using their respective semi-dense depth maps from the SLAM algorithm.
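A common way to realize such a correction is a least-squares fit of a scale and shift between the relative depths and the SLAM depths, evaluated at pixels where the semi-dense SLAM depth is valid. The sketch below assumes this standard alignment step rather than the paper's exact procedure.

```python
# Correct a scale/shift-ambiguous relative depth map against semi-dense SLAM
# depth by solving d_slam ≈ s * d_rel + t over valid pixels. Standard
# alignment technique, sketched under that assumption.
import numpy as np

def align_depth(d_rel: np.ndarray, d_slam: np.ndarray, valid: np.ndarray):
    """Return the relative depth map rescaled to the SLAM depths' frame."""
    x = d_rel[valid]                                 # relative depths, valid pixels
    y = d_slam[valid]                                # corresponding SLAM depths
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)   # fit y ≈ s*x + t
    return s * d_rel + t

d_rel = np.random.rand(120, 160)                     # network's relative depth
d_slam = 2.0 * d_rel + 0.5                           # synthetic "metric" depth
valid = np.random.rand(120, 160) < 0.1               # semi-dense: ~10% of pixels
d_metric = align_depth(d_rel, d_slam, valid)
```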
Visual navigation is the task of training an embodied agent to intelligently navigate to a target object (e.g., a television) using only visual observations.
Training deep reinforcement learning agents on environments with multiple levels, scenes, or conditions from the same task has become essential for many applications aiming to achieve generalization and domain transfer from simulation to the real world.
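A minimal sketch of what such multi-scene training can look like is given below: a wrapper samples a different scene for each episode so the policy cannot overfit to a single layout. The make_scene_env factory and the scene identifiers are hypothetical placeholders for a real simulator.

```python
# Multi-scene training wrapper: pick a new scene at every reset so episodes
# are drawn from the whole distribution of environments.
import random

class MultiSceneEnv:
    def __init__(self, scene_ids, make_scene_env):
        self.scene_ids = scene_ids            # e.g. ["scene_01", "scene_02", ...]
        self.make_scene_env = make_scene_env  # factory: scene id -> environment
        self.env = None

    def reset(self):
        scene = random.choice(self.scene_ids)  # new scene every episode
        self.env = self.make_scene_env(scene)
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)

# Usage (hypothetical simulator):
# env = MultiSceneEnv(["scene_01", "scene_02"], make_scene_env=SomeSimulator)
```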
Several animal species (e.g., bats, dolphins, and whales) and even visually impaired humans have the remarkable ability to perform echolocation: a biological sonar used to perceive spatial layout and locate objects in the world.