Vision and Language Navigation
88 papers with code • 5 benchmarks • 13 datasets
Latest papers with no code
AIGeN: An Adversarial Approach for Instruction Generation in VLN
VLN is a challenging task that involves an agent following human instructions and navigating in a previously unknown environment to reach a specified goal.
IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation
To address this challenge, we propose a new method, named Instance-aware Visual Language Map (IVLMap), to empower the robot with instance-level and attribute-level semantic mapping. The map is constructed autonomously by fusing RGBD video data collected by the robot agent with specially designed natural-language map indexing in the bird's-eye view.
Scaling Vision-and-Language Navigation With Offline RL
The study of vision-and-language navigation (VLN) has typically relied on expert trajectories, which may not always be available in real-world situations due to the significant effort required to collect them.
OVER-NAV: Elevating Iterative Vision-and-Language Navigation with Open-Vocabulary Detection and StructurEd Representation
Recent advances in Iterative Vision-and-Language Navigation (IVLN) introduce a more meaningful and practical paradigm of VLN by maintaining the agent's memory across tours of scenes.
Temporal-Spatial Object Relations Modeling for Vision-and-Language Navigation
To avoid this problem, we construct object connections, called Spatial Object Relations (SOR), based on observations from all viewpoints in the navigation environment, which ensures complete spatial coverage and eliminates the gap.
Continual Vision-and-Language Navigation
For the training and evaluation of CVLN agents, we rearrange existing VLN datasets into two new datasets: CVLN-I, focused on navigation via initial-instruction interpretation, and CVLN-D, aimed at navigation through dialogue with other agents.
Mind the Error! Detection and Localization of Instruction Errors in Vision-and-Language Navigation
Moreover, we formally define the task of Instruction Error Detection and Localization, and establish an evaluation protocol on top of our benchmark dataset.
Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning
To encourage the agent to capture the differences introduced by perturbation, we further develop a perturbation-aware contrastive learning mechanism that contrasts perturbation-free trajectory encodings with their perturbation-based counterparts.
Causality-based Cross-Modal Representation Learning for Vision-and-Language Navigation
Vision-and-Language Navigation (VLN) has gained significant research interest in recent years due to its potential applications in real-world scenarios.
NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation
Vision-and-Language Navigation (VLN) stands as a key research problem of Embodied AI, aiming at enabling agents to navigate in unseen environments following linguistic instructions.