Vision and Language Navigation

85 papers with code • 5 benchmarks • 13 datasets

Vision-and-Language Navigation (VLN) requires an embodied agent to follow natural language instructions to navigate through complex 3D environments and reach specified target locations.
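Across these papers, the common setting is that an agent receives an instruction, observes its visual surroundings, and repeatedly chooses a navigation action until it decides to stop. The sketch below illustrates that episode loop only; the `Environment`, `Agent`, and `Observation` interfaces are hypothetical placeholders for illustration, not the API of any benchmark or method listed on this page.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical, minimal interfaces for illustration only; real benchmarks
# (e.g. R2R, REVERIE, AerialVLN) each define their own simulators and APIs.

@dataclass
class Observation:
    panorama: List[str]               # placeholder for per-view visual features
    navigable_viewpoints: List[str]   # viewpoint IDs the agent may move to next

class Environment:
    """Toy environment: a tiny graph of connected viewpoints."""
    def __init__(self, graph: Dict[str, List[str]], start: str, goal: str):
        self.graph, self.position, self.goal = graph, start, goal

    def observe(self) -> Observation:
        return Observation(panorama=[f"view-from-{self.position}"],
                           navigable_viewpoints=self.graph[self.position])

    def step(self, viewpoint: Optional[str]) -> bool:
        """Move to a neighboring viewpoint; None means STOP. Returns done flag."""
        if viewpoint is None:
            return True
        self.position = viewpoint
        return False

class Agent:
    """Placeholder policy; in real systems this is a learned model
    conditioned on the instruction and the visual observations."""
    def act(self, instruction: str, obs: Observation) -> Optional[str]:
        # Trivial heuristic: stop if nowhere to go, otherwise take the first option.
        return obs.navigable_viewpoints[0] if obs.navigable_viewpoints else None

def run_episode(env: Environment, agent: Agent, instruction: str,
                max_steps: int = 10) -> bool:
    """Roll out one navigation episode; success = stopping at the goal viewpoint."""
    for _ in range(max_steps):
        obs = env.observe()
        action = agent.act(instruction, obs)
        if env.step(action):
            break
    return env.position == env.goal

if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["c"], "c": []}  # toy viewpoint graph
    env = Environment(graph, start="a", goal="c")
    print(run_episode(env, Agent(), "walk forward to the last room"))
```

Real agents replace the trivial policy above with a model that fuses the instruction and panoramic visual features, which is the focus of the pre-training, prompting, and LLM-based methods listed below.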

Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation

18979705623/hspr 18 Mar 2024

Most Vision-and-Language Navigation (VLN) algorithms tend to make decision errors, primarily due to a lack of visual common sense and insufficient reasoning capabilities.

NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning

expectorlin/navcot 12 Mar 2024

Vision-and-Language Navigation (VLN), a crucial research problem in Embodied AI, requires an embodied agent to navigate through complex 3D environments by following natural language instructions.

WebLINX: Real-World Website Navigation with Multi-Turn Dialogue

McGill-NLP/weblinx 8 Feb 2024

We propose the problem of conversational web navigation, where a digital agent controls a web browser and follows user instructions to solve real-world tasks in a multi-turn dialogue fashion.

NavHint: Vision and Language Navigation Agent with a Hint Generator

hlr/navhint 4 Feb 2024

The hint generator assists the navigation agent in developing a global understanding of the visual environment.

WebVLN: Vision-and-Language Navigation on Websites

webvln/webvln 25 Dec 2023

The Vision-and-Language Navigation (VLN) task aims to enable AI agents to accurately understand and follow natural language instructions to navigate real-world environments, ultimately reaching specific target locations.

Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation

csir1996/vln-gela ICCV 2023

To address this problem, we propose a novel Grounded Entity-Landmark Adaptive (GELA) pre-training paradigm for VLN tasks.

March in Chat: Interactive Prompting for Remote Embodied Referring Expression

yanyuanqiao/mic ICCV 2023

The Remote Embodied Referring Expression task poses more challenges than other VLN tasks, since it requires agents to infer a navigation plan based only on a short instruction.

VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation

yanyuanqiao/vln-petl ICCV 2023

Vision-and-Language Navigation (VLN) tasks have recently seen rapid progress thanks to the use of large pre-trained vision-and-language models.

AerialVLN: Vision-and-Language Navigation for UAVs

airvln/airvln ICCV 2023

Navigating in the sky is more complicated than on the ground because agents must reason about flying height and more complex spatial relationships.

Scaling Data Generation in Vision-and-Language Navigation

wz0919/scalevln ICCV 2023

Recent research in language-guided visual navigation has demonstrated a significant demand for the diversity of traversable environments and the quantity of supervision for training generalizable agents.
