Vision and Language Navigation

88 papers with code • 5 benchmarks • 13 datasets

Vision-and-Language Navigation (VLN) requires an embodied agent to follow natural language instructions (e.g. 'walk down the hallway and turn left at the piano') to reach a goal location or object in a 3D environment.
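The basic interaction pattern is an episode loop: the agent receives an instruction and visual observations, and issues navigation actions until it stops. A minimal sketch of that loop follows, assuming a hypothetical gym-style environment `VLNEnv` and agent `VLNAgent`; the names and interfaces are illustrative, not a specific benchmark's API.

```python
class VLNAgent:
    """Illustrative agent interface: maps (instruction, observation) to an action."""

    def act(self, instruction: str, observation: dict) -> str:
        # Return one of e.g. 'forward', 'turn_left', 'turn_right', 'stop'.
        raise NotImplementedError


def run_episode(env, agent: VLNAgent, max_steps: int = 100) -> bool:
    """Run one VLN episode; success means the agent stopped near the goal."""
    instruction, observation = env.reset()          # natural language + RGB(-D) observation
    for _ in range(max_steps):
        action = agent.act(instruction, observation)
        observation, done = env.step(action)
        if done or action == "stop":
            break
    return env.reached_goal()                       # hypothetical success check
```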

Latest papers with no code

VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation

no code yet • 5 Feb 2024

Outdoor Vision-and-Language Navigation (VLN) requires an agent to navigate through realistic 3D outdoor environments based on natural language instructions.

MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation

no code yet • 14 Jan 2024

Embodied agents equipped with GPT as their brain have exhibited extraordinary decision-making and generalization abilities across various tasks.

Which way is 'right'?: Uncovering limitations of Vision-and-Language Navigation models

no code yet • 30 Nov 2023

The challenging task of Vision-and-Language Navigation (VLN) requires embodied agents to follow natural language instructions to reach a goal location or object (e.g. 'walk down the hallway and turn left at the piano').

DAP: Domain-aware Prompt Learning for Vision-and-Language Navigation

no code yet • 29 Nov 2023

Then we introduce soft visual prompts in the input space of the visual encoder in a pretrained model.
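As a rough illustration of the general idea behind soft visual prompts (under the common prompt-tuning formulation, not the paper's own code): learnable embeddings are prepended to the visual encoder's input tokens while the pretrained backbone stays frozen. Class and parameter names below are hypothetical.

```python
import torch
import torch.nn as nn

class SoftVisualPrompt(nn.Module):
    """Prepends learnable prompt tokens to a frozen visual encoder's input.

    Sketch only; `visual_encoder` is assumed to map a sequence of patch
    embeddings (B, N, D) to features, as in a ViT-style model.
    """

    def __init__(self, visual_encoder: nn.Module, num_prompts: int, dim: int):
        super().__init__()
        self.encoder = visual_encoder
        for p in self.encoder.parameters():
            p.requires_grad = False                      # keep pretrained weights frozen
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02)

    def forward(self, patch_embeddings: torch.Tensor) -> torch.Tensor:
        batch = patch_embeddings.size(0)
        prompts = self.prompts.expand(batch, -1, -1)     # (B, P, D) learnable prompts
        x = torch.cat([prompts, patch_embeddings], dim=1)
        return self.encoder(x)                           # only the prompts receive gradients
```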

Does VLN Pretraining Work with Nonsensical or Irrelevant Instructions?

no code yet • 28 Nov 2023

Data augmentation via back-translation is common when pretraining Vision-and-Language Navigation (VLN) models, even though the generated instructions are noisy.
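Back-translation in VLN usually means sampling unlabeled trajectories and having a "speaker" model generate synthetic instructions for them. A schematic sketch of that pipeline, with all names hypothetical:

```python
def back_translate(speaker, environments, num_trajectories):
    """Generate synthetic (instruction, trajectory) pairs for VLN pretraining.

    `speaker` is any model that maps a trajectory's visual observations to an
    instruction; this is a generic sketch, not a specific paper's pipeline,
    and the generated instructions are typically noisy.
    """
    augmented = []
    for env in environments:
        for _ in range(num_trajectories):
            trajectory = env.sample_shortest_path()              # unlabeled path (hypothetical API)
            instruction = speaker.generate(trajectory.observations)
            augmented.append((instruction, trajectory))
    return augmented
```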

Test-time Adaptive Vision-and-Language Navigation

no code yet • 22 Nov 2023

Then, these components are adaptively accumulated to pinpoint a concordant direction for fast model adaptation.
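The flavor of this can be illustrated with a generic test-time adaptation loop, where per-sample update directions are accumulated before a single consolidated parameter update. This is not the cited paper's algorithm, only a common entropy-minimization sketch with hypothetical names:

```python
import torch.nn.functional as F

def test_time_adapt(model, observations, optimizer, accumulate: int = 4):
    """Generic test-time adaptation sketch (entropy minimization).

    Gradients from several unlabeled test observations are accumulated before
    one update, loosely illustrating the idea of combining per-sample update
    directions into a single direction; NOT the paper's specific method.
    """
    model.train()
    optimizer.zero_grad()
    for i, obs in enumerate(observations, start=1):
        logits = model(obs)                              # action logits for one observation
        probs = F.softmax(logits, dim=-1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()
        (entropy / accumulate).backward()                # accumulate per-sample gradients
        if i % accumulate == 0:
            optimizer.step()                             # single consolidated update
            optimizer.zero_grad()
```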

Vision and Language Navigation in the Real World via Online Visual Language Mapping

no code yet • 16 Oct 2023

Directly transferring SOTA navigation policies trained in simulation to the real world is challenging due to the visual domain gap and the absence of prior knowledge about unseen environments.

LangNav: Language as a Perceptual Representation for Navigation

no code yet • 11 Oct 2023

We explore the use of language as a perceptual representation for vision-and-language navigation (VLN), with a focus on low-data settings.
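The general interface implied here is to describe each visual observation in text and let a language model choose the action from a purely textual prompt. A schematic sketch, where the captioner and language model are placeholder callables rather than the paper's components:

```python
def language_based_step(captioner, language_model, instruction, panorama_views):
    """One decision step where perception is expressed as text.

    `captioner` turns each view into a short description and `language_model`
    picks the next action from a text prompt; both are hypothetical callables
    used only to illustrate the overall interface.
    """
    descriptions = [f"View {i}: {captioner(view)}" for i, view in enumerate(panorama_views)]
    prompt = (
        f"Instruction: {instruction}\n"
        + "\n".join(descriptions)
        + "\nChoose the next action (forward, turn_left, turn_right, stop):"
    )
    return language_model(prompt).strip()
```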

Evaluating Explanation Methods for Vision-and-Language Navigation

no code yet • 10 Oct 2023

The ability to navigate robots with natural language instructions in an unknown environment is a crucial step for achieving embodied artificial intelligence (AI).

Prompt-based Context- and Domain-aware Pretraining for Vision and Language Navigation

no code yet • 7 Sep 2023

In the indoor-aware stage, we apply an efficient tuning paradigm to learn deep visual prompts from an indoor dataset, in order to augment pretrained models with inductive biases towards indoor environments.
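"Deep" visual prompts are commonly implemented by inserting fresh learnable prompt tokens before every frozen transformer layer, in the style of VPT-Deep. A minimal sketch under that assumption; the layer interface and names are illustrative, not taken from the cited paper's code:

```python
import torch
import torch.nn as nn

class DeepVisualPrompts(nn.Module):
    """Inserts learnable prompt tokens before each frozen transformer layer.

    Sketch only; each layer is assumed to map token sequences (B, N, D) to
    (B, N, D), and only the prompts are trained.
    """

    def __init__(self, layers: nn.ModuleList, num_prompts: int, dim: int):
        super().__init__()
        self.layers = layers
        for p in self.layers.parameters():
            p.requires_grad = False                                  # backbone stays frozen
        self.prompts = nn.ParameterList(
            [nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02) for _ in layers]
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        batch, num_prompts = tokens.size(0), self.prompts[0].size(1)
        for layer, prompt in zip(self.layers, self.prompts):
            x = torch.cat([prompt.expand(batch, -1, -1), tokens], dim=1)
            tokens = layer(x)[:, num_prompts:, :]                    # drop this layer's prompts
        return tokens
```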