Grounded language learning

23 papers with code • 0 benchmarks • 1 dataset

Grounded language learning is the task of acquiring the meaning of language in situated environments, grounding linguistic input in perception and action rather than in text alone.

Most implemented papers

Visual Entailment Task for Visually-Grounded Language Learning

necla-ml/SNLI-VE 26 Nov 2018

We introduce a new inference task, Visual Entailment (VE), which differs from traditional Textual Entailment (TE) in that the premise is an image rather than a natural language sentence.
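
To make the task structure concrete, here is a minimal sketch of a single VE example. The field names are illustrative placeholders rather than the dataset's actual schema, though the three-way entailment/neutral/contradiction label set is the one SNLI-VE inherits from SNLI.

```python
# A minimal sketch of a Visual Entailment example; field names are
# illustrative placeholders, not the SNLI-VE schema.
from dataclasses import dataclass

@dataclass
class VisualEntailmentExample:
    premise_image_path: str  # the premise is an image, not a sentence
    hypothesis: str          # the hypothesis is still natural language
    label: str               # one of: "entailment", "neutral", "contradiction"

example = VisualEntailmentExample(
    premise_image_path="images/0001.jpg",
    hypothesis="Two people are playing frisbee in a park.",
    label="neutral",
)
```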

Learning semantic sentence representations from visually grounded language without lexical knowledge

DannyMerkx/caption2image 27 Mar 2019

The system achieves state-of-the-art results on several of these benchmarks, showing that a system trained solely on multimodal data, without assuming any word representations, can capture sentence-level semantics.

Language learning using Speech to Image retrieval

DannyMerkx/speech2image 9 Sep 2019

Humans learn language through interaction with their environment and by listening to other humans.
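
As a rough illustration of how retrieval-based language learning can work, the following sketches a dual-encoder setup trained with an InfoNCE-style contrastive loss. The encoders, feature dimensions, and loss are stand-ins, not the authors' architecture.

```python
# Illustrative dual-encoder for speech-to-image retrieval: both modalities
# are mapped into a shared space and trained so that matching speech/image
# pairs score higher than mismatched ones. All sizes are stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

speech_encoder = nn.Sequential(nn.Linear(40, 128), nn.ReLU(), nn.Linear(128, 64))
image_encoder = nn.Sequential(nn.Linear(2048, 128), nn.ReLU(), nn.Linear(128, 64))

speech_feats = torch.randn(8, 40)    # stand-in for pooled acoustic features
image_feats = torch.randn(8, 2048)   # stand-in for pooled CNN image features

s = F.normalize(speech_encoder(speech_feats), dim=-1)
v = F.normalize(image_encoder(image_feats), dim=-1)

# InfoNCE-style loss: the i-th utterance should retrieve the i-th image.
logits = s @ v.t() / 0.07
targets = torch.arange(8)
loss = F.cross_entropy(logits, targets)
```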

Emergence of Numeric Concepts in Multi-Agent Autonomous Communication

Shawn-Guo-CN/EmergentNumerals 4 Nov 2019

Although their encoding method is not compositional in the way natural languages are from a human perspective, the emergent languages generalise to unseen inputs and, more importantly, are easier for models to learn.

Zero-Shot Compositional Policy Learning via Language Grounding

caotians1/BabyAIPlusPlus 15 Apr 2020

To facilitate research on language-guided agents with domain adaptation, we propose a novel zero-shot compositional policy learning task, where the environments are characterized as compositions of different attributes.
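
The following sketch illustrates the "environments as compositions of attributes" idea: each episode is parameterized by attribute values, and zero-shot evaluation holds out unseen combinations. The attribute names are hypothetical, not BabyAI++'s actual configuration.

```python
# Hedged sketch of compositional environment splits: train on most attribute
# combinations, evaluate zero-shot on held-out ones. Attribute names are
# hypothetical examples, not the benchmark's real configuration.
import itertools
import random

attributes = {
    "floor_color": ["red", "blue", "green"],
    "dynamics": ["normal", "slippery"],
    "goal_object": ["ball", "box"],
}

all_combinations = list(itertools.product(*attributes.values()))
random.seed(0)
random.shuffle(all_combinations)

held_out = set(all_combinations[:3])  # unseen attribute compositions
train_set = [c for c in all_combinations if c not in held_out]

print(f"train on {len(train_set)} compositions, evaluate zero-shot on {len(held_out)}")
```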

Interactive Learning from Activity Description

khanhptnk/iliad 13 Feb 2021

We present a novel interactive learning protocol that enables training request-fulfilling agents by verbally describing their activities.
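
A conceptual sketch of such a protocol is given below: the agent attempts a request, a teacher verbally describes what the agent actually did, and the agent learns from the gap between the two. The class and method names are hypothetical placeholders, not the repository's API.

```python
# Conceptual sketch of learning from activity descriptions; all names here
# are hypothetical placeholders, not the authors' implementation.
class Teacher:
    def describe(self, trajectory):
        # In the real protocol a teacher verbally describes the agent's
        # activity; here we return a canned string.
        return f"You visited {len(trajectory)} locations."

class Agent:
    def execute(self, request):
        return ["start", "hallway", "kitchen"]  # stand-in trajectory

    def update(self, request, trajectory, description):
        pass  # learn from the mismatch between the request and the description

agent, teacher = Agent(), Teacher()
for request in ["go to the kitchen"]:
    trajectory = agent.execute(request)
    feedback = teacher.describe(trajectory)
    agent.update(request, trajectory, feedback)
```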

Semantic sentence similarity: size does not always matter

DannyMerkx/speech2image 16 Jun 2021

This study addresses the question of whether visually grounded speech recognition (VGS) models learn to capture sentence semantics without access to any prior linguistic knowledge.

SILG: The Multi-environment Symbolic Interactive Language Grounding Benchmark

vzhong/silg 20 Oct 2021

We hope SILG enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.
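
One way to read "multi-environment" here is that every environment exposes the same symbolic-observation-plus-text interface, so a single agent can be trained and evaluated across all of them. The Protocol below is an illustrative guess at such an interface, not SILG's actual API.

```python
# Illustrative shared interface for symbolic language-grounding environments;
# this is a sketch of the idea, not SILG's actual API.
from __future__ import annotations
from typing import Protocol

class GroundedEnv(Protocol):
    def reset(self) -> dict: ...
    def step(self, action: int) -> tuple[dict, float, bool, dict]: ...

def run_episode(env: GroundedEnv, policy) -> float:
    obs, total, done = env.reset(), 0.0, False
    while not done:
        # obs is assumed to bundle a symbolic grid and an instruction string
        action = policy(obs["symbols"], obs["text"])
        obs, reward, done, _ = env.step(action)
        total += reward
    return total
```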

Seeing the advantage: visually grounding word embeddings to better capture human semantic knowledge

DannyMerkx/speech2image CMCL (ACL) 2022

In this paper, we create visually grounded word embeddings by combining English text and images, and compare them to popular text-based methods to see whether visual information allows our model to better capture cognitive aspects of word meaning.
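
As a toy illustration of what "combining text and images" can mean at the representation level, the snippet below fuses a text-based word vector with pooled visual features by normalizing each modality and concatenating. The dimensions and fusion rule are stand-ins, not the paper's training procedure.

```python
# Toy fusion of textual and visual word representations; dimensions and the
# normalize-and-concatenate rule are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)
text_embedding = rng.standard_normal(300)    # stand-in distributional word vector
visual_embedding = rng.standard_normal(512)  # stand-in pooled image features of the word's referent

def normalize(v):
    return v / np.linalg.norm(v)

grounded = np.concatenate([normalize(text_embedding), normalize(visual_embedding)])
```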

Improving Systematic Generalization Through Modularity and Augmentation

modularcogsci2022/msa 22 Feb 2022

Even after training on an augmented dataset with almost forty times more adverbs than the original problem, a non-modular baseline is not able to systematically generalize to a novel combination of a known verb and adverb.