Grounded language learning
23 papers with code • 0 benchmarks • 1 dataset
Acquire the meaning of language in situated environments.
Benchmarks
These leaderboards are used to track progress in Grounded language learning.
Most implemented papers
Visual Entailment Task for Visually-Grounded Language Learning
We introduce a new inference task, Visual Entailment (VE), which differs from traditional Textual Entailment (TE) in that the premise is an image rather than a natural language sentence.
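As a rough illustration of how a VE instance differs from a TE one, the sketch below bundles an image premise (stood in for by a placeholder feature vector) with a sentence hypothesis and a three-way label. The function name, field names, and label set are illustrative assumptions, not the paper's API.

```python
# Hypothetical sketch of one Visual Entailment (VE) example: the premise is
# an image (represented here by a placeholder feature vector) instead of a
# sentence; the hypothesis is natural language, as in textual entailment.
LABELS = ("entailment", "neutral", "contradiction")

def make_ve_example(premise_image_features, hypothesis, label):
    """Bundle a single VE instance as a plain dict (illustrative only)."""
    if label not in LABELS:
        raise ValueError(f"label must be one of {LABELS}")
    return {
        "premise_image": list(premise_image_features),  # image stands in for the TE premise sentence
        "hypothesis": hypothesis,
        "label": label,
    }

example = make_ve_example([0.1, 0.4, 0.2],
                          "Two dogs are playing in the snow.",
                          "entailment")
```

In a TE instance the `premise_image` field would instead hold a premise sentence; everything else about the entailment decision is unchanged.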
Learning semantic sentence representations from visually grounded language without lexical knowledge
The system achieves state-of-the-art results on several of these benchmarks, which shows that a system trained solely on multimodal data, without assuming any word representations, is able to capture sentence-level semantics.
Language learning using Speech to Image retrieval
Humans learn language by interaction with their environment and listening to other humans.
Emergence of Numeric Concepts in Multi-Agent Autonomous Communication
Although their encoding method is not compositional in the way natural languages are from a human perspective, the emergent languages generalise to unseen inputs and, more importantly, are easier for models to learn.
Zero-Shot Compositional Policy Learning via Language Grounding
To facilitate research on language-guided agents with domain adaptation, we propose a novel zero-shot compositional policy learning task, where the environments are characterized as a composition of different attributes.
Interactive Learning from Activity Description
We present a novel interactive learning protocol that enables training request-fulfilling agents by verbally describing their activities.
Semantic sentence similarity: size does not always matter
This study addresses the question of whether visually grounded speech (VGS) models learn to capture sentence semantics without access to any prior linguistic knowledge.
SILG: The Multi-environment Symbolic Interactive Language Grounding Benchmark
We hope SILG enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.
Seeing the advantage: visually grounding word embeddings to better capture human semantic knowledge
In this paper, we create visually grounded word embeddings by combining English text and images, and compare them to popular text-based methods to see whether visual information allows our model to better capture cognitive aspects of word meaning.
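One common way to combine text and images into a grounded word embedding is to average the feature vectors of images associated with a word and concatenate the result with the word's text-based vector. The sketch below shows that combination step only; the function name and tiny vectors are illustrative assumptions, not the paper's actual pipeline.

```python
# Hypothetical sketch of visually grounding a word embedding: concatenate a
# text-based word vector with the component-wise mean of the feature vectors
# of images associated with that word.
def ground_embedding(text_vec, image_vecs):
    """Return text_vec concatenated with the mean of image_vecs (illustrative)."""
    n = len(image_vecs)
    # Component-wise average of the image feature vectors.
    visual = [sum(component) / n for component in zip(*image_vecs)]
    return list(text_vec) + visual

grounded = ground_embedding(
    [0.2, 0.5],                          # toy text-based vector for a word
    [[1.0, 0.0, 1.0], [0.0, 2.0, 1.0]],  # toy features of two images of that word
)
# grounded has len(text_vec) + len(image feature) dimensions
```

The resulting vector can then be compared against purely text-based embeddings on the same similarity benchmarks.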
Improving Systematic Generalization Through Modularity and Augmentation
After training on an augmented dataset with almost forty times more adverbs than the original problem, a non-modular baseline is not able to systematically generalize to a novel combination of a known verb and adverb.