Search Results for author: Alberto Testoni

Found 15 papers, 6 papers with code

A Small but Informed and Diverse Model: The Case of the Multimodal GuessWhat!? Guessing Game

no code implementations CLASP 2022 Claudio Greco, Alberto Testoni, Raffaella Bernardi, Stella Frank

Pre-trained Vision and Language Transformers achieve high performance on downstream tasks due to their ability to transfer representational knowledge accumulated during pretraining on substantial amounts of data.

They Are Not All Alike: Answering Different Spatial Questions Requires Different Grounding Strategies

no code implementations EMNLP (SpLU) 2020 Alberto Testoni, Claudio Greco, Tobias Bianchi, Mauricio Mazuecos, Agata Marcante, Luciana Benotti, Raffaella Bernardi

By analyzing LXMERT errors and its attention mechanisms, we find that our classification helps to gain a better understanding of the skills required to answer different spatial questions.

Visually Grounded Follow-up Questions: a Dataset of Spatial Questions Which Require Dialogue History

1 code implementation ACL (splurobonlp) 2021 Tianai Dong, Alberto Testoni, Luciana Benotti, Raffaella Bernardi

We call the question that restricts the context the trigger, and the spatial question that requires the trigger question to be answered the zoomer.

Naming, Describing, and Quantifying Visual Objects in Humans and LLMs

1 code implementation 11 Mar 2024 Alberto Testoni, Juell Sprott, Sandro Pezzelle

While human speakers use a variety of different expressions when describing the same object in an image, giving rise to a distribution of plausible labels driven by pragmatic constraints, the extent to which current Vision & Language Large Language Models (VLLMs) can mimic this crucial feature of language use is an open question.

Asking the Right Question at the Right Time: Human and Model Uncertainty Guidance to Ask Clarification Questions

no code implementations 9 Feb 2024 Alberto Testoni, Raquel Fernández

Clarification questions are an essential dialogue tool to signal misunderstanding, ambiguities, and under-specification in language use.

Looking for Confirmations: An Effective and Human-Like Visual Dialogue Strategy

1 code implementation EMNLP 2021 Alberto Testoni, Raffaella Bernardi

Inspired by the cognitive literature on information search and cross-situational word learning, we design Confirm-it, a model based on a beam search re-ranking algorithm that guides an effective goal-oriented strategy by asking questions that confirm the model's conjecture about the referent.


"I've Seen Things You People Wouldn't Believe": Hallucinating Entities in GuessWhat?!

no code implementations ACL 2021 Alberto Testoni, Raffaella Bernardi

We also analyse where hallucinations tend to occur more often through the dialogue: hallucinations are less frequent in earlier turns, cause a cascade hallucination effect, and are often preceded by negative answers, which have been shown to be harder to ground.


Overprotective Training Environments Fall Short at Testing Time: Let Models Contribute to Their Own Training

no code implementations 20 Mar 2021 Alberto Testoni, Raffaella Bernardi

Despite important progress, conversational systems often generate dialogues that sound unnatural to humans.

The Interplay of Task Success and Dialogue Quality: An in-depth Evaluation in Task-Oriented Visual Dialogues

1 code implementation EACL 2021 Alberto Testoni, Raffaella Bernardi

When training a model on referential dialogue guessing games, the best model is usually chosen based on its task success.

On the role of effective and referring questions in GuessWhat?!

no code implementations WS 2020 Mauricio Mazuecos, Alberto Testoni, Raffaella Bernardi, Luciana Benotti

Regarding our first metric, we find that successful dialogues do not have a higher percentage of effective questions for most models.

Quantifiers in a Multimodal World: Hallucinating Vision with Language and Sound

no code implementations WS 2019 Alberto Testoni, Sandro Pezzelle, Raffaella Bernardi

Inspired by the literature on multisensory integration, we develop a computational model to ground quantifiers in perception.
