Search Results for author: Alessandro Suglia

Found 19 papers, 8 papers with code

Dialogue Act and Slot Recognition in Italian Complex Dialogues

no code implementations • EURALI (LREC) 2022 • Irene Sucameli, Michele De Quattro, Arash Eshghi, Alessandro Suglia, Maria Simi

Since the advent of Transformer-based, pretrained language models (LMs) such as BERT, Natural Language Understanding (NLU) components in the form of Dialogue Act Recognition (DAR) and Slot Recognition (SR) for dialogue systems have become both more accurate and easier to create for specific application domains.

Natural Language Understanding

ACT-Thor: A Controlled Benchmark for Embodied Action Understanding in Simulated Environments

1 code implementation • COLING 2022 • Michael Hanna, Federico Pedeni, Alessandro Suglia, Alberto Testoni, Raffaella Bernardi

This paves the way for a systematic way of evaluating embodied AI agents that understand grounded actions.

Action Understanding

Combine to Describe: Evaluating Compositional Generalization in Image Captioning

no code implementations • ACL 2022 • George Pantazopoulos, Alessandro Suglia, Arash Eshghi

Compositionality – the ability to combine simpler concepts to understand & generate arbitrarily more complex conceptual structures – has long been thought to be the cornerstone of human language capacity.

Image Captioning

Demonstrating EMMA: Embodied MultiModal Agent for Language-guided Action Execution in 3D Simulated Environments

no code implementations • SIGDIAL (ACL) 2022 • Alessandro Suglia, Bhathiya Hemanthage, Malvina Nikandrou, George Pantazopoulos, Amit Parekh, Arash Eshghi, Claudio Greco, Ioannis Konstas, Oliver Lemon, Verena Rieser

We demonstrate EMMA, an embodied multimodal agent which has been developed for the Alexa Prize SimBot challenge.

Conditional Text Generation

Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks

1 code implementation • 7 May 2024 • Georgios Pantazopoulos, Amit Parekh, Malvina Nikandrou, Alessandro Suglia

Augmenting Large Language Models (LLMs) with image-understanding capabilities has resulted in a boom of high-performing Vision-Language models (VLMs).

Lost in Space: Probing Fine-grained Spatial Understanding in Vision and Language Resamplers

1 code implementation • 21 Apr 2024 • Georgios Pantazopoulos, Alessandro Suglia, Oliver Lemon, Arash Eshghi

In this paper, we use diagnostic classifiers to measure the extent to which the visual prompt produced by the resampler encodes spatial information.

Image Captioning, Question Answering +1

PIXAR: Auto-Regressive Language Modeling in Pixel Space

no code implementations • 6 Jan 2024 • Yintao Tai, Xiyang Liao, Alessandro Suglia, Antonio Vergari

However, these pixel-based LLMs are limited to discriminative tasks (e.g., classification) and, similar to BERT, cannot be used to generate text.

Decoder, LAMBADA +4

Is Feedback All You Need? Leveraging Natural Language Feedback in Goal-Conditioned Reinforcement Learning

1 code implementation • 7 Dec 2023 • Sabrina McCallum, Max Taylor-Davies, Stefano V. Albrecht, Alessandro Suglia

Despite numerous successes, the field of reinforcement learning (RL) remains far from matching the impressive generalisation power of human behaviour learning.

Reinforcement Learning (RL)

Visually Grounded Language Learning: a review of language games, datasets, tasks, and models

no code implementations • 5 Dec 2023 • Alessandro Suglia, Ioannis Konstas, Oliver Lemon

Our analysis of the literature provides evidence that future work should be focusing on interactive games where communication in Natural Language is important to resolve ambiguities about object referents and action plans and that physical embodiment is essential to understand the semantics of situations and events.

Grounded language learning, Language Modelling +1

Multitask Multimodal Prompted Training for Interactive Embodied Task Completion

no code implementations • 7 Nov 2023 • Georgios Pantazopoulos, Malvina Nikandrou, Amit Parekh, Bhathiya Hemanthage, Arash Eshghi, Ioannis Konstas, Verena Rieser, Oliver Lemon, Alessandro Suglia

Interactive and embodied tasks pose at least two fundamental challenges to existing Vision & Language (VL) models, including 1) grounding language in trajectories of actions and observations, and 2) referential disambiguation.

Decoder, Text Generation

'What are you referring to?' Evaluating the Ability of Multi-Modal Dialogue Models to Process Clarificational Exchanges

1 code implementation • 28 Jul 2023 • Javier Chiyah-Garcia, Alessandro Suglia, Arash Eshghi, Helen Hastie

Referential ambiguities arise in dialogue when a referring expression does not uniquely identify the intended referent for the addressee.

Referring Expression

Going for GOAL: A Resource for Grounded Football Commentaries

1 code implementation • 8 Nov 2022 • Alessandro Suglia, José Lopes, Emanuele Bastianelli, Andrea Vanzo, Shubham Agarwal, Malvina Nikandrou, Lu Yu, Ioannis Konstas, Verena Rieser

As the course of a game is unpredictable, so are commentaries, which makes them a unique resource to investigate dynamic language grounding.

Moment Retrieval, Retrieval

Task Formulation Matters When Learning Continually: A Case Study in Visual Question Answering

no code implementations • 30 Sep 2022 • Malvina Nikandrou, Lu Yu, Alessandro Suglia, Ioannis Konstas, Verena Rieser

We first propose three plausible task formulations and demonstrate their impact on the performance of continual learning algorithms.

Continual Learning, Question Answering +1

Exploring Multi-Modal Representations for Ambiguity Detection & Coreference Resolution in the SIMMC 2.0 Challenge

2 code implementations • 25 Feb 2022 • Javier Chiyah-Garcia, Alessandro Suglia, José Lopes, Arash Eshghi, Helen Hastie

Anaphoric expressions, such as pronouns and referential descriptions, are situated with respect to the linguistic context of prior turns as well as the immediate visual environment.

coreference-resolution

Embodied BERT: A Transformer Model for Embodied, Language-guided Visual Task Completion

1 code implementation • 10 Aug 2021 • Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme

Language-guided robots performing home and office tasks must navigate in and interact with the world.

Navigate, Object

An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games

no code implementations • EACL 2021 • Alessandro Suglia, Yonatan Bisk, Ioannis Konstas, Antonio Vergari, Emanuele Bastianelli, Andrea Vanzo, Oliver Lemon

Guessing games are a prototypical instance of the "learning by interacting" paradigm.

Question Answering, Visual Question Answering

Imagining Grounded Conceptual Representations from Perceptual Information in Situated Guessing Games

no code implementations • COLING 2020 • Alessandro Suglia, Antonio Vergari, Ioannis Konstas, Yonatan Bisk, Emanuele Bastianelli, Andrea Vanzo, Oliver Lemon

However, as shown by Suglia et al. (2020), existing models fail to learn truly multi-modal representations, relying instead on gold category labels for objects in the scene both at training and inference time.

Object

CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning

no code implementations • ACL 2020 • Alessandro Suglia, Ioannis Konstas, Andrea Vanzo, Emanuele Bastianelli, Desmond Elliott, Stella Frank, Oliver Lemon

To remedy this, we present GROLLA, an evaluation framework for Grounded Language Learning with Attributes with three sub-tasks: 1) Goal-oriented evaluation; 2) Object attribute prediction evaluation; and 3) Zero-shot evaluation.

Attribute, Grounded language learning

Iterative Multi-document Neural Attention for Multiple Answer Prediction

no code implementations • 8 Feb 2017 • Claudio Greco, Alessandro Suglia, Pierpaolo Basile, Gaetano Rossiello, Giovanni Semeraro

People have information needs of varying complexity, which can be solved by an intelligent agent able to answer questions formulated in a proper way, eventually considering user context and preferences.

Question Answering, Recommendation Systems
