no code implementations • SIGDIAL (ACL) 2022 • Alessandro Suglia, Bhathiya Hemanthage, Malvina Nikandrou, George Pantazopoulos, Amit Parekh, Arash Eshghi, Claudio Greco, Ioannis Konstas, Oliver Lemon, Verena Rieser
We demonstrate EMMA, an embodied multimodal agent developed for the Alexa Prize SimBot challenge.
no code implementations • 7 Nov 2023 • Georgios Pantazopoulos, Malvina Nikandrou, Amit Parekh, Bhathiya Hemanthage, Arash Eshghi, Ioannis Konstas, Verena Rieser, Oliver Lemon, Alessandro Suglia
Interactive and embodied tasks pose at least two fundamental challenges to existing Vision & Language (VL) models: 1) grounding language in trajectories of actions and observations, and 2) referential disambiguation.
no code implementations • 10 Jul 2023 • Bhathiya Hemanthage, Christian Dondrup, Phil Bartie, Oliver Lemon
SimpleMTOD is a simple language model which recasts several sub-tasks in multimodal task-oriented dialogues as sequence prediction tasks.