no code implementations • ICLR 2019 • Harm de Vries, Kurt Shuster, Dhruv Batra, Devi Parikh, Jason Weston, Douwe Kiela
We introduce "Talk The Walk", the first large-scale dialogue dataset grounded in action and perception.
no code implementations • 7 Jun 2023 • Jing Xu, Da Ju, Joshua Lane, Mojtaba Komeili, Eric Michael Smith, Megan Ung, Morteza Behrooz, William Ngan, Rashel Moritz, Sainbayar Sukhbaatar, Y-Lan Boureau, Jason Weston, Kurt Shuster
We present BlenderBot 3x, an update on the conversational model BlenderBot 3, which is now trained using organic conversation and feedback data from participating users of the system in order to improve both its skills and safety.
no code implementations • 7 Jun 2023 • Morteza Behrooz, William Ngan, Joshua Lane, Giuliano Morse, Benjamin Babcock, Kurt Shuster, Mojtaba Komeili, Moya Chen, Melanie Kambadur, Y-Lan Boureau, Jason Weston
Publicly deploying research chatbots is a nuanced topic involving necessary risk-benefit analyses.
no code implementations • 26 Apr 2023 • Jimmy Wei, Kurt Shuster, Arthur Szlam, Jason Weston, Jack Urbanek, Mojtaba Komeili
We compare models trained on our new dataset to existing pairwise-trained dialogue models, as well as large language models with few-shot prompting.
1 code implementation • 22 Dec 2022 • Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov
To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalizations: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks.
Ranked #26 on Natural Language Inference on RTE
no code implementations • 21 Dec 2022 • Chris Lengerich, Gabriel Synnaeve, Amy Zhang, Hugh Leather, Kurt Shuster, François Charton, Charysse Redwood
Traditional approaches to RL have focused on learning decision policies directly from episodic decisions, while slowly and implicitly learning the semantics of compositional representations needed for generalization.
no code implementations • 10 Nov 2022 • Leonard Adolphs, Tianyu Gao, Jing Xu, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston
Standard language model training employs gold human documents or human-human interaction data, and treats all training data as positive examples.
no code implementations • 28 Oct 2022 • Weiyan Shi, Emily Dinan, Kurt Shuster, Jason Weston, Jing Xu
Deployed dialogue agents have the potential to integrate human feedback to continuously improve themselves.
2 code implementations • 5 Aug 2022 • Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston
We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, trained on a large number of user-defined tasks.
1 code implementation • 15 Jun 2022 • Kushal Arora, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston
Current language models achieve low perplexity, but their generations still suffer from toxic responses, repetitiveness, and contradictions.
7 code implementations • 2 May 2022 • Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer
Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning.
Ranked #2 on Stereotypical Bias Analysis on CrowS-Pairs
1 code implementation • 24 Mar 2022 • Kurt Shuster, Mojtaba Komeili, Leonard Adolphs, Stephen Roller, Arthur Szlam, Jason Weston
We show that, when using SeeKeR as a dialogue model, it outperforms the state-of-the-art model BlenderBot 2 (Chen et al., 2021) on open-domain knowledge-grounded conversations for the same number of parameters, in terms of consistency, knowledge and per-turn engagingness.
no code implementations • Findings (NAACL) 2022 • Kurt Shuster, Jack Urbanek, Arthur Szlam, Jason Weston
State-of-the-art dialogue models still often struggle with factual accuracy and self-contradiction.
no code implementations • 9 Nov 2021 • Leonard Adolphs, Kurt Shuster, Jack Urbanek, Arthur Szlam, Jason Weston
Large language models can produce fluent dialogue but often hallucinate factual inaccuracies.
no code implementations • ACL 2022 • Mojtaba Komeili, Kurt Shuster, Jason Weston
The largest store of continually updating knowledge on our planet can be accessed via internet search.
no code implementations • Findings (EMNLP) 2021 • Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston
Despite showing increasingly human-like conversational abilities, state-of-the-art dialogue models often suffer from factual incorrectness and hallucination of knowledge (Roller et al., 2020).
no code implementations • EMNLP 2021 • Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
Recent work in open-domain conversational agents has demonstrated that significant improvements in model engagingness and humanness metrics can be achieved via massive scaling in both pre-training data and model size (Adiwardana et al., 2020; Roller et al., 2020).
Ranked #1 on Visual Dialog on Wizard of Wikipedia
no code implementations • 18 Aug 2020 • Kurt Shuster, Jack Urbanek, Emily Dinan, Arthur Szlam, Jason Weston
As argued in de Vries et al. (2020), crowdsourced data has the issues of lack of naturalness and relevance to real-world use cases, while the static dataset paradigm does not allow for a model to learn from its experiences of using language (Silver et al., 2013).
no code implementations • ACL 2020 • Kurt Shuster, Samuel Humeau, Antoine Bordes, Jason Weston
To test such models, we collect a dataset of grounded human-human conversations, where speakers are asked to play roles given a provided emotional mood or style, as the use of such traits is also a key factor in engagingness (Guo et al., 2019).
no code implementations • 22 Jun 2020 • Stephen Roller, Y-Lan Boureau, Jason Weston, Antoine Bordes, Emily Dinan, Angela Fan, David Gunning, Da Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Kurt Shuster, Eric Michael Smith, Arthur Szlam, Jack Urbanek, Mary Williamson
We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet.
7 code implementations • EACL 2021 • Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston
Building open-domain chatbots is a challenging area for machine learning research.
2 code implementations • ACL 2020 • Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-Lan Boureau
Being engaging, knowledgeable, and empathetic are all desirable general qualities in a conversational agent.
2 code implementations • ICLR 2020 • Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, Jason Weston
The use of deep pre-trained transformers has led to remarkable progress in a number of applications (Devlin et al., 2018).
no code implementations • 28 Dec 2019 • Da Ju, Kurt Shuster, Y-Lan Boureau, Jason Weston
As single-task accuracy on individual language and image tasks has improved substantially in the last few years, the long-term goal of a generally skilled agent that can both see and talk becomes more feasible to explore.
no code implementations • ACL 2020 • Kurt Shuster, Da Ju, Stephen Roller, Emily Dinan, Y-Lan Boureau, Jason Weston
We introduce dodecaDialogue: a set of 12 tasks that measures if a conversational agent can communicate engagingly with personality and empathy, ask questions, answer questions by utilizing knowledge resources, discuss topics and situations, and perceive and converse about images.
7 code implementations • 22 Apr 2019 • Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, Jason Weston
The use of deep pre-trained bidirectional transformers has led to remarkable progress in a number of applications (Devlin et al., 2018).
2 code implementations • 31 Jan 2019 • Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W. Black, Alexander Rudnicky, Jason Williams, Joelle Pineau, Mikhail Burtsev, Jason Weston
We describe the setting and results of the ConvAI2 NeurIPS competition that aims to further the state-of-the-art in open-domain chatbots.
2 code implementations • ICLR 2019 • Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, Jason Weston
In open-domain dialogue, intelligent agents should exhibit the use of knowledge; however, there are few convincing demonstrations of this to date.
3 code implementations • 2 Nov 2018 • Kurt Shuster, Samuel Humeau, Antoine Bordes, Jason Weston
To test such models, we collect a dataset of grounded human-human conversations, where speakers are asked to play roles given a provided emotional mood or style, as the use of such traits is also a key factor in engagingness (Guo et al., 2019).
Ranked #2 on Text Retrieval on Image-Chat
no code implementations • CVPR 2019 • Kurt Shuster, Samuel Humeau, Hexiang Hu, Antoine Bordes, Jason Weston
While such tasks are useful to verify that a machine understands the content of an image, they are not engaging to humans as captions.
1 code implementation • 9 Jul 2018 • Harm de Vries, Kurt Shuster, Dhruv Batra, Devi Parikh, Jason Weston, Douwe Kiela
We introduce "Talk The Walk", the first large-scale dialogue dataset grounded in action and perception.