no code implementations • ICLR 2019 • Harm de Vries, Kurt Shuster, Dhruv Batra, Devi Parikh, Jason Weston, Douwe Kiela
We introduce "Talk The Walk", the first large-scale dialogue dataset grounded in action and perception.
no code implementations • 7 Jun 2023 • Jing Xu, Da Ju, Joshua Lane, Mojtaba Komeili, Eric Michael Smith, Megan Ung, Morteza Behrooz, William Ngan, Rashel Moritz, Sainbayar Sukhbaatar, Y-Lan Boureau, Jason Weston, Kurt Shuster
We present BlenderBot 3x, an update on the conversational model BlenderBot 3, which is now trained using organic conversation and feedback data from participating users of the system in order to improve both its skills and safety.
no code implementations • 7 Jun 2023 • Morteza Behrooz, William Ngan, Joshua Lane, Giuliano Morse, Benjamin Babcock, Kurt Shuster, Mojtaba Komeili, Moya Chen, Melanie Kambadur, Y-Lan Boureau, Jason Weston
Publicly deploying research chatbots is a nuanced topic involving necessary risk-benefit analyses.
no code implementations • 26 Apr 2023 • Jimmy Wei, Kurt Shuster, Arthur Szlam, Jason Weston, Jack Urbanek, Mojtaba Komeili
We compare models trained on our new dataset to existing pairwise-trained dialogue models, as well as large language models with few-shot prompting.
1 code implementation • 22 Dec 2022 • Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov
To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalizations: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks.
Ranked #26 on Natural Language Inference on RTE
no code implementations • 21 Dec 2022 • Chris Lengerich, Gabriel Synnaeve, Amy Zhang, Hugh Leather, Kurt Shuster, François Charton, Charysse Redwood
Traditional approaches to RL have focused on learning decision policies directly from episodic decisions, while slowly and implicitly learning the semantics of compositional representations needed for generalization.
no code implementations • 10 Nov 2022 • Leonard Adolphs, Tianyu Gao, Jing Xu, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston
Standard language model training employs gold human documents or human-human interaction data, and treats all training data as positive examples.
no code implementations • 28 Oct 2022 • Weiyan Shi, Emily Dinan, Kurt Shuster, Jason Weston, Jing Xu
Deployed dialogue agents have the potential to integrate human feedback to continuously improve themselves.
2 code implementations • 5 Aug 2022 • Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston
We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, trained on a large number of user-defined tasks.
1 code implementation • 15 Jun 2022 • Kushal Arora, Kurt Shuster, Sainbayar Sukhbaatar, Jason Weston
Current language models achieve low perplexity, but their generations still suffer from toxic responses, repetitiveness, and contradictions.
7 code implementations • 2 May 2022 • Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer
Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning.
Ranked #2 on Stereotypical Bias Analysis on CrowS-Pairs
1 code implementation • 24 Mar 2022 • Kurt Shuster, Mojtaba Komeili, Leonard Adolphs, Stephen Roller, Arthur Szlam, Jason Weston
We show that, when using SeeKeR as a dialogue model, it outperforms the state-of-the-art model BlenderBot 2 (Chen et al., 2021) on open-domain knowledge-grounded conversations for the same number of parameters, in terms of consistency, knowledge and per-turn engagingness.
no code implementations • Findings (NAACL) 2022 • Kurt Shuster, Jack Urbanek, Arthur Szlam, Jason Weston
State-of-the-art dialogue models still often struggle with factual accuracy and self-contradiction.
no code implementations • 9 Nov 2021 • Leonard Adolphs, Kurt Shuster, Jack Urbanek, Arthur Szlam, Jason Weston
Large language models can produce fluent dialogue but often hallucinate factual inaccuracies.
no code implementations • ACL 2022 • Mojtaba Komeili, Kurt Shuster, Jason Weston
The largest store of continually updating knowledge on our planet can be accessed via internet search.
no code implementations • Findings (EMNLP) 2021 • Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston
Despite showing increasingly human-like conversational abilities, state-of-the-art dialogue models often suffer from factual incorrectness and hallucination of knowledge (Roller et al., 2020).
no code implementations • EMNLP 2021 • Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
Recent work in open-domain conversational agents has demonstrated that significant improvements in model engagingness and humanness metrics can be achieved via massive scaling in both pre-training data and model size (Adiwardana et al., 2020; Roller et al., 2020).
Ranked #1 on Visual Dialog on Wizard of Wikipedia
no code implementations • 18 Aug 2020 • Kurt Shuster, Jack Urbanek, Emily Dinan, Arthur Szlam, Jason Weston
As argued in de Vries et al. (2020), crowdsourced data has the issues of lack of naturalness and relevance to real-world use cases, while the static dataset paradigm does not allow for a model to learn from its experiences of using language (Silver et al., 2013).
no code implementations • ACL 2020 • Kurt Shuster, Samuel Humeau, Antoine Bordes, Jason Weston
To test such models, we collect a dataset of grounded human-human conversations, where speakers are asked to play roles given a provided emotional mood or style, as the use of such traits is also a key factor in engagingness (Guo et al., 2019).
no code implementations • 22 Jun 2020 • Stephen Roller, Y-Lan Boureau, Jason Weston, Antoine Bordes, Emily Dinan, Angela Fan, David Gunning, Da Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Kurt Shuster, Eric Michael Smith, Arthur Szlam, Jack Urbanek, Mary Williamson
We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet.
7 code implementations • EACL 2021 • Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston
Building open-domain chatbots is a challenging area for machine learning research.
2 code implementations • ACL 2020 • Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-Lan Boureau
Being engaging, knowledgeable, and empathetic are all desirable general qualities in a conversational agent.
2 code implementations • ICLR 2020 • Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, Jason Weston
The use of deep pre-trained transformers has led to remarkable progress in a number of applications (Devlin et al., 2018).
no code implementations • 28 Dec 2019 • Da Ju, Kurt Shuster, Y-Lan Boureau, Jason Weston
As single-task accuracy on individual language and image tasks has improved substantially in the last few years, the long-term goal of a generally skilled agent that can both see and talk becomes more feasible to explore.
no code implementations • ACL 2020 • Kurt Shuster, Da Ju, Stephen Roller, Emily Dinan, Y-Lan Boureau, Jason Weston
We introduce dodecaDialogue: a set of 12 tasks that measures if a conversational agent can communicate engagingly with personality and empathy, ask questions, answer questions by utilizing knowledge resources, discuss topics and situations, and perceive and converse about images.
7 code implementations • 22 Apr 2019 • Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, Jason Weston
The use of deep pre-trained bidirectional transformers has led to remarkable progress in a number of applications (Devlin et al., 2018).
2 code implementations • 31 Jan 2019 • Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W. Black, Alexander Rudnicky, Jason Williams, Joelle Pineau, Mikhail Burtsev, Jason Weston
We describe the setting and results of the ConvAI2 NeurIPS competition that aims to further the state-of-the-art in open-domain chatbots.
2 code implementations • ICLR 2019 • Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, Jason Weston
In open-domain dialogue, intelligent agents should exhibit the use of knowledge; however, there are few convincing demonstrations of this to date.
3 code implementations • 2 Nov 2018 • Kurt Shuster, Samuel Humeau, Antoine Bordes, Jason Weston
To test such models, we collect a dataset of grounded human-human conversations, where speakers are asked to play roles given a provided emotional mood or style, as the use of such traits is also a key factor in engagingness (Guo et al., 2019).
Ranked #2 on Text Retrieval on Image-Chat
no code implementations • CVPR 2019 • Kurt Shuster, Samuel Humeau, Hexiang Hu, Antoine Bordes, Jason Weston
While such tasks are useful to verify that a machine understands the content of an image, they are not engaging to humans as captions.
1 code implementation • 9 Jul 2018 • Harm de Vries, Kurt Shuster, Dhruv Batra, Devi Parikh, Jason Weston, Douwe Kiela
We introduce "Talk The Walk", the first large-scale dialogue dataset grounded in action and perception.