1 code implementation • 30 Nov 2023 • Marwa Abdulhai, Isadora White, Charlie Snell, Charles Sun, Joey Hong, Yuexiang Zhai, Kelvin Xu, Sergey Levine
Developing such algorithms requires tasks that can gauge progress on algorithm design, provide accessible and reproducible evaluations for multi-turn interactions, and cover a range of task properties and challenges relevant to improving reinforcement learning algorithms.
1 code implementation • 25 May 2023 • Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao Liu, Pieter Abbeel, Sergey Levine, Dawn Song
This approach aims to cheaply imitate the proprietary model's capabilities using a weaker open-source model.
no code implementations • 30 Sep 2022 • Charlie Snell, Dan Klein, Ruiqi Zhong
We show that context distillation is a general method to train language models, and it can effectively internalize three types of training signals.
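As a rough illustration of the idea (not the paper's actual setup), context distillation trains a student model, which never sees the extra context, to match the output distribution of a teacher that does. The toy linear "models", feature shapes, and hyperparameters below are all placeholder assumptions:

```python
import torch
import torch.nn.functional as F

VOCAB = 16

# Toy stand-ins for language models: the "teacher" sees the context
# concatenated with the input; the "student" sees the input alone.
teacher = torch.nn.Linear(8, VOCAB)
student = torch.nn.Linear(4, VOCAB)

opt = torch.optim.Adam(student.parameters(), lr=1e-2)

for step in range(200):
    ctx_and_input = torch.randn(32, 8)   # [context | input] features
    input_only = ctx_and_input[:, 4:]    # same input, context stripped

    with torch.no_grad():                # teacher is frozen
        target = F.softmax(teacher(ctx_and_input), dim=-1)

    log_probs = F.log_softmax(student(input_only), dim=-1)
    # Distill: push the student (no context) toward the teacher (with context).
    loss = F.kl_div(log_probs, target, reduction="batchmean")

    opt.zero_grad()
    loss.backward()
    opt.step()
```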
1 code implementation • 5 Jun 2022 • Charlie Snell, Ilya Kostrikov, Yi Su, Mengjiao Yang, Sergey Levine
Large language models distill broad knowledge from text corpora.
1 code implementation • 25 May 2022 • Ruiqi Zhong, Charlie Snell, Dan Klein, Jason Eisner
We introduce APEL, a framework in which non-programmers select among candidate programs generated by a seed semantic parser (e.g., Codex).
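To illustrate the selection step loosely (the function name and toy candidates below are hypothetical, not APEL's actual interface): given candidate programs and an example input on which they disagree, the non-programmer's choice of the correct output narrows the candidate set.

```python
from typing import Any, Callable, List

def narrow_candidates(
    candidates: List[Callable[[Any], Any]],
    example_input: Any,
    chosen_output: Any,
) -> List[Callable[[Any], Any]]:
    # Keep only the programs whose output on the example input matches
    # the output the non-programmer selected.
    return [p for p in candidates if p(example_input) == chosen_output]

# Hypothetical candidates a parser might propose for "double the number":
candidates = [lambda x: x * 2, lambda x: x + 2, lambda x: x ** 2]

# On input 3 the candidates disagree (6, 5, 9); the user picks 6.
survivors = narrow_candidates(candidates, example_input=3, chosen_output=6)
assert len(survivors) == 1 and survivors[0](10) == 20
```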
no code implementations • Findings (NAACL) 2022 • Charlie Snell, Mengjiao Yang, Justin Fu, Yi Su, Sergey Levine
Goal-oriented dialogue systems face a trade-off between fluent language generation and task-specific control.
1 code implementation • 28 Jan 2022 • Ruiqi Zhong, Charlie Snell, Dan Klein, Jacob Steinhardt
We then re-rank the descriptions by checking how often they hold on a larger set of samples with a learned verifier.
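A minimal sketch of that re-ranking step, assuming a `verifier(description, sample) -> bool` stand-in for the learned verifier (the names and the trivial keyword verifier are illustrative, not the paper's code):

```python
from typing import Callable, List, Tuple

def rerank(
    descriptions: List[str],
    samples: List[str],
    verifier: Callable[[str, str], bool],
) -> List[Tuple[str, float]]:
    # Score each candidate description by the fraction of samples on
    # which the verifier judges it to hold, then sort best-first.
    scored = [
        (d, sum(verifier(d, s) for s in samples) / len(samples))
        for d in descriptions
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Trivial placeholder verifier: the description "holds" if its last
# keyword appears in the sample (a learned model would go here).
toy_verifier = lambda desc, sample: desc.split()[-1] in sample
ranked = rerank(["mentions cats", "mentions dogs"],
                ["cats purr", "cats nap", "dogs bark"],
                toy_verifier)
```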
1 code implementation • 13 Mar 2021 • Charlie Snell, Ruiqi Zhong, Dan Klein, Jacob Steinhardt
Our approximation explains why models sometimes attend to salient words, and it inspires a toy example in which a multi-head attention model can overcome such a hard training distribution by improving learning dynamics rather than expressiveness.
no code implementations • 16 Aug 2020 • Charlie Snell, Ruiqi Zhong, Jacob Steinhardt, Dan Klein
If we ablate attention by fixing it to uniform, the output relevance still correlates with the attention of a normally trained model; but if we instead ablate output relevance, attention cannot be learned.
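The uniform-attention ablation can be sketched as replacing the learned attention weights with a constant 1/T over the T key positions; the tensor shapes below are arbitrary placeholders, not the paper's configuration:

```python
import torch

def uniform_attention(q, k, v):
    # Ablation: discard query-key similarity entirely and fix the
    # attention weights to a uniform 1/T over the T key positions.
    T = k.shape[-2]
    weights = torch.full(q.shape[:-1] + (T,), 1.0 / T)
    return weights @ v  # uniform average of the value vectors

q = torch.randn(2, 5, 8)   # (batch, query positions, model dim)
k = torch.randn(2, 7, 8)   # (batch, key positions, model dim)
v = torch.randn(2, 7, 8)
out = uniform_attention(q, k, v)   # shape (2, 5, 8)
```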