no code implementations • 9 Apr 2024 • Leshem Choshen, Ryan Cotterell, Michael Y. Hu, Tal Linzen, Aaron Mueller, Candace Ross, Alex Warstadt, Ethan Wilcox, Adina Williams, Chengxu Zhuang
The big changes for this year's competition are as follows: First, we replace the loose track with a paper track, which allows (for example) non-model-based submissions, novel cognitively-inspired benchmarks, or analysis techniques.
1 code implementation • 28 Mar 2024 • Samuel Marks, Can Rager, Eric J. Michaud, Yonatan Belinkov, David Bau, Aaron Mueller
We introduce methods for discovering and applying sparse feature circuits.
1 code implementation • 13 Nov 2023 • Aaron Mueller, Albert Webson, Jackson Petty, Tal Linzen
In-context learning (ICL) is now a common method for teaching large language models (LLMs) new tasks: given labeled examples in the input context, the LLM learns to perform the task without weight updates.
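As an illustration of the ICL setup described here, a minimal sketch (the model, task, and labels are placeholders for illustration, not the paper's actual experimental setup):

```python
# Minimal sketch of in-context learning: the task is "demonstrated"
# entirely in the prompt; no gradient updates are performed.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # placeholder model

prompt = (
    "Review: The plot was predictable.\nSentiment: negative\n"
    "Review: A stunning, heartfelt film.\nSentiment: positive\n"
    "Review: I would watch it again.\nSentiment:"
)
# The model must infer the labeling rule from the two demonstrations above.
output = generator(prompt, max_new_tokens=2)
print(output[0]["generated_text"])
```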
no code implementations • 23 Oct 2023 • Eric Todd, Millicent L. Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau
Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number of attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV).
no code implementations • 30 Jun 2023 • Aaron Mueller, Kanika Narang, Lambert Mathias, Qifan Wang, Hamed Firooz
Meta-training allows one to leverage smaller models for few-shot generalization in a domain-general and task-agnostic manner; however, these methods alone result in models that may not have sufficient parameterization or knowledge to adapt quickly to a large variety of tasks.
no code implementations • 15 Jun 2023 • Ian R. McKenzie, Alexander Lyzhov, Michael Pieler, Alicia Parrish, Aaron Mueller, Ameya Prabhu, Euan McLean, Aaron Kirtland, Alexis Ross, Alisa Liu, Andrew Gritsevskiy, Daniel Wurgaft, Derik Kauffman, Gabriel Recchia, Jiacheng Liu, Joe Cavanagh, Max Weiss, Sicong Huang, The Floating Droid, Tom Tseng, Tomasz Korbak, Xudong Shen, Yuhui Zhang, Zhengping Zhou, Najoung Kim, Samuel R. Bowman, Ethan Perez
Here, we present evidence for the claim that LMs may show inverse scaling, or worse task performance with increased scale, e.g., due to flaws in the training objective and data.
1 code implementation • 31 May 2023 • Aaron Mueller, Tal Linzen
Accurate syntactic representations are essential for robust generalization in natural language.
1 code implementation • 27 Jan 2023 • Alex Warstadt, Leshem Choshen, Aaron Mueller, Adina Williams, Ethan Wilcox, Chengxu Zhuang
In partnership with CoNLL and CMCL, we provide a platform for approaches to pretraining with a limited-size corpus sourced from data inspired by the input to children.
no code implementations • 18 Dec 2022 • Koustuv Sinha, Jon Gauthier, Aaron Mueller, Kanishka Misra, Keren Fuentes, Roger Levy, Adina Williams
In this paper, we investigate the stability of language models' performance on targeted syntactic evaluations as we vary properties of the input context: the length of the context, the types of syntactic phenomena it contains, and whether or not there are violations of grammaticality.
1 code implementation • 25 Oct 2022 • Aaron Mueller, Yu Xia, Tal Linzen
However, much of this analysis has focused on monolingual models, and analyses of multilingual models have employed correlational methods that are confounded by the choice of probing tasks.
no code implementations • 26 Aug 2022 • Julian Michael, Ari Holtzman, Alicia Parrish, Aaron Mueller, Alex Wang, Angelica Chen, Divyam Madaan, Nikita Nangia, Richard Yuanzhe Pang, Jason Phang, Samuel R. Bowman
We present the results of the NLP Community Metasurvey.
1 code implementation • ACL 2022 • Aaron Mueller, Jason Krone, Salvatore Romeo, Saab Mansour, Elman Mansimov, Yi Zhang, Dan Roth
Label semantic aware systems have leveraged this information for improved text classification performance during fine-tuning and prediction.
1 code implementation • Findings (ACL) 2022 • Aaron Mueller, Robert Frank, Tal Linzen, Luheng Wang, Sebastian Schuster
We find that pre-trained seq2seq models generalize hierarchically when performing syntactic transformations, whereas models trained from scratch on syntactic transformations do not.
1 code implementation • ACL 2021 • Matthew Finlayson, Aaron Mueller, Sebastian Gehrmann, Stuart Shieber, Tal Linzen, Yonatan Belinkov
Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement given difficult contexts.
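Targeted syntactic evaluations of this kind typically compare the probability a model assigns to the grammatical verb form against its ungrammatical counterpart; a minimal sketch under that assumption (the model choice and example sentence are illustrative, not taken from the paper):

```python
# Sketch of a targeted syntactic evaluation: compare the model's scores
# for the grammatical vs. ungrammatical verb after a prefix containing
# a distracting "attractor" noun ("cabinet").
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prefix = "The keys to the cabinet"  # subject "keys" is plural
with torch.no_grad():
    logits = model(**tokenizer(prefix, return_tensors="pt")).logits

next_token_logits = logits[0, -1]
for verb in [" are", " is"]:  # grammatical vs. ungrammatical continuation
    verb_id = tokenizer.encode(verb)[0]
    print(verb, next_token_logits[verb_id].item())
# The model succeeds on this item if the grammatical form scores higher.
```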
1 code implementation • NAACL 2021 • Aaron Mueller, Mark Dredze
Neural topic models can augment or replace bag-of-words inputs with the learned representations of deep pre-trained transformer-based word prediction models.
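One way to realize this augmentation, as a hedged sketch (the specific encoder, embedding model, and concatenation strategy are illustrative assumptions, not necessarily the paper's architecture):

```python
# Sketch: augment a bag-of-words vector with a contextual embedding
# from a pre-trained transformer; the result can be fed to a neural
# topic model's inference network in place of raw bag-of-words input.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sentence_transformers import SentenceTransformer  # assumed dependency

docs = ["the cat sat on the mat", "stocks fell sharply on friday"]

bow = CountVectorizer().fit_transform(docs).toarray()       # (n_docs, vocab)
ctx = SentenceTransformer("all-MiniLM-L6-v2").encode(docs)  # (n_docs, 384)

# Concatenate the two views of each document.
features = np.concatenate([bow, ctx], axis=1)
print(features.shape)
```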
1 code implementation • ACL (GEM) 2021 • Alexandra DeLucia, Aaron Mueller, Xiang Lisa Li, João Sedoc
Narrative generation is an open-ended NLP task in which a model generates a story given a prompt.
no code implementations • 13 Oct 2020 • Aaron Mueller, Zach Wood-Doughty, Silvio Amir, Mark Dredze, Alicia L. Nobles
The #MeToo movement on Twitter has drawn attention to the pervasive nature of sexual harassment and violence.
2 code implementations • ACL 2020 • Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou, Natalia Talmina, Tal Linzen
On other constructions, agreement accuracy was generally higher in languages with richer morphology.
no code implementations • LREC 2020 • Arya D. McCarthy, Rachel Wicks, Dylan Lewis, Aaron Mueller, Winston Wu, Oliver Adams, Garrett Nicolai, Matt Post, David Yarowsky
The corpus consists of over 4000 unique translations of the Christian Bible and counting.
no code implementations • LREC 2020 • Aaron Mueller, Garrett Nicolai, Arya D. McCarthy, Dylan Lewis, Winston Wu, David Yarowsky
We find that best practices in this domain are highly language-specific: adding more languages to a training set is often better, but too many harm performance; the best number depends on the source language.
no code implementations • LREC 2020 • Garrett Nicolai, Dylan Lewis, Arya D. McCarthy, Aaron Mueller, Winston Wu, David Yarowsky
Exploiting the broad translation of the Bible into the world's languages, we train and distribute morphosyntactic tools for approximately one thousand languages, vastly outstripping previous distributions of tools devoted to the processing of inflectional morphology.
1 code implementation • IJCNLP 2019 • Arya D. McCarthy, Winston Wu, Aaron Mueller, Bill Watson, David Yarowsky
There is an extensive history of scholarship into what constitutes a "basic" color term, as well as a broadly attested acquisition sequence of basic color terms across many languages, as articulated in the seminal work of Berlin and Kay (1969).
no code implementations • IJCNLP 2019 • Marten van Schijndel, Aaron Mueller, Tal Linzen
We investigate to what extent these shortcomings can be mitigated by increasing the size of the network and the corpus on which it is trained.