no code implementations • EMNLP (MRL) 2021 • Asa Cooper Stickland, Iain Murray
Many recent works use ‘consistency regularisation’ to improve the generalisation of fine-tuned pre-trained models, both multilingual and English-only.
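A minimal sketch of one common form of consistency regularisation, assuming a PyTorch classifier and a simple Gaussian-noise perturbation (the specific loss and perturbation studied in the paper may differ):

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x, y, noise_std=0.1, weight=1.0):
    """Cross-entropy on clean inputs plus a KL term that pushes
    predictions on perturbed inputs towards the clean predictions."""
    logits_clean = model(x)
    task_loss = F.cross_entropy(logits_clean, y)

    # Perturb the inputs (additive Gaussian noise as a stand-in for
    # dropout, back-translation, or other augmentations).
    x_noisy = x + noise_std * torch.randn_like(x)
    logits_noisy = model(x_noisy)

    # KL(clean || noisy); the clean distribution is detached so the
    # regulariser only moves the noisy-branch predictions.
    kl = F.kl_div(
        F.log_softmax(logits_noisy, dim=-1),
        F.softmax(logits_clean, dim=-1).detach(),
        reduction="batchmean",
    )
    return task_loss + weight * kl
```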
1 code implementation • 20 Nov 2023 • David Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Richard Yuanzhe Pang, Julien Dirani, Julian Michael, Samuel R. Bowman
We present GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry.
2 code implementations • 21 Sep 2023 • Lukas Berglund, Meg Tong, Max Kaufmann, Mikita Balesni, Asa Cooper Stickland, Tomasz Korbak, Owain Evans
If a model is trained on a sentence of the form "A is B", it will not automatically generalize to the reverse direction "B is A".
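An illustrative sketch of the evaluation setup this implies, using an example fact in the style of the paper (prompt wording here is for illustration only):

```python
# Forward-direction training sentence ("A is B").
train_sentence = "Daphne Barrington is the director of 'A Journey Through Time'."

# Forward query: answerable after fine-tuning on the sentence above.
forward_prompt = "Who is Daphne Barrington?"

# Reverse query: the Reversal Curse predicts the fine-tuned model fails
# here, because it never saw the fact stated in the "B is A" direction.
reverse_prompt = "Who is the director of 'A Journey Through Time'?"
```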
1 code implementation • 1 Sep 2023 • Lukas Berglund, Asa Cooper Stickland, Mikita Balesni, Max Kaufmann, Meg Tong, Tomasz Korbak, Daniel Kokotajlo, Owain Evans
We fine-tune a model on a description of a test, without providing examples or demonstrations, and at test time assess whether the model can pass the test.
1 code implementation • 10 Oct 2022 • Asa Cooper Stickland, Sailik Sengupta, Jason Krone, Saab Mansour, He He
To benchmark the performance of pre-trained multilingual language models, we construct noisy datasets covering five languages and four NLP tasks, and observe a clear performance gap between clean and noisy data in the zero-shot cross-lingual setting.
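A hedged sketch of the kind of synthetic character-level noise such a benchmark might inject (the paper's actual noise sources and rates are not reproduced here):

```python
import random

def add_char_noise(text: str, p: float = 0.05) -> str:
    """Randomly swap, drop, or duplicate characters with probability p,
    mimicking typos to create a noisy copy of a clean example."""
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        r = random.random()
        if r < p and i + 1 < len(chars):   # swap with the next character
            out.extend([chars[i + 1], chars[i]])
            i += 2
        elif r < 2 * p:                     # drop this character
            i += 1
        elif r < 3 * p:                     # duplicate this character
            out.extend([chars[i], chars[i]])
            i += 1
        else:
            out.append(chars[i])
            i += 1
    return "".join(out)

print(add_char_noise("zero-shot cross-lingual transfer"))
```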
1 code implementation • 23 May 2022 • Ahmet Üstün, Asa Cooper Stickland
We find that using parameter-efficient fine-tuning methods (PEFTs) with a larger pre-trained model outperforms full fine-tuning of a smaller model, and that for smaller training data sizes, PEFTs outperform full fine-tuning of the same pre-trained model.
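For context, a minimal bottleneck-adapter sketch in PyTorch, one common PEFT; the exact adapter variant and placement studied in the paper are not shown here:

```python
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, plus a
    residual connection. Only these small layers are trained; the
    pre-trained model's own weights stay frozen."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.ReLU()

    def forward(self, hidden_states):
        return hidden_states + self.up(self.act(self.down(hidden_states)))
```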
no code implementations • WMT (EMNLP) 2021 • Asa Cooper Stickland, Alexandre Bérard, Vassilina Nikoulina
In this work we study the compositionality of language and domain adapters in the context of Machine Translation.
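A minimal sketch of one way to compose a language adapter and a domain adapter by stacking them on top of a frozen layer's output; the composition the paper actually evaluates may differ:

```python
import torch.nn as nn

def make_adapter(hidden_dim: int, bottleneck_dim: int = 64) -> nn.Module:
    """A small bottleneck feed-forward block, as in the adapter sketch above."""
    return nn.Sequential(
        nn.Linear(hidden_dim, bottleneck_dim),
        nn.ReLU(),
        nn.Linear(bottleneck_dim, hidden_dim),
    )

class StackedAdapters(nn.Module):
    """Apply a language adapter, then a domain adapter, each with a
    residual connection, on a (frozen) transformer layer's output."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.language_adapter = make_adapter(hidden_dim)
        self.domain_adapter = make_adapter(hidden_dim)

    def forward(self, hidden_states):
        h = hidden_states + self.language_adapter(hidden_states)
        return h + self.domain_adapter(h)
```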
1 code implementation • NeurIPS 2020 • Xian Li, Asa Cooper Stickland, Yuqing Tang, Xiang Kong
As an extension of this framework, we propose a novel method to train one shared Transformer network for multilingual machine translation with different layer selection posteriors for each language pair.
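A rough sketch of per-language-pair layer selection with learned gate logits and a relaxed Bernoulli sample per layer; the paper's actual parameterisation of the layer-selection posterior is not reproduced here:

```python
import torch
import torch.nn as nn

class LatentDepthEncoder(nn.Module):
    """Shared stack of layers; each language pair has its own logits over
    which layers to use, sampled as (relaxed) Bernoulli gates."""
    def __init__(self, layers: nn.ModuleList, num_lang_pairs: int):
        super().__init__()
        self.layers = layers
        # One gate logit per (language pair, layer).
        self.gate_logits = nn.Parameter(torch.zeros(num_lang_pairs, len(layers)))

    def forward(self, x, lang_pair: int, temperature: float = 1.0):
        for i, layer in enumerate(self.layers):
            logit = self.gate_logits[lang_pair, i]
            if self.training:
                # Relaxed Bernoulli (Gumbel-sigmoid) sample for this layer.
                u = torch.rand(()).clamp(1e-6, 1 - 1e-6)
                noise = torch.log(u) - torch.log1p(-u)
                gate = torch.sigmoid((logit + noise) / temperature)
            else:
                gate = (torch.sigmoid(logit) > 0.5).float()
            # Mix the layer's output with a skip connection, weighted by the gate.
            x = gate * layer(x) + (1 - gate) * x
        return x
```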
no code implementations • 8 Jul 2020 • Asa Cooper Stickland, Iain Murray
Modern deep neural networks can produce badly calibrated predictions, especially when train and test distributions are mismatched.
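For reference, a small sketch of expected calibration error (ECE), a standard way to quantify miscalibration; the equal-width binning scheme here is an assumption:

```python
import torch

def expected_calibration_error(logits, labels, n_bins: int = 10) -> float:
    """Bin predictions by confidence and average |accuracy - confidence|,
    weighted by the fraction of examples falling in each bin."""
    probs = torch.softmax(logits, dim=-1)
    confidences, predictions = probs.max(dim=-1)
    accuracies = predictions.eq(labels).float()

    ece = torch.zeros(())
    bin_edges = torch.linspace(0, 1, n_bins + 1)
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = (accuracies[in_bin].mean() - confidences[in_bin].mean()).abs()
            ece += in_bin.float().mean() * gap
    return ece.item()
```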
no code implementations • EACL 2021 • Asa Cooper Stickland, Xian Li, Marjan Ghazvininejad
For BART we get the best performance by freezing most of the model parameters, and adding extra positional embeddings.
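A hedged sketch of this freezing recipe using the Hugging Face transformers BART model; which modules stay trainable, and how the extra positional embeddings are wired into the encoder, are assumptions for illustration:

```python
import torch.nn as nn
from transformers import BartModel

model = BartModel.from_pretrained("facebook/bart-large")

# Freeze the pre-trained parameters.
for param in model.parameters():
    param.requires_grad = False

# New, trainable positional embeddings; in a real setup these would be
# added to the encoder's input embeddings for the target task.
extra_positions = nn.Embedding(
    model.config.max_position_embeddings, model.config.d_model
)

trainable = sum(p.numel() for p in extra_positions.parameters())
frozen = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable}, frozen params: {frozen}")
```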
2 code implementations • 7 Feb 2019 • Asa Cooper Stickland, Iain Murray
Multi-task learning shares information between related tasks, sometimes reducing the number of parameters required.
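A minimal multi-task sketch in this spirit: one shared encoder with a small task-specific module and head per task, so most parameters are shared across tasks (the paper's actual projected attention layer design is not reproduced here):

```python
import torch.nn as nn

class MultiTaskModel(nn.Module):
    """Shared encoder plus a small task-specific bottleneck module and
    classification head per task; only the small parts grow with the
    number of tasks."""
    def __init__(self, encoder: nn.Module, hidden_dim: int,
                 task_num_labels: dict, bottleneck_dim: int = 64):
        super().__init__()
        self.encoder = encoder
        self.task_modules = nn.ModuleDict({
            task: nn.Sequential(
                nn.Linear(hidden_dim, bottleneck_dim),
                nn.ReLU(),
                nn.Linear(bottleneck_dim, hidden_dim),
            )
            for task in task_num_labels
        })
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden_dim, n) for task, n in task_num_labels.items()
        })

    def forward(self, x, task: str):
        h = self.encoder(x)
        h = h + self.task_modules[task](h)   # task-specific residual block
        return self.heads[task](h)
```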