1 code implementation • 16 Feb 2024 • Alexis Chevalier, Jiayi Geng, Alexander Wettig, Howard Chen, Sebastian Mizera, Toni Annala, Max Jameson Aragon, Arturo Rodríguez Fanlo, Simon Frieder, Simon Machado, Akshara Prabhakar, Ellie Thieu, Jiachen T. Wang, ZiRui Wang, Xindi Wu, Mengzhou Xia, Wenhan Jia, Jiatong Yu, Jun-Jie Zhu, Zhiyong Jason Ren, Sanjeev Arora, Danqi Chen
We use TutorChat to fine-tune Llemma models with 7B and 34B parameters.
1 code implementation • 15 Feb 2024 • Alexander Wettig, Aatmik Gupta, Saumya Malik, Danqi Chen
Selecting high-quality pre-training data is important for creating capable language models, but existing methods rely on simple heuristics.
1 code implementation • 29 Oct 2023 • Zexuan Zhong, Ziqing Huang, Alexander Wettig, Danqi Chen
Dense retrievers have achieved state-of-the-art performance in various information retrieval tasks, but to what extent can they be safely deployed in real-world applications?
no code implementations • 10 Oct 2023 • Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik Narasimhan
We find real-world software engineering to be a rich, sustainable, and challenging testbed for evaluating the next generation of language models.
Ranked #2 on Bug fixing on SWE-bench
1 code implementation • 24 May 2023 • Alexis Chevalier, Alexander Wettig, Anirudh Ajith, Danqi Chen
Transformer-based language models (LMs) are powerful and widely applicable tools, but their usefulness is constrained by a finite context window and the high computational cost of processing long text documents.
1 code implementation • 20 Oct 2022 • Dan Friedman, Alexander Wettig, Danqi Chen
Many NLP datasets have been found to contain shortcuts: simple decision rules that achieve surprisingly high accuracy.
1 code implementation • 11 Oct 2022 • Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora
It has become standard to solve NLP tasks by fine-tuning pre-trained language models (LMs), especially in low-data settings.
1 code implementation • 16 Feb 2022 • Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
In this work, we revisit the masking rate, an important design choice in MLM pre-training.
1 code implementation • EMNLP 2021 • Jinhyuk Lee, Alexander Wettig, Danqi Chen
Dense retrieval methods have shown great promise over sparse retrieval methods in a range of NLP problems.