Search Results for author: Alexander Wettig

Found 9 papers, 8 papers with code

Language Models as Science Tutors

1 code implementation • 16 Feb 2024 • Alexis Chevalier, Jiayi Geng, Alexander Wettig, Howard Chen, Sebastian Mizera, Toni Annala, Max Jameson Aragon, Arturo Rodríguez Fanlo, Simon Frieder, Simon Machado, Akshara Prabhakar, Ellie Thieu, Jiachen T. Wang, ZiRui Wang, Xindi Wu, Mengzhou Xia, Wenhan Jia, Jiatong Yu, Jun-Jie Zhu, Zhiyong Jason Ren, Sanjeev Arora, Danqi Chen

We use TutorChat to fine-tune Llemma models with 7B and 34B parameters.

GSM8K • Math • +1

QuRating: Selecting High-Quality Data for Training Language Models

1 code implementation • 15 Feb 2024 • Alexander Wettig, Aatmik Gupta, Saumya Malik, Danqi Chen

Selecting high-quality pre-training data is important for creating capable language models, but existing methods rely on simple heuristics.

In-Context Learning

Poisoning Retrieval Corpora by Injecting Adversarial Passages

1 code implementation • 29 Oct 2023 • Zexuan Zhong, Ziqing Huang, Alexander Wettig, Danqi Chen

Dense retrievers have achieved state-of-the-art performance in various information retrieval tasks, but to what extent can they be safely deployed in real-world applications?

Information Retrieval • Natural Questions • +1

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

no code implementations • 10 Oct 2023 • Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik Narasimhan

We find real-world software engineering to be a rich, sustainable, and challenging testbed for evaluating the next generation of language models.

Ranked #2 on Bug fixing on SWE-bench

Bug fixing • Code Generation • +1

Adapting Language Models to Compress Contexts

1 code implementation • 24 May 2023 • Alexis Chevalier, Alexander Wettig, Anirudh Ajith, Danqi Chen

Transformer-based language models (LMs) are powerful and widely-applicable tools, but their usefulness is constrained by a finite context window and the expensive computational cost of processing long text documents.

In-Context Learning • Language Modelling • +3