Search Results for author: Christopher Re

Found 21 papers, 12 papers with code

Metadata Shaping: A Simple Approach for Knowledge-Enhanced Language Models

1 code implementation Findings (ACL) 2022 Simran Arora, Sen Wu, Enci Liu, Christopher Re

We observe that proposed methods typically start with a base LM and data that has been annotated with entity metadata, then change the model by modifying the architecture or introducing auxiliary loss terms to better capture entity knowledge.

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

1 code implementation 26 Oct 2023 Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Re, Beidi Chen

We show that contextual sparsity exists, that it can be accurately predicted, and that we can exploit it to speed up LLM inference in wall-clock time without compromising LLM's quality or in-context learning ability.

In-Context Learning

Context-Aware Meta-Learning

1 code implementation 17 Oct 2023 Christopher Fifty, Dennis Duan, Ronald G. Junkins, Ehsan Amid, Jure Leskovec, Christopher Re, Sebastian Thrun

Large Language Models like ChatGPT demonstrate a remarkable capacity to learn new concepts during inference without any fine-tuning.

Few-Shot Image Classification In-Context Learning +1

Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees

1 code implementation 2 Jun 2022 Jue Wang, Binhang Yuan, Luka Rimanic, Yongjun He, Tri Dao, Beidi Chen, Christopher Re, Ce Zhang

Communication compression is a crucial technique for modern distributed learning systems to alleviate their communication bottlenecks over slower networks.

Decentralized Training of Foundation Models in Heterogeneous Environments

1 code implementation 2 Jun 2022 Binhang Yuan, Yongjun He, Jared Quincy Davis, Tianyi Zhang, Tri Dao, Beidi Chen, Percy Liang, Christopher Re, Ce Zhang

Our key technical contribution is a scheduling algorithm that allocates different computational "tasklets" in the training of foundation models to a group of decentralized GPU devices connected by a slow heterogeneous network.

Scheduling

Metadata Shaping: Natural Language Annotations for the Tail

1 code implementation 16 Oct 2021 Simran Arora, Sen Wu, Enci Liu, Christopher Re

Since rare entities and facts are prevalent in the queries users submit to popular applications such as search and personal assistant systems, improving the ability of LMs to reliably capture knowledge over rare entities is a pressing challenge studied in significant prior work.

Combining Recurrent, Convolutional, and Continuous-time Models with Linear State Space Layers

no code implementations NeurIPS 2021 Albert Gu, Isys Johnson, Karan Goel, Khaled Kamal Saab, Tri Dao, Atri Rudra, Christopher Re

Recurrent neural networks (RNNs), temporal convolutions, and neural differential equations (NDEs) are popular families of deep learning models for time-series data, each with unique strengths and tradeoffs in modeling power and computational efficiency.

Computational Efficiency Memorization +3

Searching for Convolutions and a More Ambitious NAS

no code implementations 1 Jan 2021 Nicholas Carl Roberts, Mikhail Khodak, Tri Dao, Liam Li, Nina Balcan, Christopher Re, Ameet Talwalkar

An important goal of neural architecture search (NAS) is to automate away the design of neural networks on new tasks in under-explored domains, thus helping to democratize machine learning.

Neural Architecture Search

MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training

no code implementations ICLR 2021 Beidi Chen, Zichang Liu, Binghui Peng, Zhaozhuo Xu, Jonathan Lingjie Li, Tri Dao, Zhao Song, Anshumali Shrivastava, Christopher Re

Recent advances by practitioners in the deep learning community have breathed new life into Locality Sensitive Hashing (LSH), using it to reduce memory and time bottlenecks in neural network (NN) training.

Efficient Neural Network Language Modelling +2

Cut out the annotator, keep the cutout: better segmentation with weak supervision

no code implementations ICLR 2021 Sarah Hooper, Michael Wornow, Ying Hang Seah, Peter Kellman, Hui Xue, Frederic Sala, Curtis Langlotz, Christopher Re

We propose a framework that fuses limited label learning and weak supervision for segmentation tasks, enabling users to train high-performing segmentation CNNs with very few hand-labeled training points.

Data Augmentation Few-Shot Learning +4

Bootleg: Chasing the Tail with Self-Supervised Named Entity Disambiguation

1 code implementation 20 Oct 2020 Laurel Orr, Megan Leszczynski, Simran Arora, Sen Wu, Neel Guha, Xiao Ling, Christopher Re

A challenge for named entity disambiguation (NED), the task of mapping textual mentions to entities in a knowledge base, is how to disambiguate entities that appear rarely in the training data, termed tail entities.

Ranked #1 on Entity Disambiguation on AIDA-CoNLL (Micro-F1 metric)

Entity Disambiguation Relation Extraction

Leveraging Organizational Resources to Adapt Models to New Data Modalities

no code implementations 23 Aug 2020 Sahaana Suri, Raghuveer Chanda, Neslihan Bulut, Pradyumna Narayana, Yemao Zeng, Peter Bailis, Sugato Basu, Girija Narlikar, Christopher Re, Abishek Sethi

As applications in large organizations evolve, the machine learning (ML) models that power them must adapt the same predictive tasks to newly arising data modalities (e.g., a new video content launch in a social media application requires existing text or image models to extend to video).

Scene Graph Prediction with Limited Labels

1 code implementation ICCV 2019 Vincent S. Chen, Paroma Varma, Ranjay Krishna, Michael Bernstein, Christopher Re, Li Fei-Fei

All scene graph models to date are limited to training on a small set of visual relationships that have thousands of training labels each.

Knowledge Base Completion Question Answering +2

Improving Sample Complexity with Observational Supervision

no code implementations ICLR Workshop LLD 2019 Khaled Saab, Jared Dunnmon, Alexander Ratner, Daniel Rubin, Christopher Re

Supervised machine learning models for high-value computer vision applications such as medical image classification often require large datasets labeled by domain experts, which are slow to collect, expensive to maintain, and static with respect to changes in the data distribution.

Image Classification Medical Image Classification

Infrastructure for Usable Machine Learning: The Stanford DAWN Project

no code implementations 22 May 2017 Peter Bailis, Kunle Olukotun, Christopher Re, Matei Zaharia

Despite incredible recent advances in machine learning, building machine learning applications remains prohibitively time-consuming and expensive for all but the best-trained, best-funded engineering organizations.

BIG-bench Machine Learning

Factoring nonnegative matrices with linear programs

1 code implementation NeurIPS 2012 Victor Bittorf, Benjamin Recht, Christopher Re, Joel A. Tropp

The constraints are chosen to ensure that the matrix C selects features; these features can then be used to find a low-rank NMF of X.

Beneath the valley of the noncommutative arithmetic-geometric mean inequality: conjectures, case-studies, and consequences

3 code implementations 19 Feb 2012 Benjamin Recht, Christopher Re

We detail the consequences of this inequality for stochastic gradient descent and the randomized Kaczmarz algorithm for solving linear systems.

Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

no code implementations NeurIPS 2011 Benjamin Recht, Christopher Re, Stephen Wright, Feng Niu

Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks.

HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent

5 code implementations 28 Jun 2011 Feng Niu, Benjamin Recht, Christopher Re, Stephen J. Wright

Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks.
