Search Results for author: Atri Rudra

Found 20 papers, 16 papers with code

Simple linear attention language models balance the recall-throughput tradeoff

1 code implementation • 28 Feb 2024 • Simran Arora, Sabri Eyuboglu, Michael Zhang, Aman Timalsina, Silas Alberti, Dylan Zinsley, James Zou, Atri Rudra, Christopher Ré

In this work, we explore whether we can improve language model efficiency (e.g., by reducing memory consumption) without compromising on recall.

Language Modelling • Text Generation

Zoology: Measuring and Improving Recall in Efficient Language Models

2 code implementations • 8 Dec 2023 • Simran Arora, Sabri Eyuboglu, Aman Timalsina, Isys Johnson, Michael Poli, James Zou, Atri Rudra, Christopher Ré

To close the gap between synthetics and real language, we develop a new formalization of the task called multi-query associative recall (MQAR) that better reflects actual language.
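
For illustration, a rough sketch of what a single MQAR-style instance could look like: a sequence of key-value pairs followed by several queried keys, where the model must recall the value bound to each queried key. The letter/digit vocabulary and sampling scheme below are illustrative assumptions, not the paper's exact generator.

```python
import random

def mqar_example(num_pairs=4, num_queries=3, seed=0):
    """Build one illustrative multi-query associative recall (MQAR) instance:
    key-value pairs followed by queried keys; the targets are the values
    originally bound to the queried keys."""
    rng = random.Random(seed)
    keys = rng.sample("abcdefgh", num_pairs)                      # letter keys (toy vocabulary)
    values = [str(rng.randrange(10)) for _ in range(num_pairs)]   # digit values
    context = [tok for kv in zip(keys, values) for tok in kv]
    queried = rng.sample(keys, num_queries)                       # multiple keys are queried
    targets = [values[keys.index(k)] for k in queried]
    return context + queried, targets

tokens, targets = mqar_example()
print(tokens)    # key-value context followed by the queried keys
print(targets)   # values the model should recall for each query
```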

Hungry Hungry Hippos: Towards Language Modeling with State Space Models

3 code implementations • 28 Dec 2022 • Daniel Y. Fu, Tri Dao, Khaled K. Saab, Armin W. Thomas, Atri Rudra, Christopher Ré

First, we use synthetic language modeling tasks to understand the gap between SSMs and attention.

Ranked #2 on Language Modelling on The Pile (Test perplexity metric)

Coreference Resolution +5

Arithmetic Circuits, Structured Matrices and (not so) Deep Learning

no code implementations • 24 Jun 2022 • Atri Rudra

This survey presents a necessarily incomplete (and biased) overview of results at the intersection of arithmetic circuit complexity, structured matrices and deep learning.

How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections

1 code implementation • 24 Jun 2022 • Albert Gu, Isys Johnson, Aman Timalsina, Atri Rudra, Christopher Ré

Linear time-invariant state space models (SSMs) are a classical model family from engineering and statistics that has recently been shown to be very promising in machine learning through the Structured State Space sequence model (S4).

Long-range modeling
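
For context, a linear time-invariant SSM maps an input u(t) to an output y(t) through a latent state x(t) via x'(t) = A x(t) + B u(t), y(t) = C x(t); after discretization it becomes a plain linear recurrence. The numpy sketch below runs that recurrence with a bilinear discretization, as used in the S4 line of work; the random matrices here merely stand in for the structured basis-projection (HiPPO) initializations the paper actually analyzes.

```python
import numpy as np

def ssm_scan(A_bar, B_bar, C, u):
    """Discretized LTI state space recurrence:
    x_k = A_bar x_{k-1} + B_bar u_k,   y_k = C x_k,   over a 1-D input sequence u."""
    x = np.zeros(A_bar.shape[0])
    ys = []
    for u_k in u:
        x = A_bar @ x + B_bar * u_k
        ys.append(C @ x)
    return np.array(ys)

rng = np.random.default_rng(0)
N, L, dt = 16, 100, 0.1
A = -np.eye(N) + 0.1 * rng.standard_normal((N, N))    # stand-in for a structured (HiPPO) matrix
B, C = rng.standard_normal(N), rng.standard_normal(N)
# bilinear (Tustin) discretization of the continuous-time system
A_bar = np.linalg.solve(np.eye(N) - dt / 2 * A, np.eye(N) + dt / 2 * A)
B_bar = np.linalg.solve(np.eye(N) - dt / 2 * A, dt * B)
y = ssm_scan(A_bar, B_bar, C, rng.standard_normal(L))
print(y.shape)   # one output per time step
```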

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

9 code implementations • 27 May 2022 • Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré

We also extend FlashAttention to block-sparse attention, yielding an approximate attention algorithm that is faster than any existing approximate attention method.
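
To fix ideas, here is a plain numpy sketch of the tiling and online-softmax bookkeeping behind IO-aware exact attention: keys and values are processed one block at a time while running row maxima and normalizers are maintained, so the full attention matrix is never materialized. This is only an illustration of the idea, not the fused CUDA kernel and not the block-sparse variant.

```python
import numpy as np

def tiled_attention(Q, K, V, block=64):
    """Exact softmax attention computed one key/value block at a time using the
    online-softmax trick (running row maxima and normalizers), so no L x L score
    matrix is ever formed."""
    L, d = Q.shape
    out = np.zeros_like(V)
    row_max = np.full(L, -np.inf)
    row_sum = np.zeros(L)
    for start in range(0, L, block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        S = Q @ Kb.T / np.sqrt(d)                  # scores for this block only
        new_max = np.maximum(row_max, S.max(axis=1))
        correction = np.exp(row_max - new_max)     # rescale previously accumulated results
        P = np.exp(S - new_max[:, None])
        row_sum = row_sum * correction + P.sum(axis=1)
        out = out * correction[:, None] + P @ Vb
        row_max = new_max
    return out / row_sum[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((256, 32)) for _ in range(3))
scores = Q @ K.T / np.sqrt(32)
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
ref = (weights / weights.sum(axis=1, keepdims=True)) @ V
assert np.allclose(tiled_attention(Q, K, V), ref)  # matches naive attention exactly
```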

Monarch: Expressive Structured Matrices for Efficient and Accurate Training

1 code implementation • 1 Apr 2022 • Tri Dao, Beidi Chen, Nimit Sohoni, Arjun Desai, Michael Poli, Jessica Grogan, Alexander Liu, Aniruddh Rao, Atri Rudra, Christopher Ré

To address these issues, we propose a class of matrices (Monarch) that is hardware-efficient (they are parameterized as products of two block-diagonal matrices for better hardware utilization) and expressive (they can represent many commonly used transforms).

Language Modelling • MRI Reconstruction
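
As a rough illustration of that structure, the sketch below forms a matrix from two block-diagonal factors; the perfect-shuffle permutation used to interleave them is an illustrative assumption rather than the paper's exact parameterization.

```python
import numpy as np
from scipy.linalg import block_diag

def monarch_like(n, b, seed=0):
    """Form an n x n matrix from two block-diagonal factors with b x b blocks,
    interleaved by a perfect-shuffle permutation (illustrative choice).
    Parameter count: 2 * n * b instead of n^2 for a dense matrix."""
    rng = np.random.default_rng(seed)
    L = block_diag(*[rng.standard_normal((b, b)) for _ in range(n // b)])
    R = block_diag(*[rng.standard_normal((b, b)) for _ in range(n // b)])
    P = np.eye(n)[np.arange(n).reshape(n // b, b).T.reshape(-1)]  # perfect shuffle
    return P.T @ L @ P @ R

n, b = 16, 4
M = monarch_like(n, b)
print(M.shape, "parameters:", 2 * n * b, "vs dense:", n * n)
```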

Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models

1 code implementation • ICLR 2022 • Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré

To address this, our main insight is to optimize over a continuous superset of sparse matrices with a fixed structure known as products of butterfly matrices.

Language Modelling
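
A minimal sketch of that fixed structure, assuming the usual radix-2 layout: a product of log2(n) butterfly factors, each with two nonzeros per row, so the full parameterization has O(n log n) entries instead of n^2. The random entries here are purely illustrative stand-ins for learned values.

```python
import numpy as np

def random_butterfly_factors(n, seed=0):
    """Butterfly factors for n = 2^k: the i-th factor pairs coordinates at stride 2^i
    and mixes each pair with its own 2x2 block, i.e. two nonzeros per row."""
    rng = np.random.default_rng(seed)
    factors, stride = [], 1
    while stride < n:
        F = np.zeros((n, n))
        for start in range(0, n, 2 * stride):
            for j in range(start, start + stride):
                a, b, c, d = rng.standard_normal(4)       # this pair's 2x2 mixing block
                F[j, j], F[j, j + stride] = a, b
                F[j + stride, j], F[j + stride, j + stride] = c, d
        factors.append(F)
        stride *= 2
    return factors

factors = random_butterfly_factors(16)
nnz = sum(int((F != 0).sum()) for F in factors)
print("nonzeros across all factors:", nnz, "vs dense:", 16 * 16)   # 2n*log2(n) = 128 vs 256
```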

Scatterbrain: Unifying Sparse and Low-rank Attention Approximation

1 code implementation • NeurIPS 2021 • Beidi Chen, Tri Dao, Eric Winsor, Zhao Song, Atri Rudra, Christopher Ré

Recent advances in efficient Transformers have exploited either the sparsity or low-rank properties of attention matrices to reduce the computational and memory bottlenecks of modeling long sequences.

Image Generation • Language Modelling

Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers

2 code implementations • NeurIPS 2021 • Albert Gu, Isys Johnson, Karan Goel, Khaled Saab, Tri Dao, Atri Rudra, Christopher Ré

Recurrent neural networks (RNNs), temporal convolutions, and neural differential equations (NDEs) are popular families of deep learning models for time-series data, each with unique strengths and tradeoffs in modeling power and computational efficiency.

Computational Efficiency • Memorization +3

Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps

2 code implementations • ICLR 2020 • Tri Dao, Nimit S. Sohoni, Albert Gu, Matthew Eichhorn, Amit Blonder, Megan Leszczynski, Atri Rudra, Christopher Ré

Modern neural network architectures use structured linear transformations, such as low-rank matrices, sparse matrices, permutations, and the Fourier transform, to improve inference speed and reduce memory usage compared to general linear maps.

Image Classification • Speech Recognition +1

Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations

1 code implementation • 14 Mar 2019 • Tri Dao, Albert Gu, Matthew Eichhorn, Atri Rudra, Christopher Ré

Fast linear transforms are ubiquitous in machine learning, including the discrete Fourier transform, discrete cosine transform, and other structured transformations such as convolutions.

BIG-bench Machine Learning
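
As a concrete instance of how butterfly factorizations capture fast transforms, the sketch below builds the DFT matrix as a product of sparse butterfly and permutation factors via the Cooley-Tukey recursion and checks it against numpy's FFT. This is a reference illustration of the structure, not the paper's learning procedure.

```python
import numpy as np

def dft_butterfly_factors(n):
    """Return sparse factors whose product equals the n x n DFT matrix (n a power of 2),
    following the Cooley-Tukey identity F_n = B_n (I_2 kron F_{n/2}) P_n, where B_n is a
    butterfly factor with two nonzeros per row and P_n the even-odd permutation."""
    if n == 1:
        return [np.ones((1, 1), dtype=complex)]
    half = n // 2
    omega = np.exp(-2j * np.pi / n)
    D = np.diag(omega ** np.arange(half))
    I = np.eye(half)
    butterfly = np.block([[I, D], [I, -D]])
    even_odd = np.eye(n)[np.r_[0:n:2, 1:n:2]]              # even-indexed entries first
    lifted = [np.kron(np.eye(2), f) for f in dft_butterfly_factors(half)]
    return [butterfly] + lifted + [even_odd]

n = 8
F = np.linalg.multi_dot(dft_butterfly_factors(n))
assert np.allclose(F, np.fft.fft(np.eye(n)))               # the product is exactly the DFT
```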

Learning Compressed Transforms with Low Displacement Rank

1 code implementation • NeurIPS 2018 • Anna T. Thomas, Albert Gu, Tri Dao, Atri Rudra, Christopher Ré

The low displacement rank (LDR) framework for structured matrices represents a matrix through two displacement operators and a low-rank residual.

Image Classification • Language Modelling
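
A classical instance of this framework, for illustration: a Toeplitz matrix has displacement rank at most 2 under the down-shift operator (Stein form of displacement shown below), because the displaced matrix is nonzero only in its first row and column. The LDR paper goes further and learns the displacement operators and low-rank factors themselves.

```python
import numpy as np
from scipy.linalg import toeplitz

n = 8
rng = np.random.default_rng(0)
T = toeplitz(rng.standard_normal(n), rng.standard_normal(n))   # a random Toeplitz matrix
Z = np.diag(np.ones(n - 1), -1)            # down-shift displacement operator
residual = T - Z @ T @ Z.T                 # Stein displacement of T
print(np.linalg.matrix_rank(residual))     # 2: the low-rank residual of the LDR view
```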

Hypertree Decompositions Revisited for PGMs

no code implementations • 2 Jul 2018 • Aarthy Shivram Arun, Sai Vikneshwar Mani Jayaraman, Christopher Ré, Atri Rudra

We revisit the classical problem of exact inference on probabilistic graphical models (PGMs).
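
For a sense of the problem, the sketch below runs textbook variable elimination on a tiny chain-structured model and checks it against brute-force enumeration; the paper's contribution concerns hypertree-decomposition-based algorithms, which this toy baseline does not attempt to implement.

```python
import numpy as np

# Exact marginal inference on a chain MRF x1 - x2 - x3 by variable elimination,
# verified against brute-force summation over the full joint.
rng = np.random.default_rng(0)
k = 3                                           # states per variable
psi12, psi23 = rng.random((k, k)), rng.random((k, k))   # pairwise potentials

m12 = psi12.sum(axis=0)                         # eliminate x1: message to x2
m23 = (m12[:, None] * psi23).sum(axis=0)        # eliminate x2: message to x3
p3 = m23 / m23.sum()                            # marginal over x3

joint = psi12[:, :, None] * psi23[None, :, :]   # full (k, k, k) joint, unnormalized
p3_brute = joint.sum(axis=(0, 1))
p3_brute /= p3_brute.sum()
assert np.allclose(p3, p3_brute)
```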
