no code implementations • 6 Feb 2024 • Michael Zhang, Kush Bhatia, Hermann Kumbong, Christopher Ré
Experiments show Hedgehog recovers over 99% of standard Transformer quality in train-from-scratch and finetuned-conversion settings, outperforming prior linear attentions by up to 6 perplexity points on WikiText-103 with causal GPTs, and by up to 8.7 GLUE score points on finetuned bidirectional BERTs.
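The core mechanism behind linear attentions like Hedgehog is replacing softmax attention with a feature map φ applied to queries and keys, so attention factorizes and causal attention runs in O(n) via running sums. The sketch below is illustrative only: Hedgehog learns its softmax-mimicking feature map, whereas here a simple exponential of a random projection stands in, and all dimensions are arbitrary.

```python
import numpy as np

def linear_attention(q, k, v, feature_map):
    """Causal linear attention: O(n) in sequence length.

    Instead of softmax(q k^T) v, apply a feature map phi to q and k so
    attention factorizes as phi(q) (phi(k)^T v), computed with running sums.
    """
    n, _ = q.shape
    phi_q, phi_k = feature_map(q), feature_map(k)    # (n, r)
    out = np.zeros_like(v)
    kv = np.zeros((phi_k.shape[1], v.shape[1]))      # running sum of phi(k)^T v
    z = np.zeros(phi_k.shape[1])                     # running normalizer sum of phi(k)
    for t in range(n):
        kv += np.outer(phi_k[t], v[t])
        z += phi_k[t]
        out[t] = phi_q[t] @ kv / (phi_q[t] @ z + 1e-6)
    return out

# Stand-in feature map (NOT Hedgehog's learned map): exp of a random
# projection, shifted for numerical stability.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 16)) / 4
softmax_like = lambda x: np.exp(x @ W - (x @ W).max(axis=-1, keepdims=True))

q = rng.normal(size=(8, 16))
k = rng.normal(size=(8, 16))
v = rng.normal(size=(8, 16))
out = linear_attention(q, k, v, softmax_like)
print(out.shape)  # (8, 16)
```

Because `kv` and `z` are carried forward step by step, memory and compute per token are constant in sequence length, which is the property these methods exploit.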
1 code implementation • 10 Nov 2023 • Daniel Y. Fu, Hermann Kumbong, Eric Nguyen, Christopher Ré
FlashFFTConv uses a matrix decomposition that computes the FFT using matrix multiply units and enables kernel fusion for long sequences, reducing I/O.
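The general idea of computing an FFT with matrix multiplies can be sketched with a four-step Cooley-Tukey decomposition: a length-N transform with N = N1·N2 becomes two batches of small dense DFT matmuls plus a pointwise twiddle correction. This is a simplified illustration of the principle, not FlashFFTConv's actual Monarch decomposition or its fused GPU kernels.

```python
import numpy as np

def dft_matrix(n):
    # Dense DFT matrix: F[j, k] = exp(-2*pi*i * j*k / n)
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return np.exp(-2j * np.pi * j * k / n)

def fft_via_matmul(x, n1, n2):
    """Four-step FFT of length n1*n2 expressed as matrix multiplies,
    the kind of computation matrix-multiply units accelerate."""
    n = n1 * n2
    X = x.reshape(n1, n2)
    # Step 1: length-n1 DFTs down the columns, as one matmul.
    X = dft_matrix(n1) @ X
    # Step 2: pointwise twiddle factors exp(-2*pi*i * c*b / n).
    c, b = np.meshgrid(np.arange(n1), np.arange(n2), indexing="ij")
    X = X * np.exp(-2j * np.pi * c * b / n)
    # Step 3: length-n2 DFTs across the rows, as one matmul.
    X = X @ dft_matrix(n2)
    # Step 4: transpose to recover standard output ordering.
    return X.T.reshape(n)

x = np.random.default_rng(1).normal(size=16)
assert np.allclose(fft_via_matmul(x, 4, 4), np.fft.fft(x))
```

Recursing this decomposition on the small DFTs yields the familiar O(n log n) FFT while keeping every step a matmul; fusing the steps into one kernel is what cuts the I/O for long sequences.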