Search Results for author: Hermann Kumbong

Found 2 papers, 1 paper with code

The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry

no code implementations • 6 Feb 2024 • Michael Zhang, Kush Bhatia, Hermann Kumbong, Christopher Ré

Experiments show Hedgehog recovers over 99% of standard Transformer quality in train-from-scratch and finetuned-conversion settings, outperforming prior linear attentions by up to 6 perplexity points on WikiText-103 with causal GPTs, and by up to 8.7 GLUE score points on finetuned bidirectional BERTs.
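The idea behind the paper is to replace softmax attention with linear attention whose feature maps are trained to mimic softmax attention weights. As a minimal sketch of the generic linear-attention computation, the NumPy snippet below uses an exponential of a fixed random projection as a hypothetical stand-in for Hedgehog's learned feature maps; `feature_map`, `w`, and all shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def feature_map(x, w):
    # Hypothetical stand-in for Hedgehog's trained feature maps: an
    # exponential of a linear projection keeps the kernel weights
    # positive, letting them imitate softmax attention weights.
    return np.exp(x @ w)

def linear_attention(q, k, v, w):
    """Non-causal linear attention: O(n*d^2) instead of softmax's O(n^2*d)."""
    fq, fk = feature_map(q, w), feature_map(k, w)
    kv = fk.T @ v                     # (d, d_v): shared across all queries
    z = fq @ fk.sum(axis=0)           # (n,): per-query normalizer
    return (fq @ kv) / z[:, None]

rng = np.random.default_rng(0)
n, d = 8, 4
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
w = 0.1 * rng.standard_normal((d, d))  # scaled to keep exp() well-behaved
out = linear_attention(q, k, v, w)     # (n, d)
```

Because fk.T @ v is computed once and reused for every query, the cost grows linearly in sequence length rather than quadratically.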

FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores

1 code implementation • 10 Nov 2023 • Daniel Y. Fu, Hermann Kumbong, Eric Nguyen, Christopher Ré

FlashFFTConv uses a matrix decomposition that computes the FFT using matrix multiply units and enables kernel fusion for long sequences, reducing I/O.
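The decomposition in question is the classic Cooley-Tukey factorization: a length-N FFT splits into two batches of smaller DFTs joined by an elementwise twiddle step, so each stage becomes a dense matrix multiply that can run on matrix-multiply hardware. The NumPy sketch below shows one such split with dense DFT matrices and a single factorization N = N1·N2; FlashFFTConv's actual kernels, precision handling, and fusion strategy are considerably more involved.

```python
import numpy as np

def dft_matrix(n):
    """Dense DFT matrix F[j, k] = exp(-2*pi*i*j*k / n)."""
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return np.exp(-2j * np.pi * j * k / n)

def fft_via_matmul(x, n1, n2):
    """Length n1*n2 FFT as two matrix multiplies plus an elementwise twiddle."""
    n = n1 * n2
    A = x.reshape(n1, n2)                 # A[j1, j2] = x[n2*j1 + j2]
    B = dft_matrix(n1) @ A                # length-n1 DFTs down the columns
    twiddle = np.exp(-2j * np.pi *
                     np.outer(np.arange(n1), np.arange(n2)) / n)
    C = (B * twiddle) @ dft_matrix(n2).T  # length-n2 DFTs across the rows
    return C.flatten(order="F")           # X[k1 + n1*k2] = C[k1, k2]

x = np.random.randn(16)
assert np.allclose(fft_via_matmul(x, 4, 4), np.fft.fft(x))
```

Each stage is a plain matmul, which is what lets the FFT be mapped onto tensor cores and fused with the pointwise multiply of an FFT-based long convolution.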
