Search Results for author: Dániel Simig

Found 2 papers, 1 paper with code

MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers

No code implementations. NeurIPS 2023. Lili Yu, Dániel Simig, Colin Flaherty, Armen Aghajanyan, Luke Zettlemoyer, Mike Lewis

Autoregressive transformers are spectacular models for short sequences but scale poorly to long sequences such as high-resolution images, podcasts, code, or books.

Tasks: Density Estimation, Language Modelling

SemDeDup: Data-efficient learning at web-scale through semantic deduplication

1 code implementation. 16 Mar 2023. Amro Abbas, Kushal Tirumala, Dániel Simig, Surya Ganguli, Ari S. Morcos

Analyzing a subset of LAION, we show that SemDeDup can remove 50% of the data with minimal performance loss, effectively halving training time.
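The core idea of semantic deduplication is to compare examples in embedding space and discard near-duplicates whose cosine similarity exceeds a threshold. The sketch below is a simplified greedy illustration of that idea, not the paper's implementation (which clusters embeddings with k-means and deduplicates within clusters for scalability); the function name, threshold value, and toy vectors are all assumptions for illustration.

```python
import numpy as np

def semantic_dedup(embeddings, threshold=0.9):
    """Greedy semantic deduplication sketch (hypothetical helper):
    keep an example only if its cosine similarity to every
    already-kept example is below `threshold`."""
    # Normalize rows so plain dot products equal cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    kept = []
    for i, vec in enumerate(normed):
        if all(float(vec @ normed[j]) < threshold for j in kept):
            kept.append(i)
    return kept

# Toy example: rows 0 and 1 are near-duplicates, row 2 is distinct.
embs = np.array([
    [1.0, 0.0],
    [0.99, 0.05],  # near-duplicate of row 0 (cosine sim ~0.999)
    [0.0, 1.0],
])
print(semantic_dedup(embs))  # → [0, 2]
```

At web scale, the quadratic pairwise comparison above is infeasible, which is why the paper first partitions embeddings into clusters and only compares within each cluster.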
