1 code implementation • 6 Feb 2024 • Ashok Vardhan Makkuva, Marco Bondaschi, Adway Girish, Alliot Nagle, Martin Jaggi, Hyeji Kim, Michael Gastpar
Inspired by the Markovianity of natural languages, we model the data as a Markovian source and utilize this framework to systematically study the interplay between the data-distributional properties, the transformer architecture, the learnt distribution, and the final model performance.
no code implementations • 6 Feb 2024 • Marco Bondaschi, Michael Gastpar
Large language models (LLMs) have recently gained much popularity due to their surprising ability to generate human-like English sentences.
no code implementations • 19 Oct 2023 • Ashok Vardhan Makkuva, Marco Bondaschi, Thijs Vogels, Martin Jaggi, Hyeji Kim, Michael C. Gastpar
On the latter, we obtain $50$-$64 \%$ improvement in perplexity over our baselines for noisy channels.
no code implementations • 25 Jan 2021 • Marco Bondaschi, Albert Guillén i Fàbregas, Marco Dalai
We derive an upper bound on the reliability function of mismatched decoding for zero-rate codes.
Information Theory