1 code implementation • 18 May 2024 • Victor Agostinelli, Sanghyun Hong, Lizhong Chen
A promising approach to preserving model performance in linearized transformers is to employ position-based re-weighting functions.
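As a rough illustration of the idea, the sketch below implements causal linear attention in which each key's feature vector is additionally scaled by its normalized position. This is a minimal NumPy sketch under stated assumptions: the ELU+1 feature map and the s/L position weighting are illustrative choices, not the paper's actual re-weighting functions.

```python
import numpy as np

def linear_attention_reweighted(q, k, v, eps=1e-6):
    """Causal linear attention with a position-based re-weighting of the keys.

    q, k: (L, d) queries and keys; v: (L, d_v) values.
    phi is an ELU(x)+1 feature map (illustrative assumption); the
    re-weighting here scales key s by s/L (also an assumption).
    """
    L, d = q.shape
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # ELU(x)+1, strictly positive
    q_f, k_f = phi(q), phi(k)

    # Position-based re-weighting: scale key features by their position s/L.
    k_f = k_f * (np.arange(1, L + 1) / L)[:, None]

    out = np.empty_like(v)
    kv = np.zeros((d, v.shape[1]))   # running sum of phi(k_s) v_s^T
    z = np.zeros(d)                  # running sum of phi(k_s), for normalization
    for t in range(L):               # causal prefix sums: only positions s <= t
        kv += np.outer(k_f[t], v[t])
        z += k_f[t]
        out[t] = (q_f[t] @ kv) / (q_f[t] @ z + eps)
    return out
```

Because attention is computed from running prefix sums rather than an L-by-L score matrix, cost grows linearly in sequence length, which is the appeal of linearized transformers that such re-weighting functions aim to preserve accuracy for.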
no code implementations • 16 May 2024 • Matthew Raffel, Victor Agostinelli, Lizhong Chen
Large language models (LLMs) have achieved state-of-the-art performance in various language processing tasks, motivating their adoption in simultaneous translation.
1 code implementation • 7 Dec 2023 • Victor Agostinelli, Max Wild, Matthew Raffel, Kazi Ahmed Asif Fuad, Lizhong Chen
Large language models (LLMs), pretrained on massive amounts of data with billions of parameters, are now capable of near state-of-the-art or better performance on a variety of downstream natural language processing tasks.
no code implementations • 24 Jun 2023 • Tianhong Huang, Victor Agostinelli, Lizhong Chen
Compactness in deep learning can be critical to a model's viability in low-resource applications, and a common approach to extreme model compression is quantization.
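For context, quantization replaces floating-point weights with low-bit integer codes plus a scale factor. The snippet below is a generic sketch of symmetric uniform quantization, offered as background only; it is not the specific compression method studied in the paper.

```python
import numpy as np

def quantize_uniform(w, num_bits=4):
    """Symmetric uniform quantization of a weight tensor to num_bits.

    Returns integer codes and the scale needed to dequantize
    (w_hat = codes * scale).
    """
    qmax = 2 ** (num_bits - 1) - 1                      # e.g. 7 for 4-bit
    max_abs = np.abs(w).max()
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return codes, scale

w = np.random.randn(4, 4).astype(np.float32)
codes, scale = quantize_uniform(w, num_bits=4)
w_hat = codes.astype(np.float32) * scale                # dequantized approximation
print(np.abs(w - w_hat).max())                          # error bounded by ~scale/2
```

At 4 bits, each weight needs half a byte instead of four, an 8x reduction in storage; the trade-off is the rounding error visible in the printed residual, which motivates the "extreme compression" techniques the paper refers to.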
no code implementations • 17 Apr 2023 • Victor Agostinelli, Lizhong Chen
Many natural language processing (NLP) tasks require models that are efficient and small, given their ultimate deployment at the edge or in other resource-constrained environments.