Search Results for author: Victor Agostinelli

Found 3 papers, 1 paper with code

Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language Models

1 code implementation • 7 Dec 2023 • Victor Agostinelli, Max Wild, Matthew Raffel, Kazi Ahmed Asif Fuad, Lizhong Chen

Large language models (LLMs) with billions of parameters and pretrained on massive amounts of data are now capable of performance near or better than the state of the art in a variety of downstream natural language processing tasks.

Machine Translation • NMT • +1
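
As a rough illustration of what simultaneous translation involves (not the Simul-LLM framework itself), the sketch below shows a generic wait-k decoding loop, in which the translator reads k source tokens before emitting each target token. The names model.generate_next_token and source_stream are hypothetical placeholders, not part of the Simul-LLM codebase.

def wait_k_translate(model, source_stream, k=3, max_len=128):
    # Generic wait-k policy: read k source tokens before writing anything,
    # then emit one target token per additional source token read.
    source, target = [], []
    for token in source_stream:
        source.append(token)
        if len(source) >= k and len(target) < max_len:
            target.append(model.generate_next_token(source, target))
    # Source exhausted: finish the translation with the full source context.
    while len(target) < max_len:
        next_token = model.generate_next_token(source, target)
        target.append(next_token)
        if next_token == "<eos>":
            break
    return target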

Partitioning-Guided K-Means: Extreme Empty Cluster Resolution for Extreme Model Compression

no code implementations • 24 Jun 2023 • Tianhong Huang, Victor Agostinelli, Lizhong Chen

Compactness in deep learning can be critical to a model's viability in low-resource applications, and a common approach to extreme model compression is quantization.

Model Compression • Quantization
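
For context on the underlying technique only (this is not the partitioning-guided algorithm from the paper), here is a minimal NumPy sketch of k-means weight quantization, showing how empty clusters can appear and waste codebook entries at extreme compression ratios. The cluster count, iteration count, and data are made up.

import numpy as np

def kmeans_quantize(weights, n_clusters=16, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    w = weights.reshape(-1)
    # Initialize centroids from randomly chosen weight values.
    centroids = rng.choice(w, size=n_clusters, replace=False)
    for _ in range(n_iters):
        # Assign every weight to its nearest centroid.
        assignments = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
        for c in range(n_clusters):
            members = w[assignments == c]
            if members.size == 0:
                # Empty cluster: the naive fallback keeps the stale centroid,
                # wasting one of the few available quantization levels.
                continue
            centroids[c] = members.mean()
    # Each weight is replaced by its centroid; storage drops to about
    # log2(n_clusters) bits per weight plus a small codebook.
    return centroids[assignments].reshape(weights.shape), centroids

quantized, codebook = kmeans_quantize(np.random.default_rng(1).normal(size=(64, 64)))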

Improving Autoregressive NLP Tasks via Modular Linearized Attention

no code implementations • 17 Apr 2023 • Victor Agostinelli, Lizhong Chen

Various natural language processing (NLP) tasks require models that are efficient and small enough for their ultimate application at the edge or in other resource-constrained environments.

Computational Efficiency • Machine Translation • +2
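
For reference, the sketch below shows generic linearized attention via a kernel feature map, in the style of Katharopoulos et al. (2020); it is not the modular variant proposed in this paper, only an illustration of the reformulation that reduces attention cost from quadratic to linear in sequence length. Shapes and data are made up.

import numpy as np

def elu_feature_map(x):
    # phi(x) = elu(x) + 1 keeps features positive so the normalizer stays valid.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(q, k, v, eps=1e-6):
    # Softmax-free attention in O(N * d^2) instead of O(N^2 * d).
    q, k = elu_feature_map(q), elu_feature_map(k)
    kv = k.T @ v                      # (d, d_v) summary of keys and values
    z = q @ k.sum(axis=0) + eps       # (N,) per-query normalization terms
    return (q @ kv) / z[:, None]      # (N, d_v) attention output

rng = np.random.default_rng(0)
N, d = 8, 4
out = linear_attention(rng.normal(size=(N, d)), rng.normal(size=(N, d)), rng.normal(size=(N, d)))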
