1 code implementation • 18 May 2024 • Victor Agostinelli, Sanghyun Hong, Lizhong Chen
A promising approach to preserving model performance in linearized transformers is to employ position-based re-weighting functions.
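As a rough illustration of the idea, the sketch below implements causal linear attention in which each key's feature vector is additionally scaled by its normalized position. This is a minimal NumPy sketch under stated assumptions: the ELU+1 feature map and the s/L position weighting are illustrative choices, not the paper's actual re-weighting functions.

```python
import numpy as np

def linear_attention_reweighted(q, k, v, eps=1e-6):
    """Causal linear attention with a position-based re-weighting of the keys.

    q, k: (L, d) queries and keys; v: (L, d_v) values.
    phi is an ELU(x)+1 feature map (illustrative assumption); the
    re-weighting here scales key s by s/L (also an assumption).
    """
    L, d = q.shape
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # ELU(x)+1, strictly positive
    q_f, k_f = phi(q), phi(k)

    # Position-based re-weighting: scale key features by their position s/L.
    k_f = k_f * (np.arange(1, L + 1) / L)[:, None]

    out = np.empty_like(v)
    kv = np.zeros((d, v.shape[1]))   # running sum of phi(k_s) v_s^T
    z = np.zeros(d)                  # running sum of phi(k_s), for normalization
    for t in range(L):               # causal prefix sums: only positions s <= t
        kv += np.outer(k_f[t], v[t])
        z += k_f[t]
        out[t] = (q_f[t] @ kv) / (q_f[t] @ z + eps)
    return out
```

Because attention is computed from running prefix sums rather than an L-by-L score matrix, cost grows linearly in sequence length, which is the appeal of linearized transformers that such re-weighting functions aim to preserve accuracy for.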
no code implementations • 16 May 2024 • Matthew Raffel, Victor Agostinelli, Lizhong Chen
Large language models (LLMs) have achieved state-of-the-art performance in various language processing tasks, motivating their adoption in simultaneous translation.
1 code implementation • 7 Dec 2023 • Victor Agostinelli, Max Wild, Matthew Raffel, Kazi Ahmed Asif Fuad, Lizhong Chen
Large language models (LLMs), pretrained on massive amounts of data with billions of parameters, are now capable of near state-of-the-art or better performance on a variety of downstream natural language processing tasks.
no code implementations • 24 Jun 2023 • Tianhong Huang, Victor Agostinelli, Lizhong Chen
Compactness in deep learning can be critical to a model's viability in low-resource applications, and a common approach to extreme model compression is quantization.
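For context, quantization replaces floating-point weights with low-bit integer codes plus a scale factor. The snippet below is a generic sketch of symmetric uniform quantization, offered as background only; it is not the specific compression method studied in the paper.

```python
import numpy as np

def quantize_uniform(w, num_bits=4):
    """Symmetric uniform quantization of a weight tensor to num_bits.

    Returns integer codes and the scale needed to dequantize
    (w_hat = codes * scale).
    """
    qmax = 2 ** (num_bits - 1) - 1                      # e.g. 7 for 4-bit
    max_abs = np.abs(w).max()
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return codes, scale

w = np.random.randn(4, 4).astype(np.float32)
codes, scale = quantize_uniform(w, num_bits=4)
w_hat = codes.astype(np.float32) * scale                # dequantized approximation
print(np.abs(w - w_hat).max())                          # error bounded by ~scale/2
```

At 4 bits, each weight needs half a byte instead of four, an 8x reduction in storage; the trade-off is the rounding error visible in the printed residual, which motivates the "extreme compression" techniques the paper refers to.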
no code implementations • 17 Apr 2023 • Victor Agostinelli, Lizhong Chen
Many natural language processing (NLP) tasks require models that are efficient and small, given their ultimate deployment at the edge or in other resource-constrained environments.