1 code implementation • 7 Dec 2023 • Victor Agostinelli, Max Wild, Matthew Raffel, Kazi Ahmed Asif Fuad, Lizhong Chen
Large language models (LLMs) with billions of parameters, pretrained on massive amounts of data, now achieve performance near or better than the state of the art on a variety of downstream natural language processing tasks.
1 code implementation • 3 Jul 2023 • Matthew Raffel, Lizhong Chen
Experiments on the MuST-C dataset show that the Implicit Memory Transformer provides a substantial speedup on the encoder forward pass with nearly identical translation quality when compared with the state-of-the-art approach that employs both left context and memory banks.
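The baseline mentioned in this entry processes speech segment by segment, letting each segment attend to raw left-context frames plus compressed memory-bank vectors summarizing earlier segments. The sketch below illustrates that baseline mechanism only (not the Implicit Memory Transformer itself); the segment length, context size, mean-pooled memory summaries, and all function names are illustrative assumptions rather than details from the paper.

```python
# Hedged sketch of segment-based self-attention with left context and
# memory banks, the baseline setup referenced in the abstract above.
# All sizes and the mean-pooled memory summary are assumptions.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def segment_attention(frames, seg_len=8, left_ctx=4, d=16, seed=0):
    """Encode `frames` (T, d) one segment at a time.

    Each segment's queries attend to: pooled memory banks from past
    segments, a short window of raw left-context frames, and the
    segment's own frames.
    """
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) * d**-0.5 for _ in range(3))
    memory_banks = []   # one pooled summary vector per processed segment
    outputs = []
    for start in range(0, len(frames), seg_len):
        seg = frames[start:start + seg_len]                 # current segment
        left = frames[max(0, start - left_ctx):start]       # raw left context
        mem = np.stack(memory_banks) if memory_banks else np.zeros((0, d))
        context = np.concatenate([mem, left, seg], axis=0)  # keys/values
        q, k, v = seg @ Wq, context @ Wk, context @ Wv
        attn = softmax(q @ k.T / np.sqrt(d))
        out = attn @ v
        outputs.append(out)
        memory_banks.append(out.mean(axis=0))  # compress segment into one bank
    return np.concatenate(outputs, axis=0)

# Example: a 32-frame utterance with 16-dimensional features.
speech = np.random.default_rng(1).standard_normal((32, 16))
print(segment_attention(speech).shape)  # (32, 16)
```

Because every segment re-attends over the accumulated memory banks and left context, the encoder forward pass grows with utterance length; the speedup reported in this entry comes from replacing that extra context with a cheaper implicit memory.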
1 code implementation • 3 Jul 2023 • Matthew Raffel, Drew Penney, Lizhong Chen
Transformer models using segment-based processing have been effective architectures for simultaneous speech translation.