Search Results for author: Patrick Fernandes

Found 20 papers, 11 papers with code

CMU’s IWSLT 2022 Dialect Speech Translation System

no code implementations • IWSLT (ACL) 2022 • Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe

We use additional paired Modern Standard Arabic data (MSA) to directly improve the speech recognition (ASR) and machine translation (MT) components of our cascaded systems.

Decoder, Knowledge Distillation, +4

Is Context Helpful for Chat Translation Evaluation?

no code implementations • 13 Mar 2024 • Sweta Agrawal, Amin Farajian, Patrick Fernandes, Ricardo Rei, André F. T. Martins

Our findings show that augmenting neural learned metrics with contextual information helps improve correlation with human judgments in the reference-free scenario and when evaluating translations in out-of-English settings.

Language Modelling, Large Language Model, +2

Tower: An Open Multilingual Large Language Model for Translation-Related Tasks

1 code implementation • 27 Feb 2024 • Duarte M. Alves, José Pombal, Nuno M. Guerreiro, Pedro H. Martins, João Alves, Amin Farajian, Ben Peters, Ricardo Rei, Patrick Fernandes, Sweta Agrawal, Pierre Colombo, José G. C. de Souza, André F. T. Martins

While general-purpose large language models (LLMs) demonstrate proficiency on multiple tasks within the domain of translation, approaches based on open LLMs are competitive only when specializing on a single task.

Language Modelling, Large Language Model, +1

CroissantLLM: A Truly Bilingual French-English Language Model

1 code implementation • 1 Feb 2024 • Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, António Loison, Duarte M. Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro H. Martins, Antoni Bigata Casademunt, François Yvon, André F. T. Martins, Gautier Viaud, Céline Hudelot, Pierre Colombo

We introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens, to bring to the research and industrial community a high-performance, fully open-sourced bilingual model that runs swiftly on consumer-grade local hardware.

Language Modelling, Large Language Model

Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues

1 code implementation • 20 Nov 2023 • Sumire Honda, Patrick Fernandes, Chrysoula Zerva

We make use of Conditional Cross-Mutual Information (CXMI) to explore how much of the context the model uses and generalise CXMI to study the impact of the extra-sentential context.

Machine Translation, NMT, +2

Aligning Neural Machine Translation Models: Human Feedback in Training and Inference

no code implementations • 15 Nov 2023 • Miguel Moura Ramos, Patrick Fernandes, António Farinhas, André F. T. Martins

A core ingredient in RLHF's success in aligning and improving large language models (LLMs) is its reward model, trained using human feedback on model outputs.

Language Modelling, Machine Translation, +1

The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation

no code implementations • 14 Aug 2023 • Patrick Fernandes, Daniel Deutsch, Mara Finkelstein, Parker Riley, André F. T. Martins, Graham Neubig, Ankush Garg, Jonathan H. Clark, Markus Freitag, Orhan Firat

Automatic evaluation of machine translation (MT) is a critical tool driving the rapid iterative development of MT systems.

In-Context Learning, Informativeness, +1

Multi-Dimensional Evaluation of Text Summarization with In-Context Learning

1 code implementation • 1 Jun 2023 • Sameer Jain, Vaishakh Keshava, Swarnashree Mysore Sathyendra, Patrick Fernandes, Pengfei Liu, Graham Neubig, Chunting Zhou

Most frameworks that perform such multi-dimensional evaluation require training on large manually or synthetically generated datasets.

In-Context Learning, Text Generation, +1

Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation

1 code implementation • 17 May 2023 • Markus Freitag, Behrooz Ghorbani, Patrick Fernandes

Recent advances in machine translation (MT) have shown that Minimum Bayes Risk (MBR) decoding can be a powerful alternative to beam search decoding, especially when combined with neural-based utility functions.

Machine Translation

Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation

no code implementations • 1 May 2023 • Patrick Fernandes, Aman Madaan, Emmy Liu, António Farinhas, Pedro Henrique Martins, Amanda Bertsch, José G. C. de Souza, Shuyan Zhou, Tongshuang Wu, Graham Neubig, André F. T. Martins

Many recent advances in natural language generation have been fueled by training large language models on internet-scale data.

Text Generation

ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit

1 code implementation • 10 Apr 2023 • Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polák, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe

ESPnet-ST-v2 is a revamp of the open-source ESPnet-ST toolkit necessitated by the broadening interests of the spoken language translation community.

Benchmarking, Simultaneous Speech-to-Text Translation, +2

Scaling Laws for Multilingual Neural Machine Translation

no code implementations • 19 Feb 2023 • Patrick Fernandes, Behrooz Ghorbani, Xavier Garcia, Markus Freitag, Orhan Firat

Through a novel joint scaling law formulation, we compute the effective number of parameters allocated to each language pair and examine the role of language similarity in the scaling behavior of our models.

Machine Translation, Translation

A Multi-dimensional Evaluation of Tokenizer-free Multilingual Pretrained Models

no code implementations • 13 Oct 2022 • Jimin Sun, Patrick Fernandes, Xinyi Wang, Graham Neubig

Recent work on tokenizer-free multilingual pretrained models shows promising results in improving cross-lingual transfer and reducing engineering overhead (Clark et al., 2022; Xue et al., 2022).

Cross-Lingual Transfer

Quality-Aware Decoding for Neural Machine Translation

1 code implementation • NAACL 2022 • Patrick Fernandes, António Farinhas, Ricardo Rei, José G. C. de Souza, Perez Ogayo, Graham Neubig, André F. T. Martins

Despite the progress in machine translation quality estimation and evaluation in the last years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers around finding the most probable translation according to the model (MAP decoding), approximated with beam search.

Machine Translation, NMT, +1

Learning to Scaffold: Optimizing Model Explanations for Teaching

1 code implementation • 22 Apr 2022 • Patrick Fernandes, Marcos Treviso, Danish Pruthi, André F. T. Martins, Graham Neubig

In this work, leveraging meta-learning techniques, we extend this idea to improve the quality of the explanations themselves, specifically by optimizing explanations such that student models more effectively learn to simulate the original model.

Meta-Learning

Predicting Attention Sparsity in Transformers

no code implementations • spnlp (ACL) 2022 • Marcos Treviso, António Góis, Patrick Fernandes, Erick Fonseca, André F. T. Martins

Transformers' quadratic complexity with respect to the input sequence length has motivated a body of work on efficient sparse approximations to softmax.

Decoder, Language Modelling, +4

When Does Translation Require Context? A Data-driven, Multilingual Exploration

no code implementations • 15 Sep 2021 • Patrick Fernandes, Kayo Yin, Emmy Liu, André F. T. Martins, Graham Neubig

Although proper handling of discourse significantly contributes to the quality of machine translation (MT), these improvements are not adequately measured in common translation quality metrics.

Machine Translation, Translation

Do Context-Aware Translation Models Pay the Right Attention?

1 code implementation • ACL 2021 • Kayo Yin, Patrick Fernandes, Danish Pruthi, Aditi Chaudhary, André F. T. Martins, Graham Neubig

Are models paying large amounts of attention to the same context?

Machine Translation, Translation

Measuring and Increasing Context Usage in Context-Aware Machine Translation

1 code implementation • ACL 2021 • Patrick Fernandes, Kayo Yin, Graham Neubig, André F. T. Martins

Recent work in neural machine translation has demonstrated both the necessity and feasibility of using inter-sentential context -- context from sentences other than those currently being translated.

Document Level Machine Translation, Machine Translation, +1

Structured Neural Summarization

3 code implementations • ICLR 2019 • Patrick Fernandes, Miltiadis Allamanis, Marc Brockschmidt

Summarization of long sequences into a concise statement is a core problem in natural language processing, requiring non-trivial understanding of the input.

Source Code Summarization
