Search Results for author: David Alfonso-Hermelo

Found 6 papers, 2 papers with code

Refining an Almost Clean Translation Memory Helps Machine Translation

no code implementations AMTA 2022 Shivendra Bhardwaj, David Alfonso-Hermelo, Philippe Langlais, Gabriel Bernier-Colborne, Cyril Goutte, Michel Simard

While recent studies have been dedicated to cleaning very noisy parallel corpora to improve Machine Translation training, we focus in this work on filtering a large and mostly clean Translation Memory.

Machine Translation, Translation

NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation

1 code implementation 18 Dec 2023 Nandan Thakur, Luiz Bonifacio, Xinyu Zhang, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Boxing Chen, Mehdi Rezagholizadeh, Jimmy Lin

We measure LLM robustness using two metrics: (i) hallucination rate, measuring model tendency to hallucinate an answer, when the answer is not present in passages in the non-relevant subset, and (ii) error rate, measuring model inaccuracy to recognize relevant passages in the relevant subset.

Hallucination, Language Modelling, +2
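
The two robustness metrics named in the NoMIRACL summary above can be illustrated with a minimal sketch. It assumes each model response has already been reduced to a boolean (True when the model claims the passages contain an answer); the function names and this boolean encoding are hypothetical and are not taken from the NoMIRACL code release.

```python
from typing import Sequence


def hallucination_rate(claims_answer: Sequence[bool]) -> float:
    """Share of queries in the NON-RELEVANT subset where the model
    still claimed that the passages contain an answer."""
    if not claims_answer:
        raise ValueError("need at least one prediction")
    return sum(claims_answer) / len(claims_answer)


def error_rate(claims_answer: Sequence[bool]) -> float:
    """Share of queries in the RELEVANT subset where the model
    failed to recognize that a relevant passage was present."""
    if not claims_answer:
        raise ValueError("need at least one prediction")
    return sum(not claim for claim in claims_answer) / len(claims_answer)


# Illustrative (made-up) predictions, one boolean per query.
non_relevant_preds = [True, False, False, True]   # two hallucinated answers
relevant_preds = [True, True, False, True]        # one missed relevant passage

h_rate = hallucination_rate(non_relevant_preds)
e_rate = error_rate(relevant_preds)
```

Under this encoding, lower is better for both rates: a model that always abstains scores zero on hallucination but the maximum on error, which is why the summary reports the two together.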

Evaluating Embedding APIs for Information Retrieval

no code implementations 10 May 2023 Ehsan Kamalloo, Xinyu Zhang, Odunayo Ogundepo, Nandan Thakur, David Alfonso-Hermelo, Mehdi Rezagholizadeh, Jimmy Lin

The ever-increasing size of language models curtails their widespread availability to the community, thereby galvanizing many companies into offering access to large language models through APIs.

Domain Generalization, Information Retrieval, +2

Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information Retrieval

no code implementations 3 Apr 2023 Jimmy Lin, David Alfonso-Hermelo, Vitor Jeronymo, Ehsan Kamalloo, Carlos Lassance, Rodrigo Nogueira, Odunayo Ogundepo, Mehdi Rezagholizadeh, Nandan Thakur, Jheng-Hong Yang, Xinyu Zhang

The advent of multilingual language models has generated a resurgence of interest in cross-lingual information retrieval (CLIR), which is the task of searching documents in one language with queries from another.

Cross-Lingual Information Retrieval, Retrieval

Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages

1 code implementation 18 Oct 2022 Xinyu Zhang, Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, Jimmy Lin

MIRACL (Multilingual Information Retrieval Across a Continuum of Languages) is a multilingual dataset we have built for the WSDM 2023 Cup challenge that focuses on ad hoc retrieval across 18 different languages, which collectively encompass over three billion native speakers around the world.

Information Retrieval, Retrieval

NATURE: Natural Auxiliary Text Utterances for Realistic Spoken Language Evaluation

no code implementations 9 Nov 2021 David Alfonso-Hermelo, Ahmad Rashid, Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh

We apply NATURE to common slot-filling and intent detection benchmarks and demonstrate that simple perturbations from the standard evaluation set by NATURE can deteriorate model performance significantly.

Intent Detection, slot-filling, +1
