Search Results for author: Ehsan Kamalloo

Found 14 papers, 12 papers with code

NoMIRACL: Knowing When You Don't Know for Robust Multilingual Retrieval-Augmented Generation

1 code implementation · 18 Dec 2023 · Nandan Thakur, Luiz Bonifacio, Xinyu Zhang, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Boxing Chen, Mehdi Rezagholizadeh, Jimmy Lin

We measure LLM robustness using two metrics: (i) hallucination rate, which measures the model's tendency to hallucinate an answer when the answer is not present in the passages of the non-relevant subset, and (ii) error rate, which measures the model's failure to recognize relevant passages in the relevant subset.

Hallucination, Language Modelling, +2
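
The two metrics in the NoMIRACL summary above can be illustrated with a minimal sketch. It assumes each model response has already been reduced to a boolean saying whether the model claimed to have an answer; that reduction, and the function and parameter names, are hypothetical and not taken from the paper's code.

```python
from typing import Sequence


def hallucination_rate(claims_answer_nonrelevant: Sequence[bool]) -> float:
    """Fraction of non-relevant-subset queries where the model still gives an answer.

    Assumption: True means the model produced an answer even though none of
    the retrieved passages contains one.
    """
    if not claims_answer_nonrelevant:
        raise ValueError("non-relevant subset is empty")
    return sum(claims_answer_nonrelevant) / len(claims_answer_nonrelevant)


def error_rate(claims_answer_relevant: Sequence[bool]) -> float:
    """Fraction of relevant-subset queries where the model fails to recognize
    that a relevant passage is present (it does not claim an answer)."""
    if not claims_answer_relevant:
        raise ValueError("relevant subset is empty")
    misses = sum(not claimed for claimed in claims_answer_relevant)
    return misses / len(claims_answer_relevant)


# Toy data: 4 queries per subset.
print(hallucination_rate([True, False, False, True]))
print(error_rate([True, True, False, True]))
```

Lower is better for both rates; a model that always abstains drives the first to zero and the second to one, which is why the two are reported together.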

HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution

1 code implementation · 31 Jul 2023 · Ehsan Kamalloo, Aref Jafari, Xinyu Zhang, Nandan Thakur, Jimmy Lin

In this paper, we introduce a new dataset, HAGRID (Human-in-the-loop Attributable Generative Retrieval for Information-seeking Dataset) for building end-to-end generative information-seeking models that are capable of retrieving candidate quotes and generating attributed explanations.

Information Retrieval, Informativeness, +1

Evaluating Open-Domain Question Answering in the Era of Large Language Models

1 code implementation · 11 May 2023 · Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei

The recent success of large language models (LLMs) for QA aggravates lexical matching failures since candidate answers become longer, thereby making matching with the gold answers even more challenging.

Open-Domain Question Answering

Evaluating Embedding APIs for Information Retrieval

no code implementations · 10 May 2023 · Ehsan Kamalloo, Xinyu Zhang, Odunayo Ogundepo, Nandan Thakur, David Alfonso-Hermelo, Mehdi Rezagholizadeh, Jimmy Lin

The ever-increasing size of language models curtails their widespread availability to the community, thereby galvanizing many companies into offering access to large language models through APIs.

Domain Generalization, Information Retrieval, +2

Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information Retrieval

no code implementations · 3 Apr 2023 · Jimmy Lin, David Alfonso-Hermelo, Vitor Jeronymo, Ehsan Kamalloo, Carlos Lassance, Rodrigo Nogueira, Odunayo Ogundepo, Mehdi Rezagholizadeh, Nandan Thakur, Jheng-Hong Yang, Xinyu Zhang

The advent of multilingual language models has generated a resurgence of interest in cross-lingual information retrieval (CLIR), which is the task of searching documents in one language with queries from another.

Cross-Lingual Information Retrieval, Retrieval

Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages

1 code implementation · 18 Oct 2022 · Xinyu Zhang, Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, Jimmy Lin

MIRACL (Multilingual Information Retrieval Across a Continuum of Languages) is a multilingual dataset we have built for the WSDM 2023 Cup challenge that focuses on ad hoc retrieval across 18 different languages, which collectively encompass over three billion native speakers around the world.

Information Retrieval, Retrieval

FaithDial: A Faithful Benchmark for Information-Seeking Dialogue

1 code implementation · 22 Apr 2022 · Nouha Dziri, Ehsan Kamalloo, Sivan Milton, Osmar Zaiane, Mo Yu, Edoardo M. Ponti, Siva Reddy

The goal of information-seeking dialogue is to respond to seeker queries with natural language utterances that are grounded on knowledge sources.

Dialogue Generation, Hallucination

When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation

1 code implementation · Findings (ACL) 2022 · Ehsan Kamalloo, Mehdi Rezagholizadeh, Ali Ghodsi

From a pre-generated pool of augmented samples, Glitter adaptively selects a subset of worst-case samples with maximal loss, analogous to adversarial DA.

Data Augmentation, Knowledge Distillation
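
The selection step described in the Glitter summary above (keeping the maximal-loss samples from a pre-generated pool) can be sketched as a top-k choice by loss. This is only an illustration under stated assumptions: the per-sample losses are taken as given, and the function name, parameters, and toy values are hypothetical rather than the authors' implementation.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_worst_case(pool: Sequence[T], losses: Sequence[float], k: int) -> List[T]:
    """Return the k augmented samples with the highest loss.

    Assumption: losses[i] is the current model's loss on pool[i]; in training
    these would be recomputed as the model changes, which makes the selection
    adaptive.
    """
    if len(pool) != len(losses):
        raise ValueError("pool and losses must have the same length")
    if k < 0:
        raise ValueError("k must be non-negative")
    # Rank sample indices from highest to lowest loss.
    ranked = sorted(range(len(pool)), key=lambda i: losses[i], reverse=True)
    return [pool[i] for i in ranked[:k]]


# Toy pool of four augmented sentences with made-up losses.
augmented = ["aug-a", "aug-b", "aug-c", "aug-d"]
sample_losses = [0.1, 0.9, 0.5, 0.7]
print(select_worst_case(augmented, sample_losses, k=2))
```

Only the selected subset would be used for the training update, so the cost per step depends on k rather than on the size of the pool.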

Augmenting Neural Response Generation with Context-Aware Topical Attention

1 code implementation · WS 2019 · Nouha Dziri, Ehsan Kamalloo, Kory W. Mathewson, Osmar Zaiane

Our model is built upon the basic Seq2Seq model by augmenting it with a hierarchical joint attention mechanism that incorporates topical concepts and previous interactions into the response generation.

Open-Domain Dialog, Response Generation, +1

A Coherent Unsupervised Model for Toponym Resolution

1 code implementation · 4 May 2018 · Ehsan Kamalloo, Davood Rafiei

The evaluation shows that our method outperforms the unsupervised technique as well as Reuters OpenCalais and Google Cloud Natural Language API on all three corpora; also, our method shows a performance close to that of the state-of-the-art supervised method and outperforms it when the test data has 40% or more toponyms that are not seen in the training data.

Toponym Resolution
