1 code implementation • 18 Dec 2023 • Nandan Thakur, Luiz Bonifacio, Xinyu Zhang, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Boxing Chen, Mehdi Rezagholizadeh, Jimmy Lin
We measure LLM robustness using two metrics: (i) hallucination rate, which measures a model's tendency to hallucinate an answer when the answer is not present in the passages of the non-relevant subset, and (ii) error rate, which measures a model's inability to recognize the relevant passages in the relevant subset.
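As a minimal sketch of how these two rates could be computed, assuming per-query binary judgments over the two evaluation subsets (the function and field layout below are hypothetical, not from the paper):

```python
# Hedged sketch: robustness rates for an LLM judged on two subsets.
def robustness_rates(non_relevant_judgments, relevant_judgments):
    """non_relevant_judgments: list of bools, True if the model hallucinated an
    answer even though no supporting passage exists (non-relevant subset).
    relevant_judgments: list of bools, True if the model failed to recognize
    the passage that does contain the answer (relevant subset)."""
    hallucination_rate = sum(non_relevant_judgments) / len(non_relevant_judgments)
    error_rate = sum(relevant_judgments) / len(relevant_judgments)
    return hallucination_rate, error_rate

# Example: hallucinated on 3 of 10 unanswerable queries, missed the relevant
# passage on 2 of 10 answerable ones.
print(robustness_rates([True] * 3 + [False] * 7, [True] * 2 + [False] * 8))  # (0.3, 0.2)
```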
1 code implementation • 31 Jul 2023 • Ehsan Kamalloo, Aref Jafari, Xinyu Zhang, Nandan Thakur, Jimmy Lin
In this paper, we introduce a new dataset, HAGRID (Human-in-the-loop Attributable Generative Retrieval for Information-seeking Dataset) for building end-to-end generative information-seeking models that are capable of retrieving candidate quotes and generating attributed explanations.
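Purely as an illustration of the task format, one attributed information-seeking example could be organized along these lines (the field names are hypothetical, not the released HAGRID schema):

```python
# Hypothetical shape of one query with candidate quotes and an attributed answer.
example = {
    "query": "When was the Eiffel Tower completed?",
    "quotes": [  # candidate passages retrieved as supporting evidence
        {"id": "p1", "text": "The Eiffel Tower was completed in March 1889."},
    ],
    "answer": "It was completed in March 1889 [p1].",  # explanation citing the quote
}
```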
2 code implementations • 13 Jun 2023 • Ehsan Kamalloo, Nandan Thakur, Carlos Lassance, Xueguang Ma, Jheng-Hong Yang, Jimmy Lin
BEIR is a benchmark dataset for zero-shot evaluation of information retrieval models across 18 different domain/task combinations.
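For context, the usual zero-shot evaluation loop with the public beir package looks roughly like this; the dataset and dense model below are common examples, not necessarily those studied in the paper:

```python
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

# Download one BEIR dataset (SciFact) and load its corpus, queries, and qrels.
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip"
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

# Zero-shot evaluation of an off-the-shelf dense retriever.
model = DRES(models.SentenceBERT("msmarco-distilbert-base-tas-b"), batch_size=16)
retriever = EvaluateRetrieval(model, score_function="dot")
results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
```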
1 code implementation • 11 May 2023 • Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei
The recent success of large language models (LLMs) for QA aggravates lexical matching failures since candidate answers become longer, thereby making matching with the gold answers even more challenging.
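A toy illustration of the failure mode: with a long LLM response, strict exact match misses an answer that is plainly contained in the output (the normalization helper below is a standard SQuAD-style heuristic, not the paper's evaluator):

```python
import re
import string

def normalize(text):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

gold = "Ottawa"
llm_answer = "The capital of Canada is Ottawa, which sits on the Ottawa River."

exact_match = normalize(llm_answer) == normalize(gold)    # False: lexical match fails
lenient_match = normalize(gold) in normalize(llm_answer)  # True: answer is present
print(exact_match, lenient_match)
```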
no code implementations • 10 May 2023 • Ehsan Kamalloo, Xinyu Zhang, Odunayo Ogundepo, Nandan Thakur, David Alfonso-Hermelo, Mehdi Rezagholizadeh, Jimmy Lin
The ever-increasing size of language models curtails their widespread availability to the community, thereby galvanizing many companies into offering access to large language models through APIs.
no code implementations • 3 Apr 2023 • Jimmy Lin, David Alfonso-Hermelo, Vitor Jeronymo, Ehsan Kamalloo, Carlos Lassance, Rodrigo Nogueira, Odunayo Ogundepo, Mehdi Rezagholizadeh, Nandan Thakur, Jheng-Hong Yang, Xinyu Zhang
The advent of multilingual language models has generated a resurgence of interest in cross-lingual information retrieval (CLIR), which is the task of searching documents in one language with queries from another.
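A concrete (hypothetical) picture of the CLIR setup: a multilingual encoder embeds a query in one language and documents in another into a shared space and ranks by similarity. The model name below is just one publicly available choice, not necessarily one used in the paper:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

query = "Who wrote Don Quixote?"  # English query
docs = [
    "Don Quijote fue escrito por Miguel de Cervantes.",  # Spanish document
    "La tour Eiffel se trouve à Paris.",                 # French document
]

q_emb = model.encode(query, convert_to_tensor=True)
d_emb = model.encode(docs, convert_to_tensor=True)
scores = util.cos_sim(q_emb, d_emb)  # higher score = better cross-lingual match
print(scores)
```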
1 code implementation • 18 Oct 2022 • Xinyu Zhang, Nandan Thakur, Odunayo Ogundepo, Ehsan Kamalloo, David Alfonso-Hermelo, Xiaoguang Li, Qun Liu, Mehdi Rezagholizadeh, Jimmy Lin
MIRACL (Multilingual Information Retrieval Across a Continuum of Languages) is a multilingual dataset we have built for the WSDM 2023 Cup challenge that focuses on ad hoc retrieval across 18 different languages, which collectively encompass over three billion native speakers around the world.
1 code implementation • ACM International Conference on Information & Knowledge Management (CIKM) 2022 • Mehdi Akbarian Rastaghi, Ehsan Kamalloo, Davood Rafiei
The paradigm of fine-tuning Pre-trained Language Models (PLMs) has been successful in Entity Matching (EM).
Ranked #3 on Entity Resolution on Amazon-Google
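The fine-tuning recipe typically casts entity matching as binary sequence-pair classification; a minimal sketch with HuggingFace Transformers follows (the backbone and record serialization are illustrative, not the paper's exact setup):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Serialize two entity records and let the PLM judge whether they refer to the same entity.
left = "title: Canon EOS 5D camera price: 2499"
right = "title: Canon EOS 5D DSLR price: 2450"
inputs = tokenizer(left, right, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
prob_match = torch.softmax(logits, dim=-1)[0, 1].item()  # probability of "match"
print(prob_match)
```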
1 code implementation • 22 Apr 2022 • Nouha Dziri, Ehsan Kamalloo, Sivan Milton, Osmar Zaiane, Mo Yu, Edoardo M. Ponti, Siva Reddy
The goal of information-seeking dialogue is to respond to seeker queries with natural language utterances that are grounded on knowledge sources.
1 code implementation • Findings (ACL) 2022 • Ehsan Kamalloo, Mehdi Rezagholizadeh, Ali Ghodsi
From a pre-generated pool of augmented samples, Glitter adaptively selects a subset of worst-case samples with maximal loss, analogous to adversarial data augmentation (DA).
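In spirit, the selection step can be sketched as scoring every augmented candidate with the current model's loss and keeping only the hardest ones (a simplified sketch; the pool, k, and loss are placeholders rather than the paper's exact procedure):

```python
import torch
import torch.nn.functional as F

def select_worst_case(model, augmented_inputs, labels, k):
    """Keep the k augmented samples (encoded as a tensor batch) on which the
    current model incurs the highest loss."""
    with torch.no_grad():
        logits = model(augmented_inputs)
        losses = F.cross_entropy(logits, labels, reduction="none")  # per-sample loss
    top_k = torch.topk(losses, k).indices
    return augmented_inputs[top_k], labels[top_k]
```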
1 code implementation • Findings (ACL) 2021 • Ehsan Kamalloo, Mehdi Rezagholizadeh, Peyman Passban, Ali Ghodsi
We exploit a semi-supervised approach based on knowledge distillation (KD) to train a model on augmented data.
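A minimal sketch of a KD objective on augmented (possibly unlabeled) samples: the student is trained to match the teacher's softened output distribution. The temperature below is an illustrative default, not the paper's configuration:

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation loss on augmented samples (no gold labels needed)."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between teacher and student distributions, scaled by T^2.
    return F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2
```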
1 code implementation • NAACL 2019 • Nouha Dziri, Ehsan Kamalloo, Kory W. Mathewson, Osmar Zaiane
Evaluating open-domain dialogue systems is difficult due to the diversity of possible correct answers.
1 code implementation • WS 2019 • Nouha Dziri, Ehsan Kamalloo, Kory W. Mathewson, Osmar Zaiane
Our model builds upon the basic Seq2Seq model, augmenting it with a hierarchical joint attention mechanism that incorporates topical concepts and previous interactions into response generation.
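One way to picture the joint attention idea (not the paper's exact architecture) is a module that attends separately over previous-interaction states and topical concept embeddings, then fuses both with the decoder state:

```python
import torch
import torch.nn as nn

class JointAttention(nn.Module):
    """Illustrative joint attention over (i) context states from previous
    interactions and (ii) topical concept embeddings, fused per decoding step."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.ctx_attn = nn.MultiheadAttention(hidden_dim, num_heads=1, batch_first=True)
        self.topic_attn = nn.MultiheadAttention(hidden_dim, num_heads=1, batch_first=True)
        self.combine = nn.Linear(3 * hidden_dim, hidden_dim)

    def forward(self, decoder_state, context_states, topic_embeddings):
        # decoder_state: (batch, 1, hidden); context/topic: (batch, seq, hidden)
        ctx_vec, _ = self.ctx_attn(decoder_state, context_states, context_states)
        topic_vec, _ = self.topic_attn(decoder_state, topic_embeddings, topic_embeddings)
        fused = torch.cat([decoder_state, ctx_vec, topic_vec], dim=-1)
        return torch.tanh(self.combine(fused))
```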
1 code implementation • 4 May 2018 • Ehsan Kamalloo, Davood Rafiei
The evaluation shows that our method outperforms the unsupervised technique, as well as Reuters OpenCalais and the Google Cloud Natural Language API, on all three corpora. Our method also performs close to the state-of-the-art supervised method, and outperforms it when 40% or more of the toponyms in the test data are unseen in the training data.