no code implementations • COLING 2022 • Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz, Vittorio Castelli, Heng Ji, Avirup Sil
Research on neural IR has so far been focused primarily on standard supervised learning settings, where it outperforms traditional term matching baselines.
no code implementations • 27 Feb 2024 • Keshav Ramji, Young-suk Lee, Ramón Fernandez Astudillo, Md Arafat Sultan, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos
It is often desirable for Large Language Models (LLMs) to capture multiple objectives when providing a response.
no code implementations • 19 Feb 2024 • Md Arafat Sultan, Jatin Ganhotra, Ramón Fernandez Astudillo
We introduce a structured chain-of-thought (SCoT) prompting approach to generating content-grounded multi-turn question-answer conversations using a pre-trained large language model (LLM).
no code implementations • 12 Jan 2024 • Md Arafat Sultan, Aashka Trivedi, Parul Awasthy, Avirup Sil
We present a large-scale empirical study of how choices of configuration parameters affect performance in knowledge distillation (KD).
no code implementations • 15 Nov 2023 • Jiachen Zhao, Wenlong Zhao, Andrew Drozdov, Benjamin Rozonoyer, Md Arafat Sultan, Jay-Yoon Lee, Mohit Iyyer, Andrew McCallum
In this paper, we present the discovery that a student model distilled from a few-shot prompted LLM can commonly generalize better than its teacher to unseen examples on such tasks.
1 code implementation • 21 Oct 2023 • Young-suk Lee, Md Arafat Sultan, Yousef El-Kurdi, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos, Ramón Fernandez Astudillo
Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision.
no code implementations • 19 May 2023 • Revanth Gangi Reddy, Pradeep Dasigi, Md Arafat Sultan, Arman Cohan, Avirup Sil, Heng Ji, Hannaneh Hajishirzi
Neural information retrieval often adopts a retrieve-and-rerank framework: a bi-encoder network first retrieves K (e.g., 100) candidates that are then re-ranked using a more powerful cross-encoder model to rank the better candidates higher.
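A minimal sketch of the retrieve-and-rerank pattern described above, not the paper's method. The functions `embed`, `first_stage_score`, and `joint_score` are hypothetical toy stand-ins for a bi-encoder and a cross-encoder, and the documents are invented for illustration. The point is only the control flow: a cheap scorer over independently encoded texts picks K candidates, and a costlier scorer that reads the query and document together reorders just those K.

```python
from collections import Counter
from typing import Callable, List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for a bi-encoder: encodes a text on its own as token counts."""
    return Counter(text.lower().split())


def first_stage_score(query_vec: Counter, doc_vec: Counter) -> float:
    """Dot product between independently computed query and document vectors."""
    return float(sum(count * doc_vec[token] for token, count in query_vec.items()))


def joint_score(query: str, doc: str) -> float:
    """Toy stand-in for a cross-encoder: scores the (query, document) pair jointly.

    It counts query bigrams that occur in the document, a word-order signal
    that the independent token-count vectors above cannot capture.
    """
    q = query.lower().split()
    d = doc.lower().split()
    doc_bigrams = set(zip(d, d[1:]))
    return float(sum(1 for bigram in zip(q, q[1:]) if bigram in doc_bigrams))


def retrieve_and_rerank(
    query: str,
    docs: List[str],
    k: int,
    reranker: Callable[[str, str], float] = joint_score,
) -> List[Tuple[str, float]]:
    query_vec = embed(query)
    # Stage 1: cheap retrieval of the top-k candidates over the whole collection.
    candidates = sorted(
        docs,
        key=lambda doc: first_stage_score(query_vec, embed(doc)),
        reverse=True,
    )[:k]
    # Stage 2: the costlier joint scorer is applied only to those k candidates.
    scored = [(doc, reranker(query, doc)) for doc in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented example collection (assumption, for illustration only).
DOCS = [
    "retrieval retrieval neural neural",
    "neural retrieval with dense vectors",
    "cooking pasta at home",
]
RESULTS = retrieve_and_rerank("neural retrieval", DOCS, k=2)
```

In this toy run the first document wins stage 1 on raw term overlap, while the reranker promotes the second because it contains the query phrase in order; the unrelated third document never reaches stage 2.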
1 code implementation • 1 Mar 2023 • Jon Saad-Falcon, Omar Khattab, Keshav Santhanam, Radu Florian, Martin Franz, Salim Roukos, Avirup Sil, Md Arafat Sultan, Christopher Potts
Many information retrieval tasks require large labeled datasets for fine-tuning.
no code implementations • 30 Jan 2023 • Md Arafat Sultan
Although knowledge distillation (KD) was originally proposed as a method for knowledge transfer from one model to another, some recent studies have suggested that it is in fact a form of regularization.
1 code implementation • 23 Jan 2023 • Avirup Sil, Jaydeep Sen, Bhavani Iyer, Martin Franz, Kshitij Fadnis, Mihaela Bornea, Sara Rosenthal, Scott McCarley, Rong Zhang, Vishwajeet Kumar, Yulong Li, Md Arafat Sultan, Riyaz Bhat, Radu Florian, Salim Roukos
The field of Question Answering (QA) has made remarkable progress in recent years, thanks to the advent of large pre-trained language models, newer realistic benchmark datasets with leaderboards, and novel algorithms for key components such as retrievers and readers.
no code implementations • 2 Dec 2022 • Keshav Santhanam, Jon Saad-Falcon, Martin Franz, Omar Khattab, Avirup Sil, Radu Florian, Md Arafat Sultan, Salim Roukos, Matei Zaharia, Christopher Potts
Neural information retrieval (IR) systems have progressed rapidly in recent years, in large part due to the release of publicly available benchmarking tasks.
1 code implementation • 29 Nov 2022 • Ameet Deshpande, Md Arafat Sultan, Anthony Ferritto, Ashwin Kalyan, Karthik Narasimhan, Avirup Sil
Fine-tuning pre-trained language models (PLMs) achieves impressive performance on a range of downstream tasks, and their sizes have consequently been getting bigger.
no code implementations • 16 Jun 2022 • Scott McCarley, Mihaela Bornea, Sara Rosenthal, Anthony Ferritto, Md Arafat Sultan, Avirup Sil, Radu Florian
Recent machine reading comprehension datasets include extractive and boolean questions but current approaches do not offer integrated support for answering both question types.
no code implementations • 15 May 2022 • Md Arafat Sultan, Avirup Sil, Radu Florian
Machine learning models are prone to overfitting their training (source) domains, which is commonly believed to be the reason why they falter in novel target domains.
1 code implementation • 24 Apr 2022 • Revanth Gangi Reddy, Md Arafat Sultan, Martin Franz, Avirup Sil, Heng Ji
On two public IR benchmarks, we empirically show that the proposed method helps improve both the model's attention patterns and retrieval performance, including in zero-shot settings.
no code implementations • 20 Apr 2022 • Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avirup Sil, Vittorio Castelli, Radu Florian, Salim Roukos
Neural passage retrieval is a new and promising approach in open retrieval question answering.
1 code implementation • NAACL 2022 • Yulong Li, Martin Franz, Md Arafat Sultan, Bhavani Iyer, Young-suk Lee, Avirup Sil
We present DR.DECR (Dense Retrieval with Distillation-Enhanced Cross-Lingual Representation), a new cross-lingual information retrieval (CLIR) system trained using multi-stage knowledge distillation (KD).
no code implementations • 15 Apr 2021 • Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz, Vittorio Castelli, Heng Ji, Avirup Sil
Recent work has shown that commonly available machine reading comprehension (MRC) datasets can be used to train high-performance neural information retrieval (IR) systems.
no code implementations • 2 Dec 2020 • Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avi Sil, Vittorio Castelli, Radu Florian, Salim Roukos
End-to-end question answering (QA) requires both information retrieval (IR) over a large document collection and machine reading comprehension (MRC) on the retrieved passages.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Revanth Gangi Reddy, Md Arafat Sultan, Efsun Sarioglu Kayi, Rong Zhang, Vittorio Castelli, Avirup Sil
Answer validation in machine reading comprehension (MRC) consists of verifying an extracted answer against an input context and question pair.
no code implementations • 24 Oct 2020 • Yanda Chen, Md Arafat Sultan, Vittorio Castelli
Automatically generated synthetic training examples have been shown to improve performance in machine reading comprehension (MRC).
no code implementations • EMNLP 2020 • Rong Zhang, Revanth Gangi Reddy, Md Arafat Sultan, Vittorio Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, Salim Roukos, Avirup Sil, Todd Ward
Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain.