no code implementations • 26 Dec 2023 • Mofetoluwa Adeyemi, Akintunde Oladipo, Ronak Pradeep, Jimmy Lin
Our implementation covers English and four African languages (Hausa, Somali, Swahili, and Yoruba), and we examine cross-lingual reranking with queries in English and passages in the African languages.
1 code implementation • 26 Dec 2023 • Manveer Singh Tamber, Ronak Pradeep, Jimmy Lin
We present a range of models from 220M parameters to 3B parameters, all with strong reranking results, challenging the necessity of large-scale models for effective zero-shot reranking and opening avenues for more efficient listwise reranking solutions.
1 code implementation • 5 Dec 2023 • Ronak Pradeep, Sahel Sharifymoghaddam, Jimmy Lin
In information retrieval, proprietary large language models (LLMs) such as GPT-4 and open-source counterparts such as LLaMA and Vicuna have played a vital role in reranking.
1 code implementation • 26 Sep 2023 • Ronak Pradeep, Sahel Sharifymoghaddam, Jimmy Lin
Researchers have successfully applied large language models (LLMs) such as ChatGPT to reranking in an information retrieval context, but to date, such work has mostly been built on proprietary models hidden behind opaque API endpoints.
no code implementations • 29 Aug 2023 • Jimmy Lin, Ronak Pradeep, Tommaso Teofili, Jasper Xian
We provide a reproducible, end-to-end demonstration of vector search with OpenAI embeddings using Lucene on the popular MS MARCO passage ranking test collection.
1 code implementation • 13 Jun 2023 • Dake Zhang, Ronak Pradeep
With the rapid growth and spread of online misinformation, people need tools to help them evaluate the credibility and accuracy of online information.
no code implementations • 19 May 2023 • Ronak Pradeep, Kai Hui, Jai Gupta, Adam D. Lelkes, Honglei Zhuang, Jimmy Lin, Donald Metzler, Vinh Q. Tran
Popularized by the Differentiable Search Index, the emerging paradigm of generative retrieval re-frames the classic information retrieval problem into a sequence-to-sequence modeling task, forgoing external indices and encoding an entire document corpus within a single Transformer.
no code implementations • 3 May 2023 • Xueguang Ma, Xinyu Zhang, Ronak Pradeep, Jimmy Lin
Supervised ranking methods based on bi-encoder or cross-encoder architectures have shown success in multi-stage text ranking tasks, but they require large amounts of relevance judgments as training data.
no code implementations • ACL 2021 • Kelvin Jiang, Ronak Pradeep, Jimmy Lin
This work explores a framework for fact verification that leverages pretrained sequence-to-sequence transformer models for sentence selection and label prediction, two key sub-tasks in fact verification.
1 code implementation • 12 Apr 2021 • Xueguang Ma, Kai Sun, Ronak Pradeep, Jimmy Lin
Text retrieval using learned dense representations has recently emerged as a promising alternative to "traditional" text retrieval using sparse bag-of-words representations.
1 code implementation • 19 Feb 2021 • Jimmy Lin, Xueguang Ma, Sheng-Chieh Lin, Jheng-Hong Yang, Ronak Pradeep, Rodrigo Nogueira
Pyserini is an easy-to-use Python toolkit that supports replicable IR research by providing effective first-stage retrieval in a multi-stage ranking architecture.
2 code implementations • 14 Jan 2021 • Ronak Pradeep, Rodrigo Nogueira, Jimmy Lin
We propose a design pattern for tackling text ranking problems, dubbed "Expando-Mono-Duo", that has been empirically validated for a number of ad hoc retrieval tasks in different domains.
no code implementations • EACL (Louhi) 2021 • Ronak Pradeep, Xueguang Ma, Rodrigo Nogueira, Jimmy Lin
This work describes the adaptation of a pretrained sequence-to-sequence model to the task of scientific claim verification in the biomedical domain.
1 code implementation • EMNLP (sdp) 2020 • Edwin Zhang, Nikhil Gupta, Raphael Tang, Xiao Han, Ronak Pradeep, Kuang Lu, Yue Zhang, Rodrigo Nogueira, Kyunghyun Cho, Hui Fang, Jimmy Lin
We present Covidex, a search engine that exploits the latest neural ranking models to provide information access to the COVID-19 Open Research Dataset curated by the Allen Institute for AI.