IR Models and the COVID-19 Pandemic: A Comparative Study of Performance and Challenges

21 May 2023 · Moksh Shukla, Nitik Jain, Shubham Gupta ·

This research study investigates the efficiency of different information retrieval (IR) systems in accessing relevant information from the scientific literature during the COVID-19 pandemic. The study applies the TREC framework to the COVID-19 Open Research Dataset (CORD-19) and evaluates BM25, Contriever, and Bag of Embeddings IR frameworks. The objective is to build a test collection for search engines that tackle the complex information landscape during a pandemic. The study uses the CORD-19 dataset to train and evaluate the IR models and compares the results to those manually labeled in the TREC-COVID IR Challenge. The results indicate that advanced IR models like BERT and Contriever better retrieve relevant information during a pandemic. However, the study also highlights the challenges in processing large datasets and the need for strategies to focus on abstracts or summaries. Overall, the research highlights the importance of effectively tailored IR systems in dealing with information overload during crises like COVID-19 and can guide future research and development in this field.

PDF Abstract