Open-Domain Question Answering
197 papers with code • 15 benchmarks • 26 datasets
Open-domain question answering is the task of answering questions against a large open-domain knowledge source such as Wikipedia, rather than a pre-selected passage that is known to contain the answer.
Libraries
Use these libraries to find Open-Domain Question Answering models and implementations
Latest papers
Beyond Memorization: The Challenge of Random Memory Access in Language Models
Through carefully designed synthetic tasks covering full recitation, selective recitation, and grounded question answering, we reveal that LMs can access their memory sequentially but encounter challenges when randomly accessing memorized content.
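For concreteness, here is a toy generator for the three probe settings named above (an illustrative assumption, not the paper's benchmark): full recitation needs only sequential access, while the other two require random access to one memorized item.

```python
# Toy probes for the three settings above: the model first memorizes numbered
# facts, then is asked to (1) recite all of them in order, (2) recite one
# selected at random, or (3) answer a question grounded in one of them.
# The fact template and task phrasing are illustrative assumptions.
import random

def make_probes(n_facts: int = 5, seed: int = 0):
    rng = random.Random(seed)
    facts = [f"Item {i}: code-{rng.randint(1000, 9999)}" for i in range(n_facts)]
    k = rng.randrange(n_facts)
    return {
        "memorize": "\n".join(facts),
        "full_recitation": "Recite all items in order.",    # sequential access
        "selective_recitation": f"Recite item {k} only.",   # random access
        "grounded_qa": f"What is the code in item {k}?",    # random access
    }

print(make_probes())
```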
REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering
By combining improvements in both architecture and training, our proposed REAR can better utilize external knowledge by effectively perceiving the relevance of retrieved documents.
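A minimal sketch of the relevance-aware idea under stated assumptions: score each retrieved document for relevance to the question, keep only documents above a threshold, and let the generator fall back to its parametric knowledge when nothing qualifies. The `score_relevance` and `generate` callables are assumed interfaces, not REAR's components.

```python
# Relevance-gated retrieval-augmented generation: a sketch of the general
# idea described above, not REAR's actual architecture.
from typing import Callable, List

def relevance_aware_answer(
    question: str,
    docs: List[str],
    score_relevance: Callable[[str, str], float],  # relevance score in [0, 1]
    generate: Callable[[str, List[str]], str],     # (question, contexts) -> answer
    threshold: float = 0.5,
) -> str:
    scored = [(score_relevance(question, d), d) for d in docs]
    kept = [d for s, d in sorted(scored, key=lambda x: x[0], reverse=True)
            if s >= threshold]
    # An empty `kept` list means the generator answers from parametric memory.
    return generate(question, kept)
```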
RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering
Based on our findings, we propose Time-Aware Adaptive Retrieval (TA-ARE), a simple yet effective method that helps LLMs assess the necessity of retrieval without calibration or additional training.
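The sketch below shows one way an adaptive-retrieval gate can work without calibration or extra training: ask the model whether it can answer on its own, and retrieve only if it declines. The `ask_llm` and `retrieve` callables and the prompt wording are illustrative assumptions, not TA-ARE itself.

```python
# Hypothetical adaptive-retrieval gate: query the LLM about whether it needs
# external evidence before paying the cost of retrieval.
from typing import Callable, List

def adaptive_answer(
    question: str,
    ask_llm: Callable[[str], str],          # returns the model's text reply
    retrieve: Callable[[str], List[str]],   # returns candidate passages
) -> str:
    probe = (
        "Can you answer the following question from your own knowledge, "
        "without looking anything up? Reply YES or NO.\n"
        f"Question: {question}"
    )
    if ask_llm(probe).strip().upper().startswith("YES"):
        # Parametric-only path: no retrieval call is made.
        return ask_llm(f"Answer concisely: {question}")
    # Retrieval-augmented path: prepend retrieved passages as context.
    context = "\n".join(retrieve(question)[:3])
    return ask_llm(f"Context:\n{context}\n\nAnswer concisely: {question}")
```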
Pre-training Cross-lingual Open Domain Question Answering with Large-scale Synthetic Supervision
Cross-lingual question answering (CLQA) is a complex problem, comprising cross-lingual retrieval from a multilingual knowledge base, followed by answer generation either in English or the query language.
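The described pipeline decomposes naturally into two stages, sketched below with assumed interfaces rather than any specific system.

```python
# Illustrative two-stage CLQA pipeline: cross-lingual retrieval from a
# multilingual corpus, then answer generation in a target language.
from typing import Callable, List

def clqa_answer(
    query: str,
    retrieve_multilingual: Callable[[str, int], List[str]],  # query -> passages
    generate: Callable[[str, List[str], str], str],  # (q, contexts, lang) -> answer
    answer_lang: str = "en",  # English or the query language
) -> str:
    passages = retrieve_multilingual(query, 5)  # passages may be in any language
    return generate(query, passages, answer_lang)
```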
VerAs: Verify then Assess STEM Lab Reports
With STEM education's increasing focus on critical thinking, science writing plays an ever more important role in curricula that stress inquiry skills.
Can AI Assistants Know What They Don't Know?
To answer this question, we construct a model-specific "I don't know" (Idk) dataset for an assistant, which contains its known and unknown questions, based on existing open-domain question answering datasets.
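One simple way such a model-specific split can be constructed, as a sketch under loose exact-match grading rather than the paper's exact protocol: label a question "known" if the assistant's answer contains the gold answer, and "unknown" otherwise.

```python
# Sketch of building a model-specific "I don't know" (Idk) split from an
# existing QA dataset. Substring grading and the `ask_llm` interface are
# simplifying assumptions for illustration.
from typing import Callable, Dict, List, Tuple

def build_idk_dataset(
    qa_pairs: List[Tuple[str, str]],   # (question, gold_answer)
    ask_llm: Callable[[str], str],
) -> Dict[str, List[str]]:
    split: Dict[str, List[str]] = {"known": [], "unknown": []}
    for question, gold in qa_pairs:
        predicted = ask_llm(question).strip().lower()
        bucket = "known" if gold.strip().lower() in predicted else "unknown"
        split[bucket].append(question)
    return split
```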
Mitigating the Impact of False Negatives in Dense Retrieval with Contrastive Confidence Regularization
Hard negative sampling, which is commonly used to improve contrastive learning, can introduce more noise in training.
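For background, the sketch below shows an InfoNCE-style contrastive loss of the kind used to train dense retrievers, plus one crude mitigation: dropping negatives that score almost as high as the positive, since they may be unlabeled positives. This illustrates the problem setting only; it is not the paper's confidence regularizer.

```python
# InfoNCE-style contrastive loss with a hand-rolled false-negative filter.
# Negatives within `margin` of the positive score are excluded from the
# partition function as suspected false negatives (an assumption for
# illustration, not the proposed regularization).
import math
from typing import List

def info_nce(pos_score: float, neg_scores: List[float], margin: float = 0.0) -> float:
    kept = [s for s in neg_scores if s < pos_score - margin]
    denom = math.exp(pos_score) + sum(math.exp(s) for s in kept)
    return -math.log(math.exp(pos_score) / denom)

print(info_nce(2.0, [0.5, 1.9, -1.0], margin=0.2))  # 1.9 is filtered out
```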
Learning to Filter Context for Retrieval-Augmented Generation
To alleviate these problems, we propose FILCO, a method that improves the quality of the context provided to the generator by (1) identifying useful context based on lexical and information-theoretic approaches, and (2) training context filtering models that can filter retrieved contexts at test time.
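A minimal sketch of step (1), lexical context filtering: rank retrieved sentences by word overlap with the question and keep the top few. The overlap measure and cutoff are illustrative assumptions, not FILCO's trained filter.

```python
# Lexical context filtering: keep the retrieved sentences whose vocabulary
# overlaps most with the question, discarding the rest before generation.
from typing import List

def filter_context(question: str, sentences: List[str], top_k: int = 3) -> List[str]:
    q_words = set(question.lower().split())

    def overlap(sent: str) -> float:
        s_words = set(sent.lower().split())
        return len(q_words & s_words) / max(len(s_words), 1)

    return sorted(sentences, key=overlap, reverse=True)[:top_k]
```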
Detrimental Contexts in Open-Domain Question Answering
However, counter-intuitively, too much context can have a negative impact on the model when evaluated on common question answering (QA) datasets.
Knowledge Corpus Error in Question Answering
This error arises when the knowledge corpus used for retrieval is only a subset of the entire string space, potentially excluding more helpful passages that exist outside the corpus.