Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection
An important task in designing QA systems is answer sentence selection (AS2): selecting the sentence containing (or constituting) the answer to a question from a set of retrieved relevant documents. In this paper, we propose three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents, to improve the performance of transformers for AS2 and reduce the need for large labeled datasets. Specifically, the model is tasked to predict whether: (i) two sentences are extracted from the same paragraph, (ii) a given sentence is extracted from a given paragraph, and (iii) two paragraphs are extracted from the same document. Our experiments on three public and one industrial AS2 dataset demonstrate the empirical superiority of our pre-trained transformers over baseline models such as RoBERTa and ELECTRA for AS2.
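The three objectives above are binary classification tasks whose labels come for free from document structure. As a minimal sketch (not the authors' implementation; all function and field names are illustrative), the following shows how labeled pairs for each objective could be mined from a corpus organized as documents → paragraphs → sentences:

```python
import random

def build_pretraining_examples(documents, rng=None):
    """Mine labeled pairs for three sentence-level pre-training objectives.

    `documents` is a list of documents; each document is a list of
    paragraphs; each paragraph is a list of sentence strings.
    Returns a list of dicts: {"objective", "text_a", "text_b", "label"}.
    """
    rng = rng or random.Random(0)
    examples = []
    for doc in documents:
        for pi, para in enumerate(doc):
            if len(para) < 2:
                continue
            # (i) Are two sentences drawn from the same paragraph?
            s1, s2 = rng.sample(para, 2)
            examples.append({"objective": "sents-same-paragraph",
                             "text_a": s1, "text_b": s2, "label": 1})
            other = [p for pj, p in enumerate(doc) if pj != pi and p]
            if other:  # negative: second sentence from a different paragraph
                neg = rng.choice(rng.choice(other))
                examples.append({"objective": "sents-same-paragraph",
                                 "text_a": s1, "text_b": neg, "label": 0})
            # (ii) Is this sentence extracted from this paragraph?
            examples.append({"objective": "sent-in-paragraph",
                             "text_a": rng.choice(para),
                             "text_b": " ".join(para), "label": 1})
        # (iii) Are two paragraphs drawn from the same document?
        if len(doc) >= 2:
            p1, p2 = rng.sample(doc, 2)
            examples.append({"objective": "paras-same-document",
                             "text_a": " ".join(p1), "text_b": " ".join(p2),
                             "label": 1})
    # Negative for (iii): paragraphs taken from two different documents.
    if len(documents) >= 2:
        d1, d2 = rng.sample(documents, 2)
        if d1 and d2:
            examples.append({"objective": "paras-same-document",
                             "text_a": " ".join(rng.choice(d1)),
                             "text_b": " ".join(rng.choice(d2)), "label": 0})
    return examples
```

Each resulting pair can then be fed to a transformer with a standard sentence-pair classification head; because the labels are derived purely from document layout, no human annotation is required.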
Task | Dataset | Model | Metric Name | Metric Value | Global Rank
---|---|---|---|---|---
Answer Selection | ASNQ | ELECTRA-Base + SSP | MAP | 0.697 | # 2
Answer Selection | ASNQ | ELECTRA-Base + SSP | MRR | 0.757 | # 2
Answer Selection | ASNQ | DeBERTa-V3-Large + SSP | MAP | 0.743 | # 1
Answer Selection | ASNQ | DeBERTa-V3-Large + SSP | MRR | 0.800 | # 1
Question Answering | TrecQA | RoBERTa-Base + PSD | MAP | 0.903 | # 7
Question Answering | TrecQA | RoBERTa-Base + PSD | MRR | 0.951 | # 5
Question Answering | TrecQA | DeBERTa-V3-Large + SSP | MAP | 0.923 | # 3
Question Answering | TrecQA | DeBERTa-V3-Large + SSP | MRR | 0.946 | # 6
Question Answering | WikiQA | DeBERTa-Large + SSP | MAP | 0.901 | # 5
Question Answering | WikiQA | DeBERTa-Large + SSP | MRR | 0.914 | # 4
Question Answering | WikiQA | DeBERTa-V3-Large + ALL | MAP | 0.909 | # 4
Question Answering | WikiQA | DeBERTa-V3-Large + ALL | MRR | 0.920 | # 3
Question Answering | WikiQA | RoBERTa-Base + SSP | MAP | 0.887 | # 6
Question Answering | WikiQA | RoBERTa-Base + SSP | MRR | 0.899 | # 7