Context-Aware Transformer Pre-Training for Answer Sentence Selection

24 May 2023 · Luca Di Liello, Siddhant Garg, Alessandro Moschitti

Answer Sentence Selection (AS2) is a core component for building an accurate Question Answering pipeline. AS2 models rank a set of candidate sentences by how likely each is to answer a given question. The state of the art in AS2 exploits pre-trained transformers by transferring them on large annotated datasets, while using local contextual information around the candidate sentence. In this paper, we propose three pre-training objectives designed to mimic the downstream fine-tuning task of contextual AS2. This allows LMs to be specialized when fine-tuning for contextual AS2. Our experiments on three public and two large-scale industrial datasets show that our pre-training approaches (applied to RoBERTa and ELECTRA) can improve baseline contextual AS2 accuracy by up to 8% on some datasets.
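To make "local contextual information around the candidate sentence" concrete, here is a minimal sketch of how contextual AS2 inputs could be assembled. It assumes the context is just the immediately preceding and following sentence in the source document; that window, along with the names `ContextualCandidate` and `build_contextual_candidates`, is a hypothetical choice for illustration and is not taken from the paper.

```python
from typing import List, NamedTuple


class ContextualCandidate(NamedTuple):
    """One model input: the question plus a candidate and its neighbours."""
    question: str
    previous: str   # sentence before the candidate ("" at document start)
    candidate: str
    following: str  # sentence after the candidate ("" at document end)


def build_contextual_candidates(
    question: str, sentences: List[str]
) -> List[ContextualCandidate]:
    """Pair the question with every sentence and its immediate neighbours.

    Assumption: a one-sentence window on each side stands in for the
    "local context"; other window sizes are equally possible.
    """
    examples = []
    for i, sentence in enumerate(sentences):
        previous = sentences[i - 1] if i > 0 else ""
        following = sentences[i + 1] if i + 1 < len(sentences) else ""
        examples.append(
            ContextualCandidate(question, previous, sentence, following)
        )
    return examples
```

A ranking model would then score each `ContextualCandidate` and sort the candidates by that score; the scorer itself is outside the scope of this sketch.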

Code

No code implementations yet.

Tasks

Question Answering

Sentence

Datasets

NewsQA

WikiQA

TrecQA

ASNQ

Results from the Paper

Ranked #4 on Question Answering on TrecQA (using extra training data)

Task	Dataset	Model	Metric Name	Metric Value	Global Rank	Uses Extra Training Data
Question Answering	TrecQA	Contextual DeBERTa-V3-Large + SSP	MAP	0.919	#4	Yes
Question Answering	TrecQA	Contextual DeBERTa-V3-Large + SSP	MRR	0.945	#7	Yes
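For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how MAP (Mean Average Precision) and MRR (Mean Reciprocal Rank) are commonly computed for a ranking task. Each question is represented as a list of 0/1 relevance labels, already sorted by the model's score from best to worst. The function names are illustrative, and the handling of questions with no correct candidate (scored 0 here) is an assumption; benchmark evaluation scripts often drop such questions instead.

```python
from typing import Callable, List, Sequence


def average_precision(labels: Sequence[int]) -> float:
    """Average of precision@k over the ranks k that hold a correct answer."""
    hits = 0
    total = 0.0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            total += hits / rank
    # Assumption: a question with no correct candidate scores 0.
    return total / hits if hits else 0.0


def reciprocal_rank(labels: Sequence[int]) -> float:
    """1 / rank of the first correct answer, or 0 if there is none."""
    for rank, label in enumerate(labels, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def mean_over_questions(
    metric: Callable[[Sequence[int]], float],
    rankings: List[Sequence[int]],
) -> float:
    """Average a per-question metric over all questions."""
    return sum(metric(r) for r in rankings) / len(rankings)


# Two toy questions: labels are listed in ranked order (1 = correct answer).
rankings = [[1, 0, 1], [0, 1, 0]]
map_score = mean_over_questions(average_precision, rankings)
mrr_score = mean_over_questions(reciprocal_rank, rankings)
```

In the toy data, the first question has average precision (1/1 + 2/3) / 2 and the second 1/2, so MAP is 2/3; the reciprocal ranks are 1 and 1/2, so MRR is 0.75.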

Methods

Adam • Attention Dropout • BERT • Dense Connections • Dropout • GELU • Layer Normalization • Linear Layer • Linear Warmup With Linear Decay • Multi-Head Attention • Residual Connection • RoBERTa • Scaled Dot-Product Attention • Softmax • Weight Decay • WordPiece
