Text Augmentation

Augmented SBERT is a data augmentation strategy for pairwise sentence scoring that uses a BERT cross-encoder to improve the performance of SBERT bi-encoders. Given a pre-trained, well-performing cross-encoder, we sample sentence pairs according to a chosen sampling strategy and label them using the cross-encoder. We call these weakly labeled examples the silver dataset, and we merge them with the gold training dataset. We then train the bi-encoder on this extended training dataset.

Source: Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks
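
The sample, label, and merge steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `jaccard_score` is a toy token-overlap function standing in for a trained cross-encoder, uniform random sampling is only one possible sampling strategy, and the names `build_extended_dataset`, `gold`, and `n_silver` are hypothetical. Training the bi-encoder on the result is left out.

```python
import itertools
import random
from typing import Callable, List, Tuple

# (sentence_a, sentence_b, similarity score)
Pair = Tuple[str, str, float]


def jaccard_score(a: str, b: str) -> float:
    """Toy stand-in for a cross-encoder: token-overlap score in [0, 1]."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def build_extended_dataset(
    gold: List[Pair],
    score_fn: Callable[[str, str], float],
    n_silver: int,
    seed: int = 0,
) -> Tuple[List[Pair], List[Pair]]:
    """Sample unlabeled pairs, weakly label them, and merge with the gold data."""
    # Recombine the sentences that already occur in the gold pairs.
    sentences = sorted({s for a, b, _ in gold for s in (a, b)})
    seen = {frozenset((a, b)) for a, b, _ in gold}
    candidates = [
        (a, b)
        for a, b in itertools.combinations(sentences, 2)
        if frozenset((a, b)) not in seen
    ]

    # Assumption: uniform random sampling; other strategies could be swapped in.
    rng = random.Random(seed)
    sampled = rng.sample(candidates, min(n_silver, len(candidates)))

    # Weak labels from the scorer form the silver dataset.
    silver = [(a, b, score_fn(a, b)) for a, b in sampled]

    # The extended dataset is the gold data plus the silver data.
    return gold + silver, silver


gold = [
    ("a cat sits", "a cat is sitting", 0.9),
    ("dogs bark loudly", "the stock market fell", 0.0),
]
extended, silver = build_extended_dataset(gold, jaccard_score, n_silver=3)
```

In a real setup, `score_fn` would wrap the predictions of a cross-encoder fine-tuned on the gold data, and `extended` would be passed to the bi-encoder's training loop.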

Tasks


Task Papers Share
Domain Adaptation 1 25.00%
Semantic Textual Similarity 1 25.00%
Sentence 1 25.00%
Sentence Pair Modeling 1 25.00%

Components


Component Type
BERT Language Models

Categories