Augmented SBERT

Introduced by Thakur et al. in Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks

Augmented SBERT is a data augmentation strategy for pairwise sentence scoring that uses a BERT cross-encoder to improve the performance of SBERT bi-encoders. Given a pre-trained, well-performing cross-encoder, we sample sentence pairs according to a chosen sampling strategy and label them with the cross-encoder. We call these weakly labeled examples the silver dataset, and we merge them with the gold training dataset. We then train the bi-encoder on this extended training dataset.
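
The data flow described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: `toy_cross_encoder`, `sample_pairs`, and `build_augmented_dataset` are hypothetical names, the scorer is a token-overlap placeholder standing in for a trained BERT cross-encoder, and uniform random sampling is only one assumed choice of sampling strategy. The final step, training the bi-encoder on the returned dataset, is not shown.

```python
import itertools
import random
from typing import Callable, List, Tuple

Pair = Tuple[str, str]
Labeled = Tuple[str, str, float]


def toy_cross_encoder(a: str, b: str) -> float:
    # Placeholder scorer (Jaccard token overlap), NOT a real BERT cross-encoder.
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def sample_pairs(sentences: List[str], k: int, seed: int = 0) -> List[Pair]:
    # Uniform random sampling of up to k unordered pairs (an assumed strategy).
    candidates = list(itertools.combinations(sentences, 2))
    return random.Random(seed).sample(candidates, min(k, len(candidates)))


def build_augmented_dataset(
    gold: List[Labeled],
    sentences: List[str],
    score: Callable[[str, str], float],
    k: int,
    seed: int = 0,
) -> List[Labeled]:
    # Skip sampled pairs that already carry a gold label.
    gold_keys = {frozenset((a, b)) for a, b, _ in gold}
    # Silver dataset: sampled pairs weakly labeled by the scorer.
    silver = [
        (a, b, score(a, b))
        for a, b in sample_pairs(sentences, k, seed)
        if frozenset((a, b)) not in gold_keys
    ]
    # Extended training set: gold examples first, then silver examples.
    return list(gold) + silver
```

In a real setup, `score` would wrap the predictions of a cross-encoder fine-tuned on the gold data, and the list returned by `build_augmented_dataset` would be used as the bi-encoder's training set.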

Source: Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks

Tasks

Task	Papers	Share
Domain Adaptation	1	25.00%
Semantic Textual Similarity	1	25.00%
Sentence	1	25.00%
Sentence Pair Modeling	1	25.00%

Components

Component	Type
BERT	Language Models

Categories

Text Augmentation