Semantic Textual Similarity

560 papers with code • 13 benchmarks • 17 datasets

Semantic textual similarity is the task of determining how similar two pieces of text are in meaning. This often takes the form of assigning a graded similarity score, for example on a scale from 1 to 5. Related tasks are paraphrase identification and duplicate detection.
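
As a minimal illustration, one common approach is to embed both sentences and compare the embeddings with cosine similarity. The sketch below assumes the sentence-transformers package and the all-MiniLM-L6-v2 checkpoint; both are illustrative choices, not the method of any particular paper listed here.

```python
# Minimal sketch: score sentence similarity with embeddings + cosine similarity.
# Assumes sentence-transformers is installed; all-MiniLM-L6-v2 is an example checkpoint.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sent_a = "A man is playing a guitar."
sent_b = "Someone is strumming a guitar."

# Encode both sentences into dense vectors and compare them.
emb_a, emb_b = model.encode([sent_a, sent_b], convert_to_tensor=True)
score = util.cos_sim(emb_a, emb_b).item()  # value in [-1, 1]
print(f"cosine similarity: {score:.3f}")
```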

Image source: Learning Semantic Textual Similarity from Conversations

Most implemented papers

Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention

mlpen/Nystromformer 7 Feb 2021

The scalability of Nyströmformer enables application to longer sequences with thousands of tokens.
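
As a rough sketch of the underlying idea, the Nyström method approximates the full softmax attention matrix through a small set of landmark queries and keys. The code below is a simplified single-head illustration (no masking, no iterative pseudo-inverse), not the paper's exact implementation.

```python
import torch

def nystrom_attention(q, k, v, num_landmarks=16):
    """Simplified Nyström approximation of softmax self-attention.

    q, k, v: (seq_len, dim) tensors; seq_len is assumed divisible by
    num_landmarks. Landmarks are segment means of the queries/keys.
    """
    n, d = q.shape
    scale = d ** -0.5
    q_land = q.reshape(num_landmarks, n // num_landmarks, d).mean(dim=1)
    k_land = k.reshape(num_landmarks, n // num_landmarks, d).mean(dim=1)

    kernel_1 = torch.softmax(q @ k_land.T * scale, dim=-1)       # (n, m)
    kernel_2 = torch.softmax(q_land @ k_land.T * scale, dim=-1)  # (m, m)
    kernel_3 = torch.softmax(q_land @ k.T * scale, dim=-1)       # (m, n)

    # softmax(QK^T / sqrt(d)) ≈ kernel_1 · pinv(kernel_2) · kernel_3;
    # grouping the products avoids ever forming the full n x n matrix.
    return kernel_1 @ (torch.linalg.pinv(kernel_2) @ (kernel_3 @ v))

q = k = v = torch.randn(512, 64)
out = nystrom_attention(q, k, v)  # (512, 64)
```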

MedSTS: A Resource for Clinical Semantic Textual Similarity

ncbi-nlp/BioSentVec 28 Aug 2018

A subset of MedSTS (MedSTS_ann) containing 1,068 sentence pairs was annotated by two medical experts with semantic similarity scores of 0-5 (low to high similarity).

Q8BERT: Quantized 8Bit BERT

NervanaSystems/nlp-architect 14 Oct 2019

Recently, pre-trained Transformer-based language models such as BERT and GPT have shown great improvement in many Natural Language Processing (NLP) tasks.
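
To give a flavour of what 8-bit quantization means in practice, the sketch below shows symmetric linear quantization of a weight matrix to int8. This is a generic illustration under simplifying assumptions; Q8BERT itself uses quantization-aware training with further details not shown here.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric linear quantization of a float weight tensor to int8."""
    scale = np.abs(w).max() / 127.0                    # largest magnitude maps to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to approximate float weights."""
    return q.astype(np.float32) * scale

w = np.random.randn(768, 768).astype(np.float32)      # e.g. one BERT weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs quantization error:", np.abs(w - w_hat).max())
```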

MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices

tensorflow/models ACL 2020

Then, we conduct knowledge transfer from this teacher to MobileBERT.
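
As a generic illustration of teacher-to-student knowledge transfer, the sketch below computes a standard soft-target distillation loss. MobileBERT's actual objective additionally matches feature maps and attention distributions layer by layer, which is not shown here.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student outputs."""
    t = temperature
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    # Scale by T^2 as is conventional in soft-target distillation.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * t * t

student_logits = torch.randn(8, 2)  # e.g. a batch of sentence-pair predictions
teacher_logits = torch.randn(8, 2)
loss = distillation_loss(student_logits, teacher_logits)
```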

Calculating the similarity between words and sentences using a lexical database and corpus statistics

nihitsaxena95/sentence-similarity-wordnet-sementic 15 Feb 2018

To calculate the semantic similarity between words and sentences, the proposed method follows an edge-based approach using a lexical database.
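
For a concrete sense of what an edge-based lexical-database measure looks like, the sketch below scores word pairs with WordNet path similarity via NLTK (the WordNet corpus must be downloaded first). The paper combines such edge-based measures with corpus statistics and sentence-level aggregation, which is not reproduced here.

```python
# Requires: nltk installed and the WordNet corpus downloaded,
# e.g. nltk.download("wordnet").
from nltk.corpus import wordnet as wn

def word_similarity(word_a, word_b):
    """Best path-based (edge-counting) similarity over all synset pairs."""
    scores = []
    for syn_a in wn.synsets(word_a):
        for syn_b in wn.synsets(word_b):
            sim = syn_a.path_similarity(syn_b)
            if sim is not None:
                scores.append(sim)
    return max(scores) if scores else 0.0

print(word_similarity("car", "automobile"))  # shared synset -> 1.0
print(word_similarity("car", "banana"))      # unrelated concepts -> low score
```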

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

facebookresearch/SentEval ICLR 2018

In this work, we present a simple, effective multi-task learning framework for sentence representations that combines the inductive biases of diverse training objectives in a single model.

How to Train BERT with an Academic Budget

peteriz/academic-budget-bert EMNLP 2021

While large language models à la BERT are used ubiquitously in NLP, pretraining them is considered a luxury that only a few well-funded industry labs can afford.

Label Noise Reduction in Entity Typing by Heterogeneous Partial-Label Embedding

shanzhenren/PLE 17 Feb 2016

Current systems of fine-grained entity typing use distant supervision in conjunction with existing knowledge bases to assign categories (type labels) to entity mentions.