STS (Semantic Textual Similarity)
103 papers with code • 0 benchmarks • 4 datasets
Benchmarks
These leaderboards are used to track progress in STS.
Libraries
Use these libraries to find STS models and implementations.

Most implemented papers
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
BERT, however, requires that both sentences be fed into the network together, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours).
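Sentence-BERT's fix is a bi-encoder: each sentence is embedded once, independently, and similarity reduces to a cheap vector comparison over cached embeddings. A minimal sketch using the sentence-transformers library (the checkpoint name is a common choice, not necessarily the paper's original model):

```python
# Bi-encoder sketch via the sentence-transformers library; the model
# name below is an assumption (one popular checkpoint), not the paper's.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = ["A man is playing a guitar.", "Someone plays an instrument."]
# Each sentence is encoded once, on its own (the siamese setup).
embeddings = model.encode(sentences, convert_to_tensor=True)

# Pairwise similarity is then a cheap cosine between cached vectors.
score = util.cos_sim(embeddings[0], embeddings[1])
print(float(score))
```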
SimCSE: Simple Contrastive Learning of Sentence Embeddings
This paper presents SimCSE, a simple contrastive learning framework that greatly advances state-of-the-art sentence embeddings.
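The unsupervised variant of SimCSE uses the encoder's own dropout as data augmentation: the same batch is encoded twice, the two dropout-perturbed views of each sentence form a positive pair, and the rest of the batch serves as negatives. A sketch of that objective in PyTorch (the `encoder` callable and the 0.05 temperature are assumptions for illustration):

```python
import torch
import torch.nn.functional as F

def simcse_unsup_loss(encoder, sentences_batch, temperature=0.05):
    """Sketch of the unsupervised SimCSE objective. `encoder` is assumed
    to map a list of sentences to a (batch, dim) tensor with dropout
    active, so two forward passes give two different 'views'."""
    z1 = encoder(sentences_batch)  # first pass, dropout mask A
    z2 = encoder(sentences_batch)  # second pass, dropout mask B
    # Cosine similarity matrix between views, scaled by temperature.
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature
    # Each sentence's positive is its own second view (the diagonal);
    # all other sentences in the batch act as in-batch negatives.
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)
```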
TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning
Learning sentence embeddings often requires a large amount of labeled data.
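TSDAE instead trains a denoising autoencoder on unlabeled text: tokens are deleted from the input, the encoder compresses the corrupted sentence into a fixed vector, and a decoder must reconstruct the original from that vector alone. A sketch of the token-deletion corruption (the 0.6 deletion ratio is an assumption based on the setting the paper reports working well):

```python
import random

def delete_noise(tokens, deletion_ratio=0.6):
    """Sketch of TSDAE-style input corruption: randomly delete a
    fraction of tokens. The autoencoder must reconstruct the original
    sentence from the embedding of this corrupted input."""
    kept = [t for t in tokens if random.random() > deletion_ratio]
    return kept if kept else [random.choice(tokens)]  # never return empty
```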
MedSTS: A Resource for Clinical Semantic Textual Similarity
A subset of MedSTS (MedSTS_ann) containing 1,068 sentence pairs was annotated by two medical experts with semantic similarity scores of 0-5 (low to high similarity).
SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation
Semantic Textual Similarity (STS) measures the meaning similarity of sentences.
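STS systems output a graded similarity score (conventionally on a 0-5 scale, as in MedSTS above) and are evaluated by Pearson correlation against the gold annotations. A minimal sketch of that evaluation, with made-up scores:

```python
from scipy.stats import pearsonr

# Standard STS evaluation: Pearson correlation between system scores
# and gold 0-5 annotations. The values below are invented for the demo.
gold = [4.8, 2.5, 0.0, 3.2]
pred = [4.5, 3.0, 0.4, 2.9]

r, _ = pearsonr(gold, pred)
print(f"Pearson r = {r:.3f}")
```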
KorNLI and KorSTS: New Benchmark Datasets for Korean Natural Language Understanding
Although several benchmark datasets for NLI and STS have been released in English and a few other languages, no such datasets are publicly available in Korean.
Don't Settle for Average, Go for the Max: Fuzzy Sets and Max-Pooled Word Vectors
Recent literature suggests that averaged word vectors followed by simple post-processing outperform many deep learning methods on semantic textual similarity tasks.
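Both pooling strategies are one line apiece; the paper's fuzzy-set view reads the component-wise maximum as a fuzzy union of the word "sets". A sketch contrasting the two, with toy vectors:

```python
import numpy as np

def mean_pool(word_vectors):
    # Baseline: component-wise average of the sentence's word vectors.
    return np.mean(word_vectors, axis=0)

def max_pool(word_vectors):
    # The alternative: component-wise maximum, interpreted in the paper
    # as a fuzzy union over the words.
    return np.max(word_vectors, axis=0)

# Toy vectors for a three-word sentence (values are made up).
vecs = np.array([[ 0.1, -0.3, 0.8],
                 [ 0.5,  0.2, -0.1],
                 [-0.2,  0.9, 0.4]])
print(mean_pool(vecs), max_pool(vecs))
```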
FFCI: A Framework for Interpretable Automatic Evaluation of Summarization
In this paper, we propose FFCI, a framework for fine-grained summarization evaluation that comprises four elements: faithfulness (degree of factual consistency with the source), focus (precision of summary content relative to the reference), coverage (recall of summary content relative to the reference), and inter-sentential coherence (document fluency between adjacent sentences).
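As an illustration only (not FFCI's exact scoring functions), focus and coverage can be read as precision- and recall-like aggregates over a summary-to-reference sentence-similarity matrix:

```python
import numpy as np

def focus_and_coverage(sim_matrix):
    """Illustrative sketch, not the paper's implementation: given
    sim_matrix[i, j], the similarity between summary sentence i and
    reference sentence j, 'focus' matches each summary sentence to its
    best reference sentence (precision-like), while 'coverage' does the
    reverse (recall-like)."""
    focus = sim_matrix.max(axis=1).mean()      # per summary sentence
    coverage = sim_matrix.max(axis=0).mean()   # per reference sentence
    return focus, coverage
```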
PatentSBERTa: A Deep NLP based Hybrid Model for Patent Distance and Classification using Augmented SBERT
This study provides an efficient approach for using text data to calculate patent-to-patent (p2p) technological similarity, and presents a hybrid framework for leveraging the resulting p2p similarity for applications such as semantic search and automated patent classification.
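The p2p pipeline reduces to embedding every patent once and running nearest-neighbour search over the cached vectors. A hedged sketch using sentence-transformers' semantic-search utility (the model name and patent texts are placeholders, not the paper's setup):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder checkpoint

corpus = ["A battery electrode with a silicon coating...",
          "A method for wireless power transfer..."]
query = "Silicon-coated anode for lithium-ion cells"

# Embed the corpus once; queries reuse the cached vectors.
corpus_emb = model.encode(corpus, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Top-k nearest patents by cosine similarity.
hits = util.semantic_search(query_emb, corpus_emb, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], hit["score"])
```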
Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models
To support our investigation, we establish a new sentence representation transfer benchmark, SentGLUE, which extends the SentEval toolkit to nine tasks from the GLUE benchmark.
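Sentence-T5 derives sentence embeddings from a pre-trained T5 rather than BERT; one of the strategies it studies is mean-pooling the encoder outputs. A sketch of that strategy with Hugging Face transformers (the `t5-base` checkpoint is a placeholder, not one of the paper's released models):

```python
import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = T5EncoderModel.from_pretrained("t5-base")

batch = tokenizer(["A sentence to embed."], return_tensors="pt", padding=True)
with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq, dim)

# Mean-pool over non-padding tokens to get one vector per sentence.
mask = batch["attention_mask"].unsqueeze(-1).float()
embedding = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embedding.shape)
```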