Semantic Textual Similarity

564 papers with code • 13 benchmarks • 17 datasets

Semantic textual similarity (STS) measures how similar two pieces of text are in meaning. The similarity is often expressed as a score, for example from 1 to 5. Related tasks include paraphrase identification and duplicate identification.
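As a rough illustration of scoring a sentence pair on a 1-to-5 scale, the sketch below computes the cosine similarity of bag-of-words count vectors and maps it linearly onto that range. This is only a lexical-overlap baseline under stated assumptions, not a semantic model and not the method of any paper listed here; the function names (tokenize, cosine_similarity, similarity_score) and the linear mapping are hypothetical choices made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of letters/digits (a deliberately crude tokenizer).
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    # Cosine between the two bag-of-words count vectors; lies in [0, 1]
    # because all counts are non-negative.
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty text is treated as maximally dissimilar
    return dot / (norm_a * norm_b)


def similarity_score(a, b):
    # Assumed linear mapping from cosine similarity in [0, 1] to a 1-to-5 score.
    return 1.0 + 4.0 * cosine_similarity(a, b)


if __name__ == "__main__":
    print(similarity_score("A man is playing a guitar.", "A man plays the guitar."))
    print(similarity_score("A man is playing a guitar.", "The stock market fell today."))
```

Because it only counts shared words, a baseline like this scores paraphrases with different vocabulary as dissimilar, which is the gap that learned sentence embeddings are meant to close.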

Image source: Learning Semantic Textual Similarity from Conversations

Contrastive Learning in Distilled Models

kennethlimjf/contrastive-learning-in-distilled-models 23 Jan 2024

Natural Language Processing models like BERT can provide state-of-the-art word embeddings for downstream NLP tasks.

0

Noise Contrastive Estimation-based Matching Framework for Low-Resource Security Attack Pattern Recognition

tumeteor/ttp-mapping 18 Jan 2024

Tactics, Techniques and Procedures (TTPs) represent sophisticated attack patterns in the cybersecurity domain, described encyclopedically in textual knowledge bases.

3

A character-based steganography using masked language modeling

emirozturk/MLMStego IEEE Access 2024

In this study, a steganography method based on the BERT transformer model is proposed for hiding text data in cover text.

3
15 Jan 2024

Do Vision and Language Encoders Represent the World Similarly?

mayug/0-shot-llm-vision 10 Jan 2024

In the absence of statistical similarity in aligned encoders like CLIP, we show that a possible matching of unaligned encoders exists without any training.

4

PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging

jinlhe/pefomed 5 Jan 2024

In this paper, we propose a parameter efficient framework for fine-tuning MLLMs, specifically validated on medical visual question answering (Med-VQA) and medical report generation (MRG) tasks, using public benchmark datasets.

11

Unsupervised hard Negative Augmentation for contrastive learning

claudiashu/una 5 Jan 2024

We present Unsupervised hard Negative Augmentation (UNA), a method that generates synthetic negative instances based on the term frequency-inverse document frequency (TF-IDF) retrieval model.

6

Are we describing the same sound? An analysis of word embedding spaces of expressive piano performance

cpjku/performance_embeddings_fire23 31 Dec 2023

Using a music research dataset of free text performance characterizations and a follow-up study sorting the annotations into clusters, we derive a ground truth for a domain-specific semantic similarity structure.

2

Def2Vec: Extensible Word Embeddings from Dictionary Definitions

IreneMorazzoni/def_2_vec_irene ICNLSP 2023

Def2Vec introduces a novel paradigm for word embeddings, leveraging dictionary definitions to learn semantic representations.

0
16 Dec 2023

Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models

xinjin95/binsum 15 Dec 2023

Binary code summarization, while invaluable for understanding code semantics, is challenging due to its labor-intensive nature.

13

Explicitly Integrating Judgment Prediction with Legal Document Retrieval: A Law-Guided Generative Approach

e-qin/gear 15 Dec 2023

Legal document retrieval and judgment prediction are crucial tasks in intelligent legal systems.

0