Paraphrase Identification

72 papers with code • 10 benchmarks • 17 datasets

The goal of Paraphrase Identification is to determine whether a pair of sentences has the same meaning.

Source: Adversarial Examples with Difficult Common Words for Paraphrase Identification
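The task can be framed as binary classification over sentence pairs. The following is a minimal sketch of that framing, assuming a simple token-overlap (Jaccard) baseline with a hand-picked threshold of 0.5. The function names and the threshold are illustrative assumptions, not taken from any paper or benchmark listed on this page, and the baseline is far weaker than the models on the leaderboards.

```python
import re


def tokens(sentence: str) -> set[str]:
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the token sets of two sentences."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty sentences are treated as identical
    return len(ta & tb) / len(ta | tb)


def is_paraphrase(a: str, b: str, threshold: float = 0.5) -> bool:
    """Label a pair as a paraphrase when token overlap reaches the threshold.

    The threshold is a hypothetical value chosen for illustration only.
    """
    return jaccard(a, b) >= threshold


if __name__ == "__main__":
    pairs = [
        ("How do I learn Python quickly?", "How do I quickly learn Python?"),
        ("How do I learn Python quickly?", "What is the capital of France?"),
    ]
    for first, second in pairs:
        print(is_paraphrase(first, second), round(jaccard(first, second), 2))
```

A bag-of-words baseline like this ignores word order and meaning, so it mislabels pairs that share most words but differ in sense, which is the kind of case adversarial paraphrase datasets target.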

Image source: On Paraphrase Identification Corpora

Benchmarks

These leaderboards are used to track progress in Paraphrase Identification.

Dataset	Best Model
Quora Question Pairs	ALICE
MSRP	FEAT2, TFKLD, SVM, Fine-grained features
Quora Question Pairs Dev	BERT + SCH attm
2017_test set	CNN
WikiHop	StructBERTRoBERTa ensemble
TURL	TSDAE
PIT	TSDAE
AP	RoBERTa base
IMDb	SplitEE-S
Yelp	SplitEE-S

Libraries

Use these libraries to find Paraphrase Identification models and implementations.

huggingface/transformers

3 papers

125,425

kaushaltrivedi/fast-bert

3 papers

1,847

utterworks/fast-bert

3 papers

1,847

labmlai/annotated_deep_learning_pap…

2 papers

48,508

See all 10 libraries.

Datasets

Most implemented papers

Adaptation of Deep Bidirectional Multilingual Transformers for Russian Language

deepmipt/bert • 17 May 2019

This work shows that transfer learning from a multilingual model to a monolingual model significantly improves performance on tasks such as reading comprehension, paraphrase detection, and sentiment analysis.

ERNIE: Enhanced Language Representation with Informative Entities

thunlp/ERNIE • ACL 2019

Neural language representation models such as BERT pre-trained on large-scale corpora can well capture rich semantic patterns from plain text, and be fine-tuned to consistently improve the performance of various NLP tasks.

Dice Loss for Data-imbalanced NLP Tasks

ShannonAI/dice_loss_for_NLP • ACL 2020

Many NLP tasks, such as tagging and machine reading comprehension, face a severe data imbalance issue: negative examples significantly outnumber positive examples, and the huge number of background examples (or easy-negative examples) overwhelms the training.
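The effect of such imbalance on evaluation can be illustrated with a small sketch. It assumes a made-up dataset of 5 positive and 95 negative labels and a trivial classifier that always predicts the negative class; both are hypothetical and chosen only for illustration. The sketch compares accuracy with F1, which for binary labels equals the Dice coefficient between the predicted and true positive sets. It does not reproduce the loss proposed in the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical imbalanced labels: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# Trivial classifier that always predicts the negative class.
all_negative = [0] * 100

print(accuracy(y_true, all_negative))  # 0.95
print(f1_score(y_true, all_negative))  # 0.0
```

The always-negative classifier looks strong on accuracy while finding no positives at all, which is why overlap-based measures such as F1 are commonly reported for imbalanced tasks.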

Pay Attention when Required

NVIDIA/DeepLearningExamples • 9 Sep 2020

Transformer-based models consist of interleaved feed-forward blocks, which capture content meaning, and relatively more expensive self-attention blocks, which capture context meaning.

Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning

rabeehk/compacter • ACL 2021

Although pretrained language models can be fine-tuned to produce state-of-the-art results for a very wide range of language understanding tasks, the dynamics of this process are not well understood, especially in the low data regime.

Charformer: Fast Character Transformers via Gradient-based Subword Tokenization

google-research/google-research • ICLR 2022

In this paper, we propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model.

Convolutional Neural Network for Paraphrase Identification

chantera/bicnn-mi • HLT 2015

Idiom Paraphrases: Seventh Heaven vs Cloud Nine

masha-p/Idiom_Paraphrases • WS 2015

Sentence Similarity Learning by Lexical Decomposition and Composition

Leputa/CIKM-AnalytiCup-2018 • COLING 2016

Most conventional sentence similarity methods only focus on similar parts of two input sentences, and simply ignore the dissimilar parts, which usually give us some clues and semantic meanings about the sentences.

A Study of MatchPyramid Models on Ad-hoc Retrieval

albpurpura/PE4IR • 15 Jun 2016

Although ad-hoc retrieval can also be formalized as a text matching task, few deep models have been tested on it.
