Semantic textual similarity deals with determining how similar two pieces of text are. This can take the form of assigning a score from 1 to 5. Related tasks include paraphrase and duplicate identification.
Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks.
Ranked #1 on Semantic Textual Similarity on MRPC
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
Ranked #1 on Sentiment Analysis on SST-2 Binary classification
Tasks: Common Sense Reasoning, Coreference Resolution, Document Summarization, Linguistic Acceptability, Machine Translation, Natural Language Inference, Question Answering, Semantic Textual Similarity, Sentiment Analysis, Text Classification, Transfer Learning, Word Sense Disambiguation
As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models in on-the-edge and/or under constrained computational training or inference budgets remains challenging.
Ranked #6 on Semantic Textual Similarity on MRPC
Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.
Ranked #2 on Natural Language Inference on ANLI test (using extra training data)
With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.
Ranked #1 on Text Classification on IMDb
We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task.
Ranked #7 on Natural Language Inference on SNLI
Recently, pre-trained models have achieved state-of-the-art results in various language understanding tasks, which indicates that pre-training on large-scale corpora may play a crucial role in natural language processing.
Ranked #1 on Open-Domain Question Answering on DuReader
Tasks: Chinese Named Entity Recognition, Chinese Reading Comprehension, Chinese Sentence Pair Classification, Chinese Sentiment Analysis, Linguistic Acceptability, Multi-Task Learning, Natural Language Inference, Open-Domain Question Answering, Semantic Textual Similarity, Sentiment Analysis
We present a novel language representation model enhanced by knowledge called ERNIE (Enhanced Representation through kNowledge IntEgration).
Ranked #2 on Chinese Sentence Pair Classification on LCQMC Dev
Tasks: Chinese Named Entity Recognition, Chinese Sentence Pair Classification, Chinese Sentiment Analysis, Natural Language Inference, Question Answering, Semantic Similarity, Semantic Textual Similarity, Sentiment Analysis
However, it requires that both sentences be fed into the network together, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT.
Ranked #5 on Semantic Textual Similarity on STS Benchmark (Spearman Correlation metric)
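The arithmetic behind the ~50 million figure above: a cross-encoder such as vanilla BERT needs one forward pass per sentence pair, while a bi-encoder in the Sentence-BERT style encodes each sentence once and compares fixed-size embeddings with cheap vector operations. A back-of-the-envelope sketch:

```python
n = 10_000  # sentences in the collection

# Cross-encoder: one full-model forward pass for every unordered pair.
cross_encoder_passes = n * (n - 1) // 2  # 49,995,000, i.e. ~50 million

# Bi-encoder: one forward pass per sentence; pairwise scoring then
# reduces to comparing precomputed embedding vectors.
bi_encoder_passes = n  # 10,000
```

This quadratic-versus-linear gap in expensive model calls is what makes precomputed sentence embeddings attractive for large-scale similarity search.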
In this work, we present a simple, effective multi-task learning framework for sentence representations that combines the inductive biases of diverse training objectives in a single model.
Ranked #1 on Semantic Textual Similarity on SentEval