Natural language inference is the task of determining whether a "hypothesis" is true (entailment), false (contradiction), or undetermined (neutral) given a "premise".
| Premise | Label | Hypothesis |
| --- | --- | --- |
| A man inspects the uniform of a figure in some East Asian country. | contradiction | The man is sleeping. |
| An older and younger man smiling. | neutral | Two men are smiling and laughing at the cats playing on the floor. |
| A soccer game with multiple males playing. | entailment | Some men are playing a sport. |
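As a concrete illustration, the sketch below runs one of the example pairs through a pretrained NLI classifier. It assumes the Hugging Face `transformers` library and the `roberta-large-mnli` checkpoint, neither of which is prescribed by this page; the label ordering shown is that checkpoint's.

```python
# Sketch: three-way NLI classification of a premise/hypothesis pair.
# Assumes the Hugging Face `transformers` library and the `roberta-large-mnli`
# checkpoint (an assumption, not something this page prescribes).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

# Premise and hypothesis are encoded together as a single sequence pair.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Label order for this checkpoint: contradiction, neutral, entailment.
labels = ["contradiction", "neutral", "entailment"]
print(labels[logits.argmax(dim=-1).item()])  # expected: entailment
```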
Benchmarks: per-dataset leaderboard of the current best method, with links to the paper, code, and result comparisons.
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
SOTA for Common Sense Reasoning on SWAG
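For readers who want to see what "bidirectional encoder representations" means in practice, here is a minimal sketch that loads a pretrained BERT encoder and extracts contextual token vectors. It assumes the Hugging Face `transformers` library and the `bert-base-uncased` checkpoint, which are not part of the excerpt above.

```python
# Sketch: bidirectional contextual token representations from a pretrained BERT
# encoder. Assumes Hugging Face `transformers` and the `bert-base-uncased`
# checkpoint (assumptions made for illustration).
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("A man inspects the uniform of a figure.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Every token's vector is conditioned on both its left and right context.
token_embeddings = outputs.last_hidden_state  # shape: (1, seq_len, 768)
print(token_embeddings.shape)
```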
Language model pretraining has led to significant performance gains, but careful comparison between different approaches is challenging.
SOTA for Question Answering on SQuAD2.0 dev (using extra training data)
We measure the performance of CamemBERT compared to multilingual models in multiple downstream tasks, namely part-of-speech tagging, dependency parsing, named-entity recognition, and natural language inference.
As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models on the edge and/or under constrained computational training or inference budgets remains challenging.
#5 best model for Semantic Textual Similarity on MRPC
Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks.
SOTA for Natural Language Inference on QNLI
With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.
We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy).
#2 best model for Sentiment Analysis on SST-5 Fine-grained classification
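The polysemy point above can be made concrete: a contextual encoder assigns different vectors to the same word in different sentences. ELMo itself is not used in the sketch below; as an assumed stand-in it reuses the `bert-base-uncased` checkpoint via Hugging Face `transformers`.

```python
# Sketch: the same word gets different contextual vectors in different sentences.
# ELMo is not used here; `bert-base-uncased` via Hugging Face `transformers`
# serves as an assumed stand-in contextual encoder.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Contextual vector of the first occurrence of `word` in `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

# The two occurrences of "bank" get noticeably different vectors.
v_river = word_vector("He sat on the bank of the river.", "bank")
v_money = word_vector("She deposited money at the bank.", "bank")
print(torch.cosine_similarity(v_river, v_money, dim=0).item())
```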
Recently, pre-trained models have achieved state-of-the-art results in various language understanding tasks, which indicates that pre-training on large-scale corpora may play a crucial role in natural language processing.
#5 best model for Question Answering on Quora Question Pairs
In this technical report, we adapt whole word masking to Chinese text: the whole word is masked instead of individual Chinese characters, which introduces an additional challenge for the Masked Language Model (MLM) pre-training task.
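A minimal sketch of the whole word masking idea: once any piece of a word is selected, every piece of that word is masked. The sketch recovers word boundaries from WordPiece's "##" continuation prefix; for Chinese text, the boundaries would instead come from an external word segmenter, and the example tokens below are hypothetical.

```python
# Sketch of whole word masking: if any WordPiece of a word is chosen for
# masking, mask every piece of that word. Word boundaries are recovered from
# the "##" continuation prefix; for Chinese, an external word segmenter would
# supply them instead. The tokens below are hypothetical.
import random

def whole_word_mask(tokens, mask_prob=0.15, mask_token="[MASK]"):
    # Group WordPiece indices into whole words ("##" marks a continuation piece).
    words = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and words:
            words[-1].append(i)
        else:
            words.append([i])
    masked = list(tokens)
    for piece_indices in words:
        if random.random() < mask_prob:
            for i in piece_indices:  # mask the whole word, not a single piece
                masked[i] = mask_token
    return masked

random.seed(0)
print(whole_word_mask(["the", "man", "is", "un", "##aff", "##able"], mask_prob=0.5))
```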