Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

ACL 2019 huggingface/transformers

Transformers have the potential to learn longer-term dependencies, but are limited by a fixed-length context in the setting of language modeling.

LANGUAGE MODELLING

XLNet: Generalized Autoregressive Pretraining for Language Understanding

NeurIPS 2019 huggingface/transformers

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

DOCUMENT RANKING LANGUAGE MODELLING NATURAL LANGUAGE INFERENCE QUESTION ANSWERING READING COMPREHENSION SEMANTIC TEXTUAL SIMILARITY SENTIMENT ANALYSIS TEXT CLASSIFICATION

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

ACL 2020 huggingface/transformers

We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.

DENOISING MACHINE TRANSLATION NATURAL LANGUAGE INFERENCE QUESTION ANSWERING TEXT GENERATION
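The two noising functions BART finds most effective can be sketched in plain Python. This is a simplified, dependency-free illustration (the paper samples span lengths from a Poisson distribution with λ = 3 and masks about 30% of tokens; here a uniform draw stands in for Poisson, and the function names are illustrative, not from the library):

```python
import random

def shuffle_sentences(sentences, rng):
    """Sentence permutation: randomly reorder the document's sentences."""
    shuffled = sentences[:]
    rng.shuffle(shuffled)
    return shuffled

def infill_spans(tokens, rng, mask_token="<mask>", mask_rate=0.3, mean_span=3):
    """Text infilling: replace contiguous spans with a SINGLE mask token each,
    so the model must also predict how many tokens are missing.
    Simplified sketch: span lengths drawn uniformly instead of Poisson(3)."""
    out, i = [], 0
    while i < len(tokens):
        # trigger a span roughly often enough to corrupt ~mask_rate of tokens
        if rng.random() < mask_rate / mean_span:
            span_len = rng.randrange(0, 2 * mean_span + 1)  # may be 0-length
            out.append(mask_token)  # one mask regardless of span length
            i += span_len
        else:
            out.append(tokens[i])
            i += 1
    return out

rng = random.Random(0)
doc = ["The cat sat.", "It was warm.", "Then it slept."]
tokens = "the quick brown fox jumps over the lazy dog".split()
noised = infill_spans(shuffle_sentences(doc, rng) and tokens, rng)
```

Note that a zero-length span still inserts a mask token, which forces the model to learn when nothing at all is missing.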

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

ICLR 2020 huggingface/transformers

Then, instead of training a model that predicts the original identities of the corrupted tokens, we train a discriminative model that predicts whether each token in the corrupted input was replaced by a generator sample or not.

LANGUAGE MODELLING NATURAL LANGUAGE UNDERSTANDING
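The replaced-token-detection objective described above can be sketched with a toy corruption step. In real ELECTRA the generator is a small masked language model; here a random sampler stands in for it, and all names are illustrative:

```python
import random

def corrupt_with_generator(tokens, vocab, rng, replace_prob=0.15):
    """Toy stand-in for ELECTRA's generator: replace a random subset of
    positions with samples from the vocabulary. A real generator is a
    small masked LM, so its samples are plausible in context."""
    corrupted = tokens[:]
    for i in range(len(tokens)):
        if rng.random() < replace_prob:
            corrupted[i] = rng.choice(vocab)
    return corrupted

def replaced_token_labels(original, corrupted):
    """Discriminator targets over ALL positions: 1 if the token was
    replaced, 0 if it matches the input. Note a generator sample that
    happens to equal the original counts as 'not replaced'."""
    return [int(o != c) for o, c in zip(original, corrupted)]

rng = random.Random(0)
tokens = "the chef cooked the meal".split()
vocab = ["the", "a", "chef", "ate", "meal", "cooked", "dog"]
corrupted = corrupt_with_generator(tokens, vocab, rng)
labels = replaced_token_labels(tokens, corrupted)
```

Because the loss is defined over every input position rather than the ~15% that were masked, this objective is markedly more sample-efficient than masked language modeling.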

Reformer: The Efficient Transformer

ICLR 2020 huggingface/transformers

Large Transformer models routinely achieve state-of-the-art results on a number of tasks but training these models can be prohibitively costly, especially on long sequences.

LANGUAGE MODELLING

ColBERT: Using BERT Sentence Embedding for Humor Detection

27 Apr 2020 huggingface/transformers

In this paper, we describe a novel approach for detecting humor in short texts using BERT sentence embedding.

HUMOR DETECTION SENTENCE EMBEDDING

FlauBERT: Unsupervised Language Model Pre-training for French

LREC 2020 huggingface/transformers

Language models have become a key step in achieving state-of-the-art results in many different Natural Language Processing (NLP) tasks.

LANGUAGE MODELLING NATURAL LANGUAGE INFERENCE TEXT CLASSIFICATION WORD SENSE DISAMBIGUATION