Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

ACL 2019 huggingface/pytorch-transformers

Transformers have the potential to learn longer-term dependencies, but are limited by a fixed-length context in the language modeling setting.

LANGUAGE MODELLING
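
Below is a minimal sketch of the key idea, segment-level recurrence: cached hidden states (mems) from earlier segments are fed back in so the model attends beyond a single fixed-length window. It assumes the TransfoXL classes and the 'transfo-xl-wt103' checkpoint distributed with huggingface/pytorch-transformers; the exact return layout is based on that library's documented usage, not verified here.

```python
import torch
from pytorch_transformers import TransfoXLTokenizer, TransfoXLLMHeadModel

tokenizer = TransfoXLTokenizer.from_pretrained('transfo-xl-wt103')
model = TransfoXLLMHeadModel.from_pretrained('transfo-xl-wt103')
model.eval()

text = "Transformer-XL reuses hidden states from previous segments as memory."
ids = torch.tensor([tokenizer.encode(text)])

mems = None  # cached hidden states; carrying them forward extends the effective context
with torch.no_grad():
    for segment in torch.split(ids, 8, dim=1):  # feed the text 8 tokens at a time
        # prediction_scores: [batch, seg_len, vocab]; mems are returned and passed back in
        prediction_scores, mems = model(segment, mems=mems)[:2]
```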

RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 Jul 2019 huggingface/pytorch-transformers

Language model pretraining has led to significant performance gains, but careful comparison between different approaches is challenging.

SOTA for Question Answering on SQuAD2.0 dev (using extra training data)

LANGUAGE MODELLING · LEXICAL SIMPLIFICATION · NATURAL LANGUAGE INFERENCE · QUESTION ANSWERING · READING COMPREHENSION · SEMANTIC TEXTUAL SIMILARITY · SENTIMENT ANALYSIS
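
A minimal sketch, assuming the RoBERTa classes and the 'roberta-base' checkpoint available in huggingface/pytorch-transformers: it only extracts the contextual representations on which GLUE/SQuAD-style task heads are trained, and the add_special_tokens argument is an assumption based on that library's documented tokenizer API.

```python
import torch
from pytorch_transformers import RobertaTokenizer, RobertaModel

tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
model = RobertaModel.from_pretrained('roberta-base')
model.eval()

# Encode one sentence with the <s> ... </s> special tokens RoBERTa expects
ids = torch.tensor([tokenizer.encode("RoBERTa retunes BERT's pretraining recipe.",
                                     add_special_tokens=True)])
with torch.no_grad():
    last_hidden_state = model(ids)[0]   # [batch, seq_len, hidden_size]
print(last_hidden_state.shape)
```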

Phrase-Based & Neural Unsupervised Machine Translation

EMNLP 2018 huggingface/pytorch-transformers

Machine translation systems achieve near human-level performance for some language pairs, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs.

UNSUPERVISED MACHINE TRANSLATION
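
The core of the approach, for both the phrase-based and the neural variants, is iterative back-translation over monolingual corpora. The sketch below is purely conceptual: translate, train_step, and the two corpora are hypothetical placeholders, not pytorch-transformers APIs.

```python
def unsupervised_mt_round(model_src2tgt, model_tgt2src,
                          mono_src, mono_tgt, translate, train_step):
    # 1) Back-translate monolingual target text into (noisy) synthetic source sentences.
    synthetic_src = [translate(model_tgt2src, t) for t in mono_tgt]
    # 2) Train the src->tgt model on the synthetic pairs, with real target text as labels.
    train_step(model_src2tgt, pairs=list(zip(synthetic_src, mono_tgt)))
    # 3) Do the symmetric update for the tgt->src direction.
    synthetic_tgt = [translate(model_src2tgt, s) for s in mono_src]
    train_step(model_tgt2src, pairs=list(zip(synthetic_tgt, mono_src)))
    # Repeating these rounds lets the two directions improve each other
    # without ever seeing a human-aligned parallel sentence.
```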

Language Models are Unsupervised Multitask Learners

Preprint 2019 huggingface/pytorch-transformers

Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.

SOTA for Language Modelling on Text8 (using extra training data)

COMMON SENSE REASONING · DOCUMENT SUMMARIZATION · LANGUAGE MODELLING · MACHINE TRANSLATION · QUESTION ANSWERING · READING COMPREHENSION · TEXT GENERATION
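
A minimal sketch of zero-shot-style greedy text generation with the GPT-2 classes in huggingface/pytorch-transformers; the 'gpt2' checkpoint name, prompt, and decoding loop are illustrative assumptions, not the paper's evaluation setup.

```python
import torch
from pytorch_transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
model.eval()

ids = torch.tensor([tokenizer.encode("Machine translation is the task of")])
with torch.no_grad():
    for _ in range(20):                      # greedy decoding, one token at a time
        logits = model(ids)[0]               # [batch, seq_len, vocab]
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=1)

print(tokenizer.decode(ids[0].tolist()))
```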

Efficient softmax approximation for GPUs

ICML 2017 huggingface/pytorch-transformers

We propose an approximate strategy to efficiently train neural network based language models over very large vocabularies.
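
PyTorch ships an implementation of this adaptive softmax as torch.nn.AdaptiveLogSoftmaxWithLoss. The sketch below uses it with made-up toy dimensions and cutoffs to show the idea: the most frequent words live in a small head cluster, while rarer words fall into progressively larger but lower-dimensional tail clusters.

```python
import torch
import torch.nn as nn

hidden_size, vocab_size = 256, 50000
adaptive_softmax = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=hidden_size,
    n_classes=vocab_size,
    cutoffs=[2000, 10000],   # head = 2k most frequent words, then two tail clusters
    div_value=4.0,           # each successive tail cluster uses a 4x smaller projection
)

hidden = torch.randn(32, hidden_size)            # e.g. RNN/Transformer hidden states
targets = torch.randint(0, vocab_size, (32,))    # next-word indices
output = adaptive_softmax(hidden, targets)       # returns (per-example log-probs, mean loss)
print(output.loss)
```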
