We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
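As an illustration only, here is a minimal sketch of extracting contextual token embeddings from a pre-trained BERT checkpoint via the Hugging Face transformers package; the package and checkpoint name are assumptions, not part of the paper's own release:

```python
# Minimal sketch, assuming the Hugging Face `transformers` package and the
# publicly released bert-base-uncased checkpoint; not the paper's own code.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("BERT reads text bidirectionally.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per wordpiece: shape (batch, seq_len, hidden_size).
print(outputs.last_hidden_state.shape)
```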
We present FLAIR, an NLP framework designed to facilitate training and distribution of state-of-the-art sequence labeling, text classification, and language models.
We make all code and pre-trained models available to the research community for use and reproduction.
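As a usage illustration, a minimal sketch following the library's documented pattern of loading a pre-trained tagger and predicting over a Sentence; the "ner" model name refers to FLAIR's published English NER tagger:

```python
# Minimal sketch of FLAIR's documented tagging workflow; the input
# sentence is a made-up example.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("ner")  # pre-trained English NER tagger
sentence = Sentence("George Washington went to Washington.")
tagger.predict(sentence)

for entity in sentence.get_spans("ner"):
    print(entity)  # detected spans with labels and confidences
```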
fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks.
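As an illustration, a minimal sketch of loading one of fairseq's published pre-trained translation models through torch.hub; the model name and tokenizer/BPE arguments follow fairseq's model-zoo examples and are assumptions here:

```python
# Minimal sketch, assuming fairseq's published WMT19 en-de torch.hub
# entry; downloads weights on first use.
import torch

en2de = torch.hub.load(
    "pytorch/fairseq",
    "transformer.wmt19.en-de.single_model",
    tokenizer="moses",
    bpe="fastbpe",
)
en2de.eval()
print(en2de.translate("Machine translation is useful."))
```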
Although the Transformer has achieved great success on many NLP tasks, its heavyweight structure with fully connected attention makes it dependent on large amounts of training data.
Lexically constrained sequence decoding allows explicit positive or negative phrase-based constraints to be placed on target output strings in generation tasks such as machine translation or monolingual text rewriting.
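To make the idea concrete, here is a toy sketch (not any particular library's API) of the core mechanism behind positive constraints: when pruning the beam, hypotheses are ranked first by how many constraint phrases they already cover, then by model score:

```python
# Toy sketch of positive lexically constrained beam pruning; all names
# here are illustrative, not a real decoder's API.
from dataclasses import dataclass

@dataclass
class Hypothesis:
    tokens: list[str]   # target tokens generated so far
    logprob: float      # cumulative model score

def coverage(hyp: Hypothesis, constraints: list[str]) -> int:
    """Count constraint phrases already present in the hypothesis."""
    text = " ".join(hyp.tokens)
    return sum(phrase in text for phrase in constraints)

def prune_beam(candidates, constraints, beam_size):
    # Rank by constraint coverage first and model score second, so that
    # hypotheses satisfying constraints are not pruned in favor of
    # unconstrained ones with marginally better scores.
    ranked = sorted(
        candidates,
        key=lambda h: (coverage(h, constraints), h.logprob),
        reverse=True,
    )
    return ranked[:beam_size]
```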
Pre-trained word vectors are ubiquitous in Natural Language Processing applications.
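For concreteness, a minimal sketch of loading such vectors from the plain-text format used by GloVe and fastText .vec files; the file name is a placeholder:

```python
# Minimal sketch; the path is a placeholder for any text-format
# embedding file with one "word v1 v2 ... vd" entry per line.
import numpy as np

def load_vectors(path: str, max_words: int | None = None) -> dict:
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            if max_words is not None and len(vectors) >= max_words:
                break
            parts = line.rstrip().split(" ")
            if len(parts) < 10:  # skip fastText's "count dim" header line
                continue
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

emb = load_vectors("wiki-news-300d-1M.vec", max_words=50_000)
print(emb["language"].shape)  # e.g. (300,)
```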
The generalization and reliability of multilingual translation often depend heavily on the amount of parallel data available for each language pair of interest.
Neural network models for many NLP tasks have grown increasingly complex in recent years, making training and deployment more difficult.
In this paper, we describe compare-mt, a tool for holistic analysis and comparison of the results of systems for language generation tasks such as machine translation.
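As a usage illustration, a minimal sketch that drives compare-mt's documented command-line interface (a reference file followed by one or more system output files) from Python; the file names are placeholders:

```python
# Minimal sketch; compare-mt is invoked as `compare-mt ref sys1 sys2 ...`.
# The file names below are placeholders.
import subprocess

subprocess.run(
    ["compare-mt", "ref.eng", "sys1.eng", "sys2.eng"],
    check=True,  # raise if the tool exits with an error
)
```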