In this paper, we present HuggingFace's Transformers, a library for state-of-the-art NLP that makes recent pretraining advances available to the community by gathering general-purpose pretrained models under a unified API, together with an ecosystem of libraries, examples, tutorials, and scripts targeting many downstream NLP tasks.
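As a concrete illustration of the unified API described above, the sketch below loads a pretrained model and its tokenizer by name and runs a single forward pass. It assumes a recent version of the transformers library; the checkpoint name and example sentence are arbitrary choices for this sketch, not anything prescribed by the paper.

    # Minimal sketch of the unified API: the same Auto* classes work across
    # architectures (BERT, GPT-2, RoBERTa, ...) by switching the checkpoint name.
    import torch
    from transformers import AutoTokenizer, AutoModel

    checkpoint = "bert-base-uncased"   # arbitrary example checkpoint
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)

    inputs = tokenizer("Pretrained models behind one API.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Contextual embeddings for each input token: (batch, seq_len, hidden_size)
    print(outputs.last_hidden_state.shape)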
This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding.
We present GluonCV and GluonNLP, deep learning toolkits for computer vision and natural language processing, respectively, based on Apache MXNet (incubating).
Language model pretraining has led to significant performance gains, but careful comparison between different approaches is challenging.
SOTA for Question Answering on SQuAD2.0 dev (using extra training data)
Transformers have the potential to learn longer-term dependencies, but they are limited by a fixed-length context in the setting of language modeling; a toy sketch of one way past this limit follows below.
SOTA for Language Modelling on Hutter Prize
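One way past the fixed-length limit noted above is segment-level recurrence: hidden states from earlier segments are cached and attended over without backpropagating through them. The PyTorch sketch below is a simplified, hand-rolled illustration of that idea, not any library's implementation; the class name, sizes, and data are made up for the example, and causal masking and relative positions are omitted.

    import torch
    import torch.nn as nn

    class RecurrentSegmentAttention(nn.Module):
        # Toy single-head attention that also attends over a cached memory of
        # previous-segment hidden states (simplified: no mask, no positions).
        def __init__(self, d_model):
            super().__init__()
            self.q = nn.Linear(d_model, d_model)
            self.k = nn.Linear(d_model, d_model)
            self.v = nn.Linear(d_model, d_model)

        def forward(self, x, memory=None):
            # x: (batch, seg_len, d_model); memory: (batch, mem_len, d_model) or None
            context = x if memory is None else torch.cat([memory, x], dim=1)
            scores = self.q(x) @ self.k(context).transpose(1, 2) / x.size(-1) ** 0.5
            out = torch.softmax(scores, dim=-1) @ self.v(context)
            # Cache the current segment as memory, detached so gradients never
            # flow back into earlier segments (stop-gradient recurrence).
            return out, x.detach()

    layer = RecurrentSegmentAttention(d_model=16)
    stream = torch.randn(1, 32, 16)           # one "long" sequence
    memory = None
    for segment in stream.split(8, dim=1):    # processed 8 tokens at a time
        out, memory = layer(segment, memory)  # each step also sees the cached past
    print(out.shape)                          # torch.Size([1, 8, 16])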
With the capability of modeling bidirectional contexts, denoising-autoencoding-based pretraining such as BERT achieves better performance than pretraining approaches based on autoregressive language modeling.
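The contrast between the two objectives is easy to see on a toy token sequence: a denoising (masked) objective corrupts some positions and predicts them from both sides, while an autoregressive objective predicts each token from its left context only. The sketch below merely constructs the two kinds of (input, target) pairs; the sequence and the masked positions are arbitrary.

    tokens = ["the", "cat", "sat", "on", "the", "mat"]

    # Denoising objective (masked LM, BERT-style): corrupt some positions and
    # predict them using context from BOTH sides of each masked slot.
    masked_positions = [1, 4]                                  # arbitrary choice
    masked_input = ["[MASK]" if i in masked_positions else t
                    for i, t in enumerate(tokens)]
    masked_targets = {i: tokens[i] for i in masked_positions}
    print(masked_input)    # ['the', '[MASK]', 'sat', 'on', '[MASK]', 'mat']
    print(masked_targets)  # {1: 'cat', 4: 'the'}

    # Autoregressive objective (left-to-right LM): at each position, predict
    # the next token from the left context only.
    for i in range(1, len(tokens)):
        print(tokens[:i], "->", tokens[i])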
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers; a short masked-prediction sketch illustrating the bidirectional idea follows below.
SOTA for Common Sense Reasoning on SWAG
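To make the "bidirectional" point concrete, the sketch below (assuming the transformers library and the bert-base-uncased checkpoint are available; the sentence is an arbitrary example) fills in a [MASK] token, with the prediction conditioned on words both to the left and to the right of the blank.

    import torch
    from transformers import BertForMaskedLM, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")

    # Both the left context ("The capital of France") and the right context
    # ("is a beautiful city") inform the prediction at the masked position.
    inputs = tokenizer("The capital of France, [MASK], is a beautiful city.",
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    predicted_id = logits[0, mask_index].argmax().item()
    print(tokenizer.decode([predicted_id]))  # expected to print "paris"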
As transfer learning from large-scale pretrained models becomes more prevalent in Natural Language Processing (NLP), operating these large models on the edge and/or under constrained computational training or inference budgets remains challenging; a brief parameter-count sketch follows below.
#5 best model for Semantic Textual Similarity on MRPC
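A simple way to make the budget point concrete is to compare parameter counts between a distilled checkpoint and its teacher. The sketch below does this with the transformers library; the checkpoint names are assumed to be downloadable, and exact counts depend on the library version.

    from transformers import AutoModel

    def parameter_count(name):
        # Load a pretrained checkpoint and count its parameters.
        model = AutoModel.from_pretrained(name)
        return sum(p.numel() for p in model.parameters())

    # The distilled model trades a small amount of accuracy for a substantially
    # smaller parameter budget, which is what eases constrained deployment.
    for name in ("bert-base-uncased", "distilbert-base-uncased"):
        print(f"{name}: {parameter_count(name) / 1e6:.0f}M parameters")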
Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs.
#2 best model for Machine Translation on WMT2016 English-Russian