Language modeling is the task of predicting the next word or character in a document.
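Next-word prediction can be illustrated with a minimal count-based bigram model; the toy corpus and the greedy argmax prediction below are illustrative assumptions, not the method of any paper listed here.

```python
# Minimal sketch of next-word prediction with a bigram count model.
# Corpus and prediction rule are illustrative assumptions only.
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count, for each word, how often each other word follows it."""
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        bigrams[prev][nxt] += 1
    return bigrams

def predict_next(bigrams, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = bigrams.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often here
```

Modern approaches replace the counts with a neural network that assigns a probability distribution over the vocabulary given the preceding context, but the prediction task itself is the same.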
In this work we explore recent advances in Recurrent Neural Networks for large scale Language Modeling, a task central to language understanding.
#8 best model for Language Modelling on One Billion Word
We propose a new benchmark corpus to be used for measuring progress in statistical language modeling.
#14 best model for Language Modelling on One Billion Word
Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.
Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.
SOTA for Language Modelling on Text8 (using extra training data)
Transformers have the potential to learn longer-term dependencies, but are limited by a fixed-length context in the setting of language modeling.
SOTA for Language Modelling on Hutter Prize
Feed-forward and convolutional architectures have recently been shown to achieve superior results on some sequence modeling tasks such as machine translation, with the added advantage that they concurrently process all inputs in the sequence, leading to easy parallelization and faster training times.
#10 best model for Machine Translation on WMT2014 English-German
We propose to improve the representation in sequence models by augmenting current approaches with an autoencoder that is forced to compress the sequence through an intermediate discrete latent space.
Recent advances in language modeling using recurrent neural networks have made it viable to model language as distributions over characters.
SOTA for Chunking on Penn Treebank
We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy).
#3 best model for Sentiment Analysis on SST-5 Fine-grained classification