Exploring the Limits of Language Modeling

7 Feb 2016  ·  Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, Yonghui Wu ·

In this work we explore recent advances in Recurrent Neural Networks for large-scale Language Modeling, a task central to language understanding. We extend current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and the complex, long-term structure of language. We perform an exhaustive study of techniques such as character-level Convolutional Neural Networks and Long Short-Term Memory, on the One Billion Word Benchmark. Our best single model significantly improves state-of-the-art perplexity from 51.3 down to 30.0 (whilst reducing the number of parameters by a factor of 20), while an ensemble of models sets a new record by improving perplexity from 41.0 down to 23.7. We also release these models for the NLP and ML community to study and improve upon.
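As a rough illustration of the architecture family studied here (a character-level CNN producing word representations that feed a large projected LSTM, decoded with a softmax over the word vocabulary; the "LSTM-8192-1024" entries below read as 8192 LSTM units projected to 1024 dimensions), the following is a minimal PyTorch sketch. The module names, filter configuration, layer count, and the use of a full softmax are illustrative assumptions, not the released model, which is trained with a sampled softmax over the ~800k-word vocabulary.

```python
# Minimal sketch (assumed hyperparameters): character-CNN word encoder feeding a
# projected LSTM language model. Not the authors' released implementation.
import torch
import torch.nn as nn

class CharCNNWordEncoder(nn.Module):
    """Embed each word from its characters: char embeddings -> 1-D convs -> max-pool -> projection."""
    def __init__(self, out_dim, n_chars=256, char_dim=16,
                 filters=((1, 32), (2, 64), (3, 128), (4, 256), (5, 256))):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.convs = nn.ModuleList([nn.Conv1d(char_dim, n, k) for k, n in filters])
        self.proj = nn.Linear(sum(n for _, n in filters), out_dim)

    def forward(self, char_ids):                                     # (batch, seq, max_word_len)
        b, s, w = char_ids.shape                                     # max_word_len >= widest kernel
        x = self.char_emb(char_ids.view(b * s, w)).transpose(1, 2)   # (b*s, char_dim, w)
        feats = [conv(x).max(dim=2).values for conv in self.convs]   # max-pool over character positions
        return self.proj(torch.cat(feats, dim=1)).view(b, s, -1)     # (b, s, out_dim)

class CharCNNLSTMLM(nn.Module):
    """Projected LSTM language model over a word vocabulary, fed by the character CNN."""
    def __init__(self, vocab_size, hidden_size=8192, proj_size=1024, num_layers=2):
        super().__init__()
        self.encoder = CharCNNWordEncoder(out_dim=proj_size)
        self.lstm = nn.LSTM(proj_size, hidden_size, num_layers,
                            proj_size=proj_size, batch_first=True)
        # Full softmax shown for simplicity; the paper instead trains with an
        # importance-sampled softmax (or a character-level decoder) for the large vocabulary.
        self.decoder = nn.Linear(proj_size, vocab_size)

    def forward(self, char_ids, state=None):                         # char_ids: (batch, seq, max_word_len)
        h, state = self.lstm(self.encoder(char_ids), state)
        return self.decoder(h), state                                # logits: (batch, seq, vocab_size)
```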

| Task | Dataset | Model | Metric | Value | Global Rank |
|------|---------|-------|--------|-------|-------------|
| Language Modelling | One Billion Word | 10 LSTM+CNN inputs + SNM10-SKIP (ensemble) | PPL | 23.7 | #8 |
| | | | Number of params | 43B | #1 |
| Language Modelling | One Billion Word | LSTM-8192-1024 + CNN Input | PPL | 30.0 | #16 |
| | | | Number of params | 1.04B | #1 |
| Language Modelling | One Billion Word | LSTM-8192-1024 | PPL | 30.6 | #17 |
| | | | Number of params | 1.8B | #1 |
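The PPL column is standard test perplexity: the exponential of the average per-word negative log-likelihood on the held-out set. The helper below (the name `perplexity` is ours, for illustration) just makes that relationship explicit.

```python
import math

def perplexity(total_neg_log_likelihood: float, num_words: int) -> float:
    """PPL = exp( (1/N) * sum_i -ln p(w_i | w_<i) ), with NLL in nats."""
    return math.exp(total_neg_log_likelihood / num_words)

# For example, an average of ~3.40 nats per word corresponds to PPL ~30.0,
# and ~3.17 nats per word to PPL ~23.7.
```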

Methods


Long Short-Term Memory (LSTM), character-level Convolutional Neural Network (CNN) inputs