HyperNetworks

27 Sep 2016 · David Ha, Andrew Dai, Quoc V. Le

This work explores hypernetworks: an approach in which one network, known as a hypernetwork, generates the weights for another network. Hypernetworks provide an abstraction similar to what is found in nature: the relationship between a genotype (the hypernetwork) and a phenotype (the main network). Though they are also reminiscent of HyperNEAT in evolutionary computation, our hypernetworks are trained end-to-end with backpropagation and are thus usually faster. The focus of this work is to make hypernetworks useful for deep convolutional networks and long recurrent networks, where hypernetworks can be viewed as a relaxed form of weight-sharing across layers. Our main result is that hypernetworks can generate non-shared weights for LSTMs and achieve near state-of-the-art results on a variety of sequence modelling tasks, including character-level language modelling, handwriting generation and neural machine translation, challenging the weight-sharing paradigm for recurrent networks. Our results also show that hypernetworks applied to convolutional networks achieve respectable results on image recognition tasks compared to state-of-the-art baseline models while requiring fewer learnable parameters.
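As a concrete illustration of the genotype/phenotype idea, here is a minimal PyTorch sketch, not the paper's exact architecture: a small hypernetwork maps a learned per-layer embedding to each main layer's weight matrix, so the layers share weights only "softly" through the generator, and gradients flow through the generated weights so everything trains end-to-end with backpropagation. All names and sizes (HyperNet, MainNet, z_dim, the two-layer setup) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperNet(nn.Module):
    """Generates an (out_dim x in_dim) weight matrix from an embedding z."""
    def __init__(self, z_dim, in_dim, out_dim):
        super().__init__()
        self.out_dim, self.in_dim = out_dim, in_dim
        self.proj = nn.Linear(z_dim, out_dim * in_dim)

    def forward(self, z):
        return self.proj(z).view(self.out_dim, self.in_dim)

class MainNet(nn.Module):
    """Two fully connected layers whose weights come from one shared hypernetwork."""
    def __init__(self, z_dim=4, dim=64):
        super().__init__()
        self.hyper = HyperNet(z_dim, dim, dim)        # one generator, shared across layers
        self.z = nn.Parameter(torch.randn(2, z_dim))  # one learned embedding per layer
        self.out = nn.Linear(dim, 10)

    def forward(self, x):
        for j in range(2):                  # generate each layer's (non-shared) weights
            w = self.hyper(self.z[j])
            x = torch.relu(F.linear(x, w))
        return self.out(x)

model = MainNet()
# Gradients flow through the generated weights into both the layer
# embeddings and the hypernetwork, so training is ordinary backprop.
loss = model(torch.randn(8, 64)).sum()
loss.backward()
```

The trainable parameters are the hypernetwork and the tiny per-layer embeddings rather than full per-layer weight matrices, which is why this can be read as a relaxed form of weight-sharing and can cut the learnable parameter count.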

Datasets

enwik8 · Penn Treebank (Character Level)

Results from Other Papers


| Task | Dataset | Model | Metric | Value | Rank |
|---|---|---|---|---|---|
| Language Modelling | enwik8 | Hypernetworks | Bit per Character (BPC) | 1.34 | #40 |
| Language Modelling | enwik8 | Hypernetworks | Number of params | 27M | #34 |
| Language Modelling | Penn Treebank (Character Level) | 2-layer Norm HyperLSTM | Bit per Character (BPC) | 1.219 | #14 |
| Language Modelling | Penn Treebank (Character Level) | 2-layer Norm HyperLSTM | Number of params | 14.4M | #7 |