How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections

24 Jun 2022  ·  Albert Gu, Isys Johnson, Aman Timalsina, Atri Rudra, Christopher Ré ·

Linear time-invariant state space models (SSMs) are a classical model class from engineering and statistics that has recently shown great promise in machine learning through the Structured State Space sequence model (S4). A core component of S4 involves initializing the SSM state matrix to a particular matrix called a HiPPO matrix, which was empirically important for S4's ability to handle long sequences. However, the specific matrix that S4 uses was actually derived in previous work for a particular time-varying dynamical system, and the use of this matrix as a time-invariant SSM had no known mathematical interpretation. Consequently, the theoretical mechanism by which S4 models long-range dependencies actually remains unexplained. We derive a more general and intuitive formulation of the HiPPO framework, which provides a simple mathematical interpretation of S4 as a decomposition onto exponentially-warped Legendre polynomials, explaining its ability to capture long-range dependencies. Our generalization introduces a theoretically rich class of SSMs that also lets us derive more intuitive S4 variants for other bases such as the Fourier basis, and explains other aspects of training S4, such as how to initialize the important timescale parameter. These insights improve S4's performance to 86% on the Long Range Arena benchmark, with 96% on the most difficult Path-X task.
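To make the setup concrete, here is a minimal NumPy sketch (not the paper's code) of the two ingredients the abstract refers to: the HiPPO-LegS matrix used to initialize the SSM state matrix A (with the companion input vector B), and a bilinear (Tustin) discretization that converts the continuous SSM x'(t) = Ax(t) + Bu(t) into a recurrence governed by the timescale parameter dt. The function names and the choice of dt are illustrative assumptions, not part of the paper.

```python
import numpy as np

def hippo_legs(N):
    # HiPPO-LegS state matrix (formula from the original HiPPO work):
    #   A[n, k] = -sqrt(2n+1) * sqrt(2k+1)  if n > k
    #   A[n, k] = -(n + 1)                  if n == k
    #   A[n, k] = 0                         if n < k
    # B[n] = sqrt(2n+1)
    A = np.zeros((N, N))
    for n in range(N):
        for k in range(N):
            if n > k:
                A[n, k] = -np.sqrt(2 * n + 1) * np.sqrt(2 * k + 1)
            elif n == k:
                A[n, k] = -(n + 1)
    B = np.sqrt(2 * np.arange(N) + 1.0).reshape(N, 1)
    return A, B

def discretize(A, B, dt):
    # Bilinear transform: (I - dt/2 A)^{-1} (I + dt/2 A), a standard way to
    # discretize a continuous LTI SSM; dt is the timescale parameter whose
    # initialization the paper analyzes.
    I = np.eye(A.shape[0])
    dA = np.linalg.solve(I - dt / 2 * A, I + dt / 2 * A)
    dB = np.linalg.solve(I - dt / 2 * A, dt * B)
    return dA, dB

# Illustrative usage: run the discretized recurrence x_{k+1} = dA x_k + dB u_k
# on a short input sequence (N and dt are arbitrary choices here).
A, B = hippo_legs(N=8)
dA, dB = discretize(A, B, dt=0.01)
x = np.zeros((8, 1))
for u in [1.0, 0.5, -0.25]:
    x = dA @ x + dB * u
```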



Results from the Paper

Task: Long-range modeling · Dataset: LRA · Model: S4

Metric         Value    Global Rank
ListOps        59.60    #10
Text           86.82    #11
Retrieval      90.90    #9
Image          88.65    #5
Pathfinder     94.20    #8
Pathfinder-X   96.35    #7
Avg            86.09    #7
