Video Prediction

183 papers with code • 19 benchmarks • 24 datasets

Video Prediction is the task of predicting future frames given past video frames.

Gif credit: MAGVIT

Source: Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames
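
As a rough illustration of the task definition above, the sketch below splits a clip into observed and future frames, predicts the future with a naive "copy the last observed frame" baseline, and scores it with per-frame mean squared error. The helper names (`make_clip`, `copy_last_frame`, `per_frame_mse`) and the synthetic moving-square clip are assumptions made for this example; they do not come from any benchmark or library listed on this page.

```python
import numpy as np

def make_clip(num_frames=8, size=16):
    """Hypothetical toy clip: a 4x4 bright square moving one pixel right per frame."""
    clip = np.zeros((num_frames, size, size, 1), dtype=np.float32)
    for t in range(num_frames):
        clip[t, 4:8, t:t + 4, 0] = 1.0
    return clip

def copy_last_frame(past, horizon):
    """Naive baseline: repeat the last observed frame `horizon` times.

    past has shape (T, H, W, C); the result has shape (horizon, H, W, C).
    """
    return np.repeat(past[-1:], horizon, axis=0)

def per_frame_mse(pred, target):
    """Mean squared error for each predicted frame, averaged over H, W, C."""
    return ((pred - target) ** 2).mean(axis=(1, 2, 3))

clip = make_clip()
past, future = clip[:5], clip[5:]          # condition on 5 frames, predict 3
pred = copy_last_frame(past, horizon=len(future))
errors = per_frame_mse(pred, future)       # grows as the square moves away
print(errors)
```

A learned model would replace `copy_last_frame`; the baseline is only useful as a floor, since its error increases with the prediction horizon whenever the scene moves.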

Benchmarks

These leaderboards track progress in Video Prediction.

Dataset	Best Model
KTH	Grid-keypoints
Moving MNIST	SimVP+gSTA-Sx10
Kinetics-600 12 frames, 64x64	MAGVIT-v2
Human3.6M	IAM4VP
BAIR Robot Pushing	MAGVIT (-L-FP)
Cityscapes 128x128	GHVAEs
SynpickVP	MSPred
CMU Mocap-2	Latent SDE
Cityscapes	DMVFN
KITTI	DMVFN
CMU Mocap-1	ODE2VAE-KL
DAVIS 2017	DMVFN
Vimeo90K	OPT
Colored dSprites	MGP-VAE (with geodesic loss)
Sprites	MGP-VAE (with geodesic loss)
YouTube-8M	SDCNet
KTH 64x64 cond10 pred30	SRVP
Something-Something V2	MAGVIT
MPI Sintel	MCnet [villegas2017mcnet]

Libraries

Use these libraries to find Video Prediction models and implementations.

chengtan9907/simvpv2 • 10 papers • 582 stars
chengtan9907/OpenSTL • 4 papers • 581 stars
Flunzmas/vp-suite • 3 papers
tensorflow/tensor2tensor • 2 papers • 14,919 stars

See all 6 libraries.

Most implemented papers

PredNet and Predictive Coding: A Critical Review

RoshanRane/segmentation-moving-MNIST • 14 Jun 2019

We fill in the gap by evaluating PredNet both as an implementation of the predictive coding theory and as a self-supervised video prediction model using a challenging video action classification dataset.

Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction

vincent-leguen/PhyDNet • CVPR 2020

Leveraging physical knowledge described by partial differential equations (PDEs) is an appealing way to improve unsupervised video prediction methods.

PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning

thuml/predrnn-pytorch • 17 Mar 2021

This paper models these structures by presenting PredRNN, a new recurrent network, in which a pair of memory cells are explicitly decoupled, operate in nearly independent transition manners, and finally form unified representations of the complex environment.

Video Diffusion Models

lucidrains/make-a-video-pytorch • 7 Apr 2022

Generating temporally coherent high fidelity video is an important milestone in generative modeling research.

SimVP: Simpler yet Better Video Prediction

gaozhangyang/simvp-simpler-yet-better-video-prediction • CVPR 2022

From CNN, RNN, to ViT, we have witnessed remarkable advancements in video prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated training strategies.

Video Prediction Models as Rewards for Reinforcement Learning

alescontrela/viper • NeurIPS 2023

A promising approach is to extract preferences for behaviors from unlabeled videos, which are widely available on the internet.

Expert Gate: Lifelong Learning with a Network of Experts

ContinualAI/avalanche • CVPR 2017

Further, the autoencoders inherently capture the relatedness of one task to another, based on which the most relevant prior model to be used for training a new expert, with fine-tuning or learning-without-forgetting, can be selected.

Predicting Deeper into the Future of Semantic Segmentation

facebookresearch/SegmPred • ICCV 2017

The ability to predict and therefore to anticipate the future is an important attribute of intelligence.

Learning to Generate Long-term Future via Hierarchical Prediction

rubenvillegas/icml2017hierchvid • ICML 2017

To avoid inherent compounding errors in recursive pixel-level prediction, we propose to first estimate high-level structure in the input frames, then predict how that structure evolves in the future, and finally by observing a single frame from the past and the predicted high-level structure, we construct the future frames without having to observe any of the pixel-level predictions.

Prediction Under Uncertainty with Error-Encoding Networks

mbhenaff/EEN • 14 Nov 2017

In this work we introduce a new framework for performing temporal predictions in the presence of uncertainty.
