Video Prediction

183 papers with code • 19 benchmarks • 24 datasets

Video Prediction is the task of predicting future frames given past video frames.

Gif credit: MAGVIT

Source: Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames
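
As a rough illustration of the task definition above, the sketch below splits a clip into past and future frames, predicts the future with a trivial copy-last-frame baseline, and scores each predicted frame by mean squared error. The baseline, the helper names (split_clip, copy_last_frame, frame_mse) and the toy clip are assumptions made for illustration only; they do not come from any model or benchmark listed on this page.

```python
from typing import List, Sequence, Tuple

Frame = List[List[float]]  # one grayscale frame as rows of pixel intensities


def split_clip(clip: Sequence[Frame], n_past: int) -> Tuple[List[Frame], List[Frame]]:
    """Split a clip into observed past frames and the future frames to predict."""
    if not 0 < n_past < len(clip):
        raise ValueError("n_past must leave at least one past and one future frame")
    return list(clip[:n_past]), list(clip[n_past:])


def copy_last_frame(past: Sequence[Frame], n_future: int) -> List[Frame]:
    """Hypothetical baseline: repeat the last observed frame n_future times."""
    last = past[-1]
    return [[row[:] for row in last] for _ in range(n_future)]


def frame_mse(pred: Frame, target: Frame) -> float:
    """Mean squared error between two frames of equal size."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count


# Hypothetical toy clip: a bright pixel moving one column right per frame on a 1x4 grid.
clip = [[[1.0 if col == t else 0.0 for col in range(4)]] for t in range(4)]
past, future = split_clip(clip, n_past=2)
predicted = copy_last_frame(past, n_future=len(future))
errors = [frame_mse(p, f) for p, f in zip(predicted, future)]
```

A learned model would replace copy_last_frame; the split and the per-frame error would stay the same, which is why a static baseline like this is often a useful sanity check before comparing stronger predictors.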

Benchmarks

These leaderboards are used to track progress in Video Prediction.

Dataset	Best Model
KTH	Grid-keypoints
Moving MNIST	SimVP+gSTA-Sx10
Kinetics-600 12 frames, 64x64	MAGVIT-v2
Human3.6M	IAM4VP
BAIR Robot Pushing	MAGVIT (-L-FP)
Cityscapes 128x128	GHVAEs
SynpickVP	MSPred
CMU Mocap-2	Latent SDE
Cityscapes	DMVFN
KITTI	DMVFN
CMU Mocap-1	ODE2VAE-KL
DAVIS 2017	DMVFN
Vimeo90K	OPT
Colored dSprites	MGP-VAE (with geodesic loss)
Sprites	MGP-VAE (with geodesic loss)
YouTube-8M	SDCNet
KTH 64x64 cond10 pred30	SRVP
Something-Something V2	MAGVIT
MPI Sintel	MCnet [villegas2017mcnet]

Libraries

Use these libraries to find Video Prediction models and implementations.

chengtan9907/simvpv2 • 10 papers • 573
chengtan9907/OpenSTL • 4 papers • 573
Flunzmas/vp-suite • 3 papers
tensorflow/tensor2tensor • 2 papers • 14,883

Latest papers with no code

State-space Decomposition Model for Video Prediction Considering Long-term Motion Trend

no code yet • 17 Apr 2024

In this paper, we propose a state-space decomposition stochastic video prediction model that decomposes the overall video frame generation into deterministic appearance prediction and stochastic motion prediction.

Predicting Long-horizon Futures by Conditioning on Geometry and Time

no code yet • 17 Apr 2024

To address both challenges, our key insight is to leverage the large-scale pretraining of image diffusion models which can handle multi-modality.

TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes

no code yet • 27 Mar 2024

To address this issue, we introduce a novel task called Target-Aware Aerial Video Prediction, aiming to simultaneously predict future scenes and motion states of the target.

Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes

no code yet • 20 Mar 2024

We propose a framework for probabilistic forecasting of dynamical systems based on generative modeling.

CarbonNet: How Computer Vision Plays a Role in Climate Change? Application: Learning Geomechanics from Subsurface Geometry of CCS to Mitigate Global Warming

no code yet • 9 Mar 2024

We introduce a new approach using computer vision to predict the land surface displacement from subsurface geometry images for Carbon Capture and Sequestration (CCS).

Rolling Diffusion Models

no code yet • 12 Feb 2024

Diffusion models have recently been increasingly applied to temporal data such as video, fluid mechanics simulations, or climate data.

Predicting the Future with Simple World Models

no code yet • 31 Jan 2024

Abstracting the dynamics of the environment with simple models can have several benefits.

A Survey on Video Prediction: From Deterministic to Generative Approaches

no code yet • 26 Jan 2024

Video prediction, a fundamental task in computer vision, aims to enable models to generate sequences of future frames based on existing video content.

Adversarial Augmentation Training Makes Action Recognition Models More Robust to Realistic Video Distribution Shifts

no code yet • 21 Jan 2024

More precisely, we created dataset splits of HMDB-51 or UCF-101 for training, and Kinetics-400 for testing, using the subset of the classes that are overlapping in both train and test datasets.

Key-point Guided Deformable Image Manipulation Using Diffusion Model

no code yet • 16 Jan 2024

In this paper, we introduce a Key-point-guided Diffusion probabilistic Model (KDM) that gains precise control over images by manipulating the object's key-point.
