Video Prediction
183 papers with code • 19 benchmarks • 24 datasets
Video Prediction is the task of predicting future frames given past video frames.
Gif credit: MAGVIT
Source: Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames
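A common reference point for this task is the persistence baseline, which simply repeats the last observed frame; learned models are typically judged by how far they beat it on pixel-level error. A minimal sketch in Python (the array shapes and the toy clip are illustrative assumptions, not from any listed paper):

```python
import numpy as np

def persistence_predict(past_frames: np.ndarray, horizon: int) -> np.ndarray:
    """Repeat the last observed frame `horizon` times.

    past_frames: context clip of shape (T, H, W, C).
    Returns predicted frames of shape (horizon, H, W, C).
    """
    last = past_frames[-1]
    return np.repeat(last[None, ...], horizon, axis=0)

def per_frame_mse(pred: np.ndarray, truth: np.ndarray) -> np.ndarray:
    """Pixel-level mean squared error for each predicted frame."""
    return ((pred - truth) ** 2).mean(axis=(1, 2, 3))

# Toy clip: a single bright pixel moving diagonally on an 8x8 canvas.
clip = np.zeros((6, 8, 8, 1), dtype=np.float32)
for t in range(6):
    clip[t, t, t, 0] = 1.0

pred = persistence_predict(clip[:4], horizon=2)
errors = per_frame_mse(pred, clip[4:])
print(errors)  # nonzero: the pixel keeps moving, persistence does not
```

Any real video prediction model would replace `persistence_predict` while keeping the same contract: past frames in, a fixed horizon of future frames out.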
Libraries
Use these libraries to find Video Prediction models and implementations.
Datasets
Latest papers
SkyGPT: Probabilistic Short-term Solar Forecasting Using Synthetic Sky Videos from Physics-constrained VideoGPT
Furthermore, we feed the generated future sky images from the video prediction models for 15-minute-ahead probabilistic solar forecasting for a 30-kW roof-top PV system, and compare it with an end-to-end deep learning baseline model SUNSET and a smart persistence model.
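The smart persistence baseline mentioned above is commonly defined as holding the clear-sky index (the ratio of measured to clear-sky power) constant over the forecast horizon. A hedged sketch of that idea, with hypothetical power values standing in for real measurements:

```python
def smart_persistence(pv_now_kw: float,
                      clearsky_now_kw: float,
                      clearsky_future_kw: float) -> float:
    """Smart persistence forecast: assume the clear-sky index
    (measured power / clear-sky power) stays constant over the
    forecast horizon. All inputs are instantaneous power in kW;
    the specific numbers below are illustrative assumptions.
    """
    clearsky_index = pv_now_kw / clearsky_now_kw
    return clearsky_index * clearsky_future_kw

# Example: a rooftop system producing 18 kW under a 24-kW clear-sky
# ceiling; 15 minutes ahead the clear-sky ceiling falls to 20 kW.
print(smart_persistence(18.0, 24.0, 20.0))  # 15.0
```

This captures why the baseline is "smart": unlike plain persistence, it follows the deterministic diurnal change in irradiance and only assumes cloud conditions persist.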
Fast Fourier Inception Networks for Occluded Video Prediction
Video prediction is a pixel-level task that generates future frames from historical frames.
DDLP: Unsupervised Object-Centric Video Prediction with Deep Dynamic Latent Particles
We propose a new object-centric video prediction algorithm based on the deep latent particle (DLP) representation.
Video Diffusion Models with Local-Global Context Guidance
We construct a local-global context guidance strategy to capture the multi-perceptual embedding of the past fragment to boost the consistency of future prediction.
Video Prediction Models as Rewards for Reinforcement Learning
A promising approach is to extract preferences for behaviors from unlabeled videos, which are widely available on the internet.
Let's Think Frame by Frame with VIP: A Video Infilling and Prediction Dataset for Evaluating Video Chain-of-Thought
Despite exciting recent results showing vision-language systems' capacity to reason about images using natural language, their capacity for video reasoning remains under-explored.
VDT: General-purpose Video Diffusion Transformers via Mask Modeling
We also propose a unified spatial-temporal mask modeling mechanism, seamlessly integrated with the model, to cater to diverse video generation scenarios.
PastNet: Introducing Physical Inductive Biases for Spatio-temporal Video Prediction
In this paper, we investigate the challenge of spatio-temporal video prediction, which involves generating future videos based on historical data streams.
A Control-Centric Benchmark for Video Prediction
Video is a promising source of knowledge for embodied agents to learn models of the world's dynamics.
Multi-modal learning for geospatial vegetation forecasting
Our study breaks new ground by introducing GreenEarthNet, the first dataset specifically designed for high-resolution vegetation forecasting, and Contextformer, a novel deep learning approach for predicting vegetation greenness at fine resolution across Europe from Sentinel-2 satellite images.