Optical Flow Estimation
655 papers with code • 10 benchmarks • 34 datasets
Optical Flow Estimation is a computer vision task that involves computing the apparent motion of objects between consecutive frames of a video sequence. The goal is to determine the displacement of pixels or features from one frame to the next, which can be used for applications such as object tracking, motion analysis, and video compression.
Approaches to optical flow estimation include correlation-based, block-matching, feature-tracking, energy-based, and gradient-based methods, and more recently learning-based methods such as RAFT.
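Of the approaches listed above, block matching is the simplest to illustrate. The following is a minimal sketch, not taken from any of the papers on this page: it splits the first frame into fixed-size blocks and, for each block, searches a small window in the second frame for the offset with the lowest sum of squared differences. The function names (`ssd`, `block_matching_flow`), the block size, the search radius, and the synthetic test frames are all assumptions chosen for illustration, and the code uses only the Python standard library to stay dependency-free.

```python
import random


def ssd(prev, curr, y0, x0, y1, x1, block):
    """Sum of squared differences between a block in prev and one in curr."""
    total = 0.0
    for i in range(block):
        for j in range(block):
            d = prev[y0 + i][x0 + j] - curr[y1 + i][x1 + j]
            total += d * d
    return total


def block_matching_flow(prev, curr, block=8, search=4):
    """Return a grid of (dx, dy) vectors, one per block of the first frame."""
    h, w = len(prev), len(prev[0])
    flow = []
    for y0 in range(0, h - block + 1, block):
        row = []
        for x0 in range(0, w - block + 1, block):
            best_cost, best_vec = float("inf"), (0, 0)
            # Exhaustively try every offset inside the search window.
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y1, x1 = y0 + dy, x0 + dx
                    # Skip candidates that fall outside the second frame.
                    if y1 < 0 or x1 < 0 or y1 + block > h or x1 + block > w:
                        continue
                    cost = ssd(prev, curr, y0, x0, y1, x1, block)
                    if cost < best_cost:
                        best_cost, best_vec = cost, (dx, dy)
            row.append(best_vec)
        flow.append(row)
    return flow


# Synthetic frames: random texture shifted 3 pixels right and 2 pixels down
# (with wraparound, so blocks near the right and bottom edges are unreliable).
random.seed(0)
H = W = 32
prev = [[random.random() for _ in range(W)] for _ in range(H)]
curr = [[prev[(y - 2) % H][(x - 3) % W] for x in range(W)] for y in range(H)]

flow = block_matching_flow(prev, curr)
print(flow[1][1])  # interior block: (3, 2), i.e. dx=3, dy=2
```

Exhaustive search like this scales with the square of the search radius, and it recovers only integer displacements at block resolution, which is part of the motivation for the gradient-based and learning-based methods listed above.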
Definition source: Devon: Deformable Volume Network for Learning Optical Flow
Latest papers
Rethinking Low-quality Optical Flow in Unsupervised Surgical Instrument Segmentation
Video-based surgical instrument segmentation plays an important role in robot-assisted surgeries.
LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding
Despite progress in video-language modeling, the computational challenge of interpreting long-form videos in response to task-specific linguistic queries persists, largely due to the complexity of high-dimensional video data and the misalignment between language and visual cues over space and time.
CREMA: Multimodal Compositional Video Reasoning via Efficient Modular Adaptation and Fusion
Furthermore, we propose a fusion module designed to compress multimodal queries, maintaining computational efficiency in the LLM while combining additional modalities.
Taylor Videos for Action Recognition
Addressing these challenges, we propose the Taylor video, a new video format that highlights the dominant motions (e.g., a waving hand) in each of its frames, which we call Taylor frames.
Recurrent Partial Kernel Network for Efficient Optical Flow Estimation
However, this impacts the widespread adoption of optical flow methods and makes it harder to train more general models since the optical flow data is hard to obtain.
Multimodal Action Quality Assessment
To leverage multimodal information for AQA, i.e., RGB, optical flow and audio information, we propose a Progressive Adaptive Multimodal Fusion Network (PAMFN) that separately models modality-specific information and mixed-modality information.
VONet: Unsupervised Video Object Learning With Parallel U-Net Attention and Object-wise Sequential VAE
Unsupervised video object learning seeks to decompose video scenes into structural object representations without any supervision from depth, optical flow, or segmentation.
Deep Linear Array Pushbroom Image Restoration: A Degradation Pipeline and Jitter-Aware Restoration Network
Both the proposed JARNet and LAP image synthesis pipeline establish a foundation for addressing this intricate challenge.
RomniStereo: Recurrent Omnidirectional Stereo Matching
To bridge the gap between OSM and RAFT, we mainly propose an opposite adaptive weighting scheme to seamlessly transform the outputs of spherical sweeping of OSM into the required inputs for the recurrent update, thus creating a recurrent omnidirectional stereo matching (RomniStereo) algorithm.
Rethinking RAFT for Efficient Optical Flow
To address these problems, this paper proposes a novel approach based on the RAFT framework.