Optical Flow Estimation is the problem of estimating per-pixel motion between consecutive images.
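The classic local approach to this problem is Lucas-Kanade: assume brightness constancy, linearize it, and solve a small least-squares system over a window around each pixel. The sketch below is a minimal single-pixel version in NumPy; the window size and boundary handling are illustrative choices, not from any particular paper.

```python
import numpy as np

def lucas_kanade(I1, I2, y, x, win=5):
    """Estimate the flow (u, v) at pixel (y, x) from frame I1 to I2
    by solving the brightness-constancy system Ix*u + Iy*v = -It
    in least squares over a win x win window (Lucas-Kanade)."""
    Iy, Ix = np.gradient(I1.astype(float))      # spatial gradients
    It = I2.astype(float) - I1.astype(float)    # temporal difference
    h = win // 2
    sl = (slice(y - h, y + h + 1), slice(x - h, x + h + 1))
    A = np.stack([Ix[sl].ravel(), Iy[sl].ravel()], axis=1)  # (win^2, 2)
    b = -It[sl].ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow  # (u, v) displacement in pixels

# toy check: a horizontal intensity ramp shifted one pixel to the right
I1 = np.tile(np.arange(20, dtype=float), (20, 1))
I2 = np.roll(I1, 1, axis=1)
u, v = lucas_kanade(I1, I2, 10, 10)
```

For the shifted ramp the recovered flow is (u, v) ≈ (1, 0), matching the one-pixel rightward shift.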
The objective of this paper is self-supervised learning from video, in particular learning representations for action recognition.
The problem of tracking self-motion as well as the motion of objects in the scene using information from a camera is known as multi-body visual odometry and is a challenging task.
Recently, the development of deep learning based methods has inspired new approaches to tackle the particle image velocimetry (PIV) problem.
The keys to success lie in the use of cost volume and coarse-to-fine flow inference.
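A cost volume stores, for every pixel and every candidate displacement, a matching cost between two feature maps; coarse-to-fine inference then searches it at increasing resolutions. The following is a minimal NumPy sketch of a correlation cost volume, with the displacement range and normalization chosen for illustration rather than taken from the paper.

```python
import numpy as np

def cost_volume(f1, f2, max_disp=2):
    """Correlation cost volume between feature maps f1, f2 of shape
    (C, H, W): for each displacement (dy, dx) with |dy|,|dx| <= max_disp,
    the per-pixel dot product of f1 with the shifted f2, averaged over C."""
    C, H, W = f1.shape
    d = 2 * max_disp + 1
    cv = np.zeros((d * d, H, W))
    f2p = np.pad(f2, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    for i, dy in enumerate(range(-max_disp, max_disp + 1)):
        for j, dx in enumerate(range(-max_disp, max_disp + 1)):
            shifted = f2p[:, max_disp + dy:max_disp + dy + H,
                             max_disp + dx:max_disp + dx + W]
            cv[i * d + j] = (f1 * shifted).sum(axis=0) / C
    return cv

# a single distinctive feature moved one pixel to the right: the
# argmax over displacements at its location recovers (dy, dx) = (0, 1)
f1 = np.zeros((4, 8, 8))
f1[:, 4, 4] = 1.0
f2 = np.roll(f1, 1, axis=2)
cv = cost_volume(f1, f2, max_disp=2)
k = int(np.argmax(cv[:, 4, 4]))
dy, dx = k // 5 - 2, k % 5 - 2
```

Taking the argmax over the displacement axis at each pixel gives an integer flow field, which coarse-to-fine schemes refine at higher resolutions.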
The system uses a low-light video feed processed in real-time by an optical-flow network, spatial and temporal networks, and a Support Vector Machine to identify shootings, assaults, and thefts.
An object's geocentric pose, defined as the height above ground and orientation with respect to gravity, is a powerful representation of real-world structure for object detection, segmentation, and localization tasks using RGBD images.
Special cameras that provide useful features for face anti-spoofing are desirable, but not always an option.
In this paper, we propose a fast but effective way to extract motion features from videos utilizing residual frames as the input data in 3D ConvNets.
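The core of the residual-frame idea is simple: subtract consecutive frames so that static appearance cancels and motion dominates the signal. A minimal sketch, assuming plain frame differencing; the paper's exact preprocessing (scaling, clipping, first-frame handling) may differ.

```python
import numpy as np

def residual_frames(clip):
    """Turn a clip of shape (T, H, W, C), uint8 frames, into T-1
    residual frames (signed frame-to-frame differences) that
    emphasize motion over static appearance."""
    clip = clip.astype(np.int16)     # avoid uint8 wrap-around on subtraction
    return clip[1:] - clip[:-1]

# a pixel that appears in frame 1 shows up in the residuals
clip = np.zeros((3, 4, 4, 1), dtype=np.uint8)
clip[1, 2, 2, 0] = 255
res = residual_frames(clip)
```

Static regions produce zero residuals, so a 3D ConvNet fed these inputs sees mostly motion cues.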
Video super-resolution (VSR) aims to restore a photo-realistic high-resolution (HR) video frame from both its corresponding low-resolution (LR) frame (reference frame) and multiple neighboring frames (supporting frames).
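In this setting each input unit pairs a reference frame with a temporal window of supporting frames. A hypothetical sketch of that grouping in NumPy, with edge frames handled by index clamping (an assumption, not necessarily what any specific VSR method does):

```python
import numpy as np

def vsr_input_windows(frames, n_support=2):
    """Group each LR reference frame with its n_support neighbors on
    each side (supporting frames); out-of-range indices are clamped
    to the clip boundaries. frames: array of shape (T, H, W, C).
    Returns shape (T, 2*n_support + 1, H, W, C)."""
    T = len(frames)
    windows = []
    for t in range(T):
        idx = np.clip(np.arange(t - n_support, t + n_support + 1), 0, T - 1)
        windows.append(np.stack([frames[i] for i in idx]))
    return np.stack(windows)

# 5 frames whose pixel value equals their frame index, for easy checking
frames = np.stack([np.full((2, 2, 1), t, dtype=float) for t in range(5)])
windows = vsr_input_windows(frames)
```

A VSR network would then map each window to one HR output frame centered on the reference.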