Learning Correspondence from the Cycle-Consistency of Time

CVPR 2019 · Xiaolong Wang, Allan Jabri, Alexei A. Efros

We introduce a self-supervised method for learning visual correspondence from unlabeled video. The main idea is to use cycle-consistency in time as a free supervisory signal for learning visual representations from scratch. At training time, our model learns a feature-map representation that is useful for performing cycle-consistent tracking. At test time, we use the acquired representation to find nearest neighbors across space and time. We demonstrate the generalizability of the representation, without any finetuning, across a range of visual correspondence tasks, including video object segmentation, keypoint tracking, and optical flow. Our approach outperforms previous self-supervised methods and performs competitively with strongly supervised methods.
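To make the idea concrete, below is a minimal, hypothetical PyTorch sketch of cycle-consistent tracking, not the authors' implementation: a soft-attention tracker follows a patch feature forward through a short clip and back again, and the loss penalizes failure to return to the starting feature. The encoder interface, the feature-space loss, and all names (soft_track_step, cycle_consistency_loss) are illustrative assumptions; the paper's tracker additionally localizes patches spatially and uses skip cycles, which are omitted here.

```python
# Minimal sketch of temporal cycle-consistency training (assumptions noted above).
import torch
import torch.nn.functional as F

def soft_track_step(query_feat, target_feats, temperature=0.07):
    """One soft tracking step: attend over `target_feats` by cosine similarity
    to `query_feat` and return the attended feature.

    query_feat:   (C,)      feature of the patch being tracked
    target_feats: (C, H, W) feature map of the frame being tracked into
    """
    C, H, W = target_feats.shape
    flat = F.normalize(target_feats.view(C, H * W), dim=0)     # (C, HW), unit columns
    sim = F.normalize(query_feat, dim=0) @ flat / temperature  # (HW,) similarities
    attn = F.softmax(sim, dim=0)                               # soft localization
    return flat @ attn                                         # (C,) attended feature

def cycle_consistency_loss(encoder, frames, y, x):
    """Track the patch at (y, x) in frames[0] forward through the clip and back
    again; the residual of the cycle is the loss. `encoder` is assumed to map a
    (1, 3, H, W) image to a (1, C, h, w) feature map.
    """
    feats = [encoder(f.unsqueeze(0)).squeeze(0) for f in frames]  # (C, h, w) each
    start = feats[0][:, y, x]                  # feature of the starting patch
    q = start
    for fmap in feats[1:] + feats[-2::-1]:     # forward in time, then backward
        q = soft_track_step(q, fmap)
    # A perfect cycle returns to where it began; deviation is the training signal.
    return F.mse_loss(q, F.normalize(start, dim=0))
```

Because each step is a softmax over feature similarities, the entire forward-backward path is differentiable, so the cycle error backpropagates through every tracking step into the encoder. At test time, the same learned features support simple nearest-neighbor matching across space and time, e.g. for propagating segmentation masks or keypoints from frame to frame.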

Results

Task: Semi-Supervised Video Object Segmentation
Dataset: DAVIS 2017 (val)
Model: CycleTime

Metric               Value   Global Rank
Jaccard (Mean)       46.4    #77
Jaccard (Recall)     50.0    #28
F-measure (Mean)     50.0    #78
F-measure (Recall)   48.0    #27
J&F                  48.7    #79
