Measuring the Effectiveness of Self-Supervised Learning using Calibrated Learning Curves

29 Sep 2021 · Andrei Atanov, Shijian Xu, Onur Beker, Andrey Filatov, Amir Zamir

Self-supervised learning has witnessed remarkable progress in recent years, in particular with the introduction of augmentation-based contrastive methods. While a number of large-scale empirical studies on the performance of self-supervised pre-training have been conducted, there is not yet an agreed-upon set of control baselines, evaluation practices, and metrics to report. We identify this as an important angle of investigation and propose an evaluation standard that aims to quantify and communicate transfer learning performance in an informative yet accessible setup. This is done by baking a number of key control baselines into the evaluation method, namely the blind guess (quantifying the dataset bias), the scratch model (quantifying the architectural contribution), and the gold standard (quantifying the upper bound). We further provide a number of experiments to demonstrate how the proposed evaluation can be employed in empirical studies of basic questions, for example, whether the effectiveness of existing self-supervised learning methods is skewed towards image classification versus other tasks, such as dense pixel-wise predictions.
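
For a classification task, the three control baselines admit a simple concrete reading. The sketch below is a minimal, hypothetical illustration, not the paper's actual implementation: the function names, the majority-class choice for the blind guess, and the linear rescaling between scratch and gold-standard accuracy are all assumptions made for clarity.

```python
import numpy as np

def blind_guess_accuracy(train_labels: np.ndarray, test_labels: np.ndarray) -> float:
    """Accuracy of an input-blind predictor that always outputs the most
    frequent training label. This quantifies the dataset (label) bias:
    any model must beat this floor to show it learned anything at all."""
    majority_class = np.bincount(train_labels).argmax()
    return float(np.mean(test_labels == majority_class))

def calibrated_score(model_acc: float, scratch_acc: float, gold_acc: float) -> float:
    """One plausible way (an assumption, not the paper's metric) to place a
    pre-trained model's accuracy on a calibrated scale:
    0.0 = scratch model (architecture alone, no pre-training),
    1.0 = gold standard (fully supervised upper bound)."""
    return (model_acc - scratch_acc) / (gold_acc - scratch_acc)

# Usage with illustrative numbers (not results from the paper):
# blind = blind_guess_accuracy(y_train, y_test)  # e.g. ~0.10 on a balanced 10-class set
# calibrated_score(model_acc=0.82, scratch_acc=0.55, gold_acc=0.90)  # -> ~0.77
```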
