Robust early-learning: Hindering the memorization of noisy labels

The \textit{memorization effects} of deep networks show that they first memorize training data with clean labels and only later those with noisy labels. \textit{Early stopping} can therefore be exploited for learning with noisy labels. However, even before early stopping, the side effect of noisy labels impairs the memorization of clean labels. In this paper, motivated by the \textit{lottery ticket hypothesis}, which shows that only a subset of the parameters is important for generalization, we find that only a subset of the parameters is important for fitting clean labels and generalizes well, which we term \textit{critical parameters}; the remaining parameters tend to fit noisy labels and generalize poorly, which we term \textit{non-critical parameters}. Based on this, we propose \textit{robust early-learning} to reduce the side effect of noisy labels before early stopping and thus enhance the memorization of clean labels. Specifically, in each iteration, we divide all parameters into critical and non-critical ones and then apply different update rules to each type. Extensive experiments on benchmark-simulated and real-world label-noise datasets demonstrate the superiority of the proposed method over state-of-the-art label-noise learning methods.
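The sketch below illustrates the kind of parameter-splitting update the abstract describes: in each iteration, parameters are ranked by an importance score, the critical ones receive the usual gradient step, and the non-critical ones are only shrunk by weight decay. This is a minimal illustration, not the paper's released implementation; the importance criterion |w * grad|, the keep ratio, and the weight-decay-only update for non-critical parameters are assumptions made for the example.

```python
import torch

def robust_early_learning_step(model, loss, lr=0.01, weight_decay=1e-4, keep_ratio=0.5):
    """One illustrative update: critical entries get the full SGD step,
    non-critical entries receive only weight decay (assumed rule)."""
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            # Element-wise importance score (assumed criterion): |w * grad|.
            score = (p * p.grad).abs()
            k = max(1, int(keep_ratio * score.numel()))
            # Threshold so that the top-k scoring entries count as critical.
            threshold = score.flatten().kthvalue(score.numel() - k + 1).values
            critical = score >= threshold
            # Critical entries: standard SGD step with weight decay.
            p[critical] -= lr * (p.grad[critical] + weight_decay * p[critical])
            # Non-critical entries: no gradient step, weight decay only.
            p[~critical] -= lr * weight_decay * p[~critical]
            p.grad.zero_()
```

In use, this step would replace the optimizer step for each mini-batch: compute the loss, call the function, and repeat, so that non-critical parameters are progressively damped rather than driven to fit noisy labels.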


Results from the Paper


Ranked #32 on Image Classification on mini WebVision 1.0 (ImageNet Top-1 Accuracy metric)

Task: Image Classification
Dataset: mini WebVision 1.0
Model: CDR (Inception-ResNet-v2)
Metric: ImageNet Top-1 Accuracy
Metric Value: 61.85
Global Rank: #32

Methods