Early-Stopping for Meta-Learning: Estimating Generalization from the Activation Dynamics

29 Sep 2021 · Simon Guiroy, Christopher Pal, Sarath Chandar

Early stopping, a fundamental element of machine learning practice, aims to halt the training of a model when it reaches optimal generalization to unseen examples, just before it begins to overfit the training data. Meta-learning algorithms for few-shot learning aim to train neural networks that can adapt to novel tasks from only a few labelled examples and still generalize well. However, current early-stopping practices in meta-learning are problematic, since there may be an arbitrarily large distributional shift between the meta-validation set, which is drawn from the training data, and the meta-test set. This is even more critical in few-shot transfer learning, where the meta-test set comes from a different target dataset. To address this, we empirically show that as meta-training progresses, a model's generalization to a target distribution of novel tasks can be estimated by analysing the dynamics of its neural activations. We propose a method for estimating the optimal early-stopping time from the neural activation dynamics of just a few unlabelled support examples from the target distribution, and we demonstrate its performance with various meta-learning algorithms, few-shot datasets, and transfer regimes.
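The abstract does not spell out the activation statistic the authors use, but the general idea of selecting a stopping checkpoint from the activations of a few unlabelled target-distribution examples can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual criterion: it assumes a hypothetical score based on the mean pairwise cosine similarity of support-example activations at each checkpoint, and picks the checkpoint where that score peaks.

```python
import numpy as np

def select_early_stopping_epoch(activations):
    """Pick a stopping checkpoint from unlabelled support activations.

    activations: list of (n_examples, n_features) arrays, one per
    meta-training checkpoint, holding the network's activations for a
    few unlabelled support examples from the target distribution.

    Hypothetical criterion (an assumption, not the paper's method):
    score each checkpoint by the mean pairwise cosine similarity among
    the support examples' activations, and return the argmax.
    """
    scores = []
    for acts in activations:
        # Normalize each example's activation vector to unit length.
        normed = acts / np.linalg.norm(acts, axis=1, keepdims=True)
        sim = normed @ normed.T
        n = sim.shape[0]
        # Average over off-diagonal entries only (exclude self-similarity).
        scores.append((sim.sum() - n) / (n * (n - 1)))
    return int(np.argmax(scores))
```

In practice one would log activations for the same small batch of unlabelled target examples at each meta-training checkpoint, then run a selection rule like this offline, in place of validating on a (possibly shifted) meta-validation set.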
