Flatten Long-Range Loss Landscapes for Cross-Domain Few-Shot Learning

1 Mar 2024  ·  Yixiong Zou, Yicong Liu, Yiman Hu, Yuhua Li, Ruixuan Li ·

Cross-domain few-shot learning (CDFSL) aims to acquire knowledge from limited training data in the target domain by leveraging prior knowledge transferred from source domains with abundant training samples. CDFSL faces challenges in transferring knowledge across dissimilar domains and fine-tuning models with limited training data. To address these challenges, we initially extend the analysis of loss landscapes from the parameter space to the representation space, which allows us to simultaneously interpret the transferring and fine-tuning difficulties of CDFSL models. We observe that sharp minima in the loss landscapes of the representation space result in representations that are hard to transfer and fine-tune. Moreover, existing flatness-based methods have limited generalization ability due to their short-range flatness. To enhance the transferability and facilitate fine-tuning, we introduce a simple yet effective approach to achieve long-range flattening of the minima in the loss landscape. This approach considers representations that are differently normalized as minima in the loss landscape and flattens the high-loss region in the middle by randomly sampling interpolated representations. We implement this method as a new normalization layer that replaces the original one in both CNNs and ViTs. This layer is simple and lightweight, introducing only a minimal number of additional parameters. Experimental results on 8 datasets demonstrate that our approach outperforms state-of-the-art methods in terms of average accuracy. Moreover, our method achieves performance improvements of up to 9\% compared to the current best approaches on individual datasets. Our code will be released.

PDF Abstract

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here