Where is the bottleneck in long-tailed classification?

29 Sep 2021 · Zaid Khan, Yun Fu

A commonly held belief in deep-learning-based long-tailed classification is that the representations learned from long-tailed data are "good enough" and that the performance bottleneck is the classification head atop the representation learner. We design experiments to investigate this folk wisdom, and find that representations learned from long-tailed data distributions differ substantially from the representations learned from "normal" data distributions. We show that the long-tailed representations are volatile and brittle with respect to the true data distribution. Compared to the representations learned from the true, balanced distributions, long-tailed representations fail to localize tail classes and display vastly worse inter-class separation and intra-class compactness when unseen samples from the true data distribution are embedded into the feature space. We provide an explanation for why data augmentation helps long-tailed classification despite leaving the dataset imbalance unchanged: it promotes inter-class separation and intra-class compactness, and improves localization of tail classes with respect to the true data distribution.
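To make the two geometric quantities named in the abstract concrete, the following is a minimal sketch of one common centroid-based way to measure them on a set of embedded samples. The abstract does not state which metrics the authors use, so the definitions here (mean distance to the class centroid for compactness, mean pairwise centroid distance for separation) and the function names are illustrative assumptions, not the paper's method.

```python
import numpy as np


def class_centroids(features, labels):
    """Return the sorted unique class labels and the mean embedding of each class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids


def intra_class_compactness(features, labels):
    """Assumed definition: mean Euclidean distance from each sample to its own
    class centroid. Lower values mean tighter (more compact) classes."""
    classes, centroids = class_centroids(features, labels)
    # Map every label to the row index of its centroid (classes is sorted).
    index = np.searchsorted(classes, labels)
    return float(np.linalg.norm(features - centroids[index], axis=1).mean())


def inter_class_separation(features, labels):
    """Assumed definition: mean Euclidean distance between all pairs of class
    centroids. Higher values mean better-separated classes."""
    _, centroids = class_centroids(features, labels)
    diffs = centroids[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Use only the strict upper triangle so each pair is counted once.
    upper = np.triu_indices(len(centroids), k=1)
    return float(dists[upper].mean())


# Toy embeddings: two classes, two samples each, in a 2-D feature space.
toy_features = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
toy_labels = np.array([0, 0, 1, 1])
compactness = intra_class_compactness(toy_features, toy_labels)
separation = inter_class_separation(toy_features, toy_labels)
```

Under these assumed definitions, one would embed held-out samples from the balanced distribution with each representation learner and compare the two numbers: a representation that localizes tail classes poorly would be expected to show a larger compactness value and a smaller separation value.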
