ImbaGCD: Imbalanced Generalized Category Discovery

4 Dec 2023  ·  Ziyun Li, Ben Dai, Furkan Simsek, Christoph Meinel, Haojin Yang

Generalized category discovery (GCD) aims to infer known and unknown categories in an unlabeled dataset by leveraging prior knowledge from a labeled set comprising known classes. Existing research implicitly or explicitly assumes that each category, whether known or unknown, occurs with approximately the same frequency in the unlabeled data. However, in nature, we are more likely to encounter known/common classes than unknown/uncommon ones, according to the long-tailed property of visual classes. Therefore, we present a challenging and practical problem, Imbalanced Generalized Category Discovery (ImbaGCD), where the distribution of unlabeled data is imbalanced, with known classes being more frequent than unknown ones. To address this problem, we propose ImbaGCD, a novel optimal transport-based expectation maximization framework that accomplishes generalized category discovery by aligning the marginal class prior distribution. ImbaGCD also incorporates a systematic mechanism for estimating the imbalanced class prior distribution under the GCD setup. Our comprehensive experiments reveal that ImbaGCD surpasses previous state-of-the-art GCD methods, achieving an improvement of approximately 2-4% on CIFAR-100 and 15-19% on ImageNet-100, indicating its superior effectiveness in solving the Imbalanced GCD problem.
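
To make the idea of aligning assignments with a marginal class prior concrete, the following is a minimal sketch of entropy-regularized optimal transport (Sinkhorn-Knopp iterations) used for soft pseudo-labeling. It is not the authors' implementation: the function name `sinkhorn_assign`, the parameters `epsilon` and `n_iters`, the random logits, and the example prior are all illustrative assumptions, and the prior is taken as given here rather than estimated as the abstract describes.

```python
import numpy as np


def sinkhorn_assign(logits, class_prior, epsilon=0.5, n_iters=500):
    """Soft-assign N samples to K classes so class masses match class_prior.

    Hypothetical helper for illustration only. `logits` is an (N, K) array
    of model scores; `class_prior` is a length-K vector of class proportions.
    """
    n, _ = logits.shape
    # Subtract the row-wise max for numerical stability; the row scaling
    # this introduces is absorbed by the row normalization below.
    scores = logits - logits.max(axis=1, keepdims=True)
    q = np.exp(scores / epsilon)

    row_target = np.full(n, 1.0 / n)  # every sample carries equal mass
    col_target = np.asarray(class_prior, dtype=float)
    col_target = col_target / col_target.sum()  # normalize the prior

    for _ in range(n_iters):
        # Scale columns so each class receives its prior mass.
        q *= (col_target / q.sum(axis=0))[None, :]
        # Scale rows so each sample keeps total mass 1/N.
        q *= (row_target / q.sum(axis=1))[:, None]

    return q * n  # rescale so each row is a distribution over classes


# Assumed toy setup: two frequent (known) and two rare (novel) classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(200, 4))
prior = np.array([0.4, 0.4, 0.1, 0.1])
q = sinkhorn_assign(logits, prior)
pseudo_labels = q.argmax(axis=1)
```

In an expectation-maximization style loop, soft assignments such as `q` could serve as the E-step targets, with the model then updated on them in the M-step. With a uniform `class_prior` the same routine reduces to the balanced assignment that the abstract says existing GCD work assumes.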

