Extreme Multi-Label Classification
29 papers with code • 0 benchmarks • 2 datasets
Extreme Multi-Label Classification is a supervised learning problem in which each instance may be associated with multiple labels drawn from a very large label set. The two main challenges are label imbalance (a few labels are frequent while most are rare) and the sheer number of distinct labels.
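To make the two challenges concrete, here is a minimal sketch that stores each instance's labels as a set and counts how often each label occurs. The function names (`label_frequencies`, `tail_labels`), the `max_count` threshold, and the toy data are illustrative assumptions, not part of any library or benchmark listed here.

```python
from collections import Counter


def label_frequencies(label_sets):
    """Count how many instances carry each label."""
    counts = Counter()
    for labels in label_sets:
        # set() guards against a label listed twice for one instance
        counts.update(set(labels))
    return counts


def tail_labels(label_sets, max_count=1):
    """Return labels seen in at most `max_count` instances (hypothetical cutoff)."""
    counts = label_frequencies(label_sets)
    return sorted(label for label, c in counts.items() if c <= max_count)


# Toy data (made up): each instance has a small subset of the label space.
toy = [
    {"physics", "science"},
    {"science", "biology"},
    {"science"},
    {"science", "quantum_optics"},
]

freqs = label_frequencies(toy)
rare = tail_labels(toy, max_count=1)
print(freqs.most_common(1))  # the single most frequent (head) label
print(rare)                  # labels that occur only once (tail)
```

Even in this tiny example one label covers every instance while the rest appear once each. Real extreme-classification datasets show the same skew over far larger label sets.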
Libraries
Use these libraries to find Extreme Multi-Label Classification models and implementations.
Most implemented papers
Adaptive Elastic Training for Sparse Deep Learning on Heterogeneous Multi-GPU Servers
We address these challenges with Adaptive SGD, an adaptive elastic model averaging stochastic gradient descent algorithm for heterogeneous multi-GPUs that is characterized by dynamic scheduling, adaptive batch size scaling, and normalized model merging.
Propensity-scored Probabilistic Label Trees
Extreme multi-label classification (XMLC) refers to the task of tagging instances with small subsets of relevant labels coming from an extremely large set of all possible labels.
ELIAS: End-to-End Learning to Index and Search in Large Output Spaces
A popular approach for dealing with the large label space is to arrange the labels into a shallow tree-based index and then learn an ML model to efficiently search this index via beam search.
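The general idea of searching a shallow label tree with beam search can be sketched as follows. This is a generic two-level illustration, not the ELIAS method itself: the function `beam_search_labels`, the multiplicative scoring rule, and all scores and cluster names are assumptions made up for the example. In practice a learned model would produce the scores.

```python
import heapq


def beam_search_labels(cluster_scores, label_scores, clusters,
                       beam_width=2, top_k=3):
    """Two-level beam search: pick the best clusters, then rank their labels.

    cluster_scores: dict cluster -> relevance score for the query
    label_scores:   dict label -> relevance score for the query
    clusters:       dict cluster -> list of labels in that cluster
    """
    # Level 1: keep only the beam_width highest-scoring clusters.
    beam = heapq.nlargest(beam_width, cluster_scores, key=cluster_scores.get)

    # Level 2: score only the labels inside the surviving clusters.
    candidates = []
    for cluster in beam:
        for label in clusters[cluster]:
            # Assumed scoring rule: product of cluster and label scores.
            score = cluster_scores[cluster] * label_scores[label]
            candidates.append((score, label))

    # Highest score first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [label for _, label in candidates[:top_k]]


# Toy index (made up): three clusters of two labels each.
clusters = {
    "animals": ["cat", "dog"],
    "vehicles": ["car", "bike"],
    "food": ["pizza", "sushi"],
}
cluster_scores = {"animals": 0.7, "vehicles": 0.2, "food": 0.1}
label_scores = {"cat": 0.9, "dog": 0.4, "car": 0.8,
                "bike": 0.1, "pizza": 0.9, "sushi": 0.9}

result = beam_search_labels(cluster_scores, label_scores, clusters)
print(result)
```

With `beam_width=2`, labels in the "food" cluster are never scored, which is where the efficiency comes from: only a fraction of the label space is evaluated per query. The trade-off is that a relevant label in a pruned cluster cannot be recovered.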
Cluster-Guided Label Generation in Extreme Multi-Label Classification
For extreme multi-label classification (XMC), existing classification-based models perform poorly on tail labels and often ignore the semantic relations among labels, for example by treating "Wikipedia" and "Wiki" as independent and separate labels.
PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation
Unlike most existing XMC frameworks that treat labels and input instances as featureless indicators and independent entries, PINA extracts information from the label metadata and the correlations among training instances.
MDACE: MIMIC Documents Annotated with Code Evidence
In this paper, we introduce MDACE, the first publicly available code evidence dataset, which is built on a subset of the MIMIC-III clinical records.
Dual-Encoders for Extreme Multi-Label Classification
We propose decoupled softmax loss - a simple modification to the InfoNCE loss - that overcomes the limitations of existing contrastive losses.
Dense Retrieval as Indirect Supervision for Large-space Decision Making
Many discriminative natural language understanding (NLU) tasks have large label spaces.
ICXML: An In-Context Learning Framework for Zero-Shot Extreme Multi-Label Classification
This paper focuses on the task of Extreme Multi-Label Classification (XMC) whose goal is to predict multiple labels for each instance from an extremely large label space.