Extreme Multi-Label Classification

29 papers with code • 0 benchmarks • 2 datasets

Extreme Multi-Label Classification is a supervised learning problem in which each instance may be associated with multiple labels drawn from an extremely large label set. The two main challenges are label imbalance (most labels have very few training examples) and the sheer number of distinct labels.
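The two difficulties named above, label imbalance and label-space size, can be made concrete with a toy example. The sketch below is illustrative only: the instances, label ids, and the label-space size of one million are all assumed values, not taken from any dataset listed here. It stores each instance's labels as a sparse set and counts how often each label occurs.

```python
from collections import Counter

# Toy data (hypothetical): each instance is tagged with a small set of label ids.
instance_labels = [
    {0, 3},
    {0},
    {0, 7, 9},
    {3},
    {0, 3},
]
num_labels = 1_000_000  # assumed size of the full label space

# Keeping label sets sparse avoids a dense instances-by-labels matrix,
# which would be almost entirely zeros at this scale.
label_counts = Counter(label for labels in instance_labels for label in labels)

avg_labels_per_instance = sum(len(s) for s in instance_labels) / len(instance_labels)

# "Tail" labels here are those seen exactly once in the training data.
tail_labels = sorted(label for label, count in label_counts.items() if count == 1)

# Labels that never appear in the training data at all.
unseen_labels = num_labels - len(label_counts)

print(label_counts.most_common(2), tail_labels, unseen_labels)
```

Even in this tiny example one label dominates while others appear once or never, which is the imbalance the task description refers to.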

Most implemented papers

Adaptive Elastic Training for Sparse Deep Learning on Heterogeneous Multi-GPU Servers

YMA33/HeteroGPU 13 Oct 2021

We address these challenges with Adaptive SGD, an adaptive elastic model averaging stochastic gradient descent algorithm for heterogeneous multi-GPUs that is characterized by dynamic scheduling, adaptive batch size scaling, and normalized model merging.

Propensity-scored Probabilistic Label Trees

mwydmuch/napkinXC 20 Oct 2021

Extreme multi-label classification (XMLC) refers to the task of tagging instances with small subsets of relevant labels coming from an extremely large set of all possible labels.

ELIAS: End-to-End Learning to Index and Search in Large Output Spaces

nilesh2797/elias 16 Oct 2022

A popular approach for dealing with the large label space is to arrange the labels into a shallow tree-based index and then learn an ML model to efficiently search this index via beam search.

Cluster-Guided Label Generation in Extreme Multi-Label Classification

alexa/xlgen-eacl-2023 17 Feb 2023

For extreme multi-label classification (XMC), existing classification-based models perform poorly for tail labels and often ignore the semantic relations among labels, for example treating "Wikipedia" and "Wiki" as independent and separate labels.

PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation

amzn/pecos 21 May 2023

Unlike most existing XMC frameworks that treat labels and input instances as featureless indicators and independent entries, PINA extracts information from the label metadata and the correlations among training instances.

MDACE: MIMIC Documents Annotated with Code Evidence

3mcloud/MDACE ACL 2023

In this paper, we introduce MDACE, the first publicly available code evidence dataset, which is built on a subset of the MIMIC-III clinical records.

Dual-Encoders for Extreme Multi-Label Classification

nilesh2797/dexml 16 Oct 2023

We propose decoupled softmax loss - a simple modification to the InfoNCE loss - that overcomes the limitations of existing contrastive losses.
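The excerpt does not spell out the modification, so the following is only a sketch under a stated assumption: that "decoupled" means each positive label is normalised against itself and the negatives, with the other positives left out of the softmax denominator. The function names and the toy scores are hypothetical and are not the authors' implementation.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def info_nce_loss(scores, positives):
    """Mean over positives of -log softmax, normalised over ALL labels."""
    log_z = log_sum_exp(scores)
    return sum(log_z - scores[p] for p in positives) / len(positives)

def decoupled_softmax_loss(scores, positives):
    """Assumed variant: the denominator holds one positive plus the negatives only."""
    positive_set = set(positives)
    negatives = [s for i, s in enumerate(scores) if i not in positive_set]
    per_positive = [
        log_sum_exp([scores[p]] + negatives) - scores[p] for p in positives
    ]
    return sum(per_positive) / len(positives)

# Two positives scored equally high, two negatives scored low (toy values).
scores = [5.0, 5.0, 0.0, 0.0]
positives = [0, 1]
print(info_nce_loss(scores, positives), decoupled_softmax_loss(scores, positives))
```

In this toy case the standard loss cannot fall below log 2, because the two positives compete with each other in the denominator, while the assumed decoupled form approaches zero as the negatives are pushed down. With a single positive the two losses coincide.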

Dense Retrieval as Indirect Supervision for Large-space Decision Making

luka-group/ddr 28 Oct 2023

Many discriminative natural language understanding (NLU) tasks have large label spaces.

ICXML: An In-Context Learning Framework for Zero-Shot Extreme Multi-Label Classification

yaxinzhuars/icxml 16 Nov 2023

This paper focuses on the task of Extreme Multi-Label Classification (XMC), whose goal is to predict multiple labels for each instance from an extremely large label space.