Text Classification

1105 papers with code • 93 benchmarks • 136 datasets

Text Classification is the task of assigning an appropriate category to a sentence or document. The categories depend on the chosen dataset and can be, for example, topics.

Text Classification problems include emotion classification, news classification, and citation intent classification, among others. Benchmark datasets for evaluating text classification capabilities include GLUE and AGNews, among others.

In recent years, deep learning models such as XLNet and RoBERTa have produced some of the largest performance gains on text classification problems.
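As a rough illustration of what "assigning a category to a document" means in practice, here is a minimal sketch of a bag-of-words Naive Bayes classifier written with the Python standard library only. It is not one of the deep learning models mentioned above, and it is not taken from any paper listed on this page. The class name, the helper function, and the four training sentences with their "sports" and "tech" labels are all made up for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical helper: lowercase and split on whitespace.
    # Real systems typically use a more careful tokenizer.
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)      # documents per category
        self.word_counts = defaultdict(Counter)  # word counts per category
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        tokens = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Start from the log prior of the category.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # skip words never seen during training
                count = self.word_counts[label][token]
                # Smoothed log likelihood of the word given the category.
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this example (not a real benchmark).
train_texts = [
    "the team won the match",
    "the striker scored a late goal",
    "the new phone has a faster chip",
    "the laptop ships with more memory",
]
train_labels = ["sports", "sports", "tech", "tech"]

classifier = NaiveBayesClassifier().fit(train_texts, train_labels)
prediction = classifier.predict("the goal decided the match")
print(prediction)
```

The set of categories comes entirely from the training labels, which mirrors the point that the categories depend on the chosen dataset. A benchmark such as AGNews would supply its own label set and far more training text.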

(Image credit: Text Classification Algorithms: A Survey)

Latest papers with no code

VertAttack: Taking advantage of Text Classifiers' horizontal vision

no code yet • 12 Apr 2024

In contrast, humans are easily able to recognize and read words written both horizontally and vertically.

Exploring Contrastive Learning for Long-Tailed Multi-Label Text Classification

no code yet • 12 Apr 2024

In this paper, we conduct an in-depth study of supervised contrastive learning and its influence on representation in the multi-label text classification (MLTC) context.

Interactive Prompt Debugging with Sequence Salience

no code yet • 11 Apr 2024

We present Sequence Salience, a visual tool for interactive prompt debugging with input salience methods.

Semantic Stealth: Adversarial Text Attacks on NLP Using Several Methods

no code yet • 8 Apr 2024

In various real-world applications such as machine translation, sentiment analysis, and question answering, a pivotal role is played by NLP models, facilitating efficient communication and decision-making processes in domains ranging from healthcare to finance.

Text clustering applied to data augmentation in legal contexts

no code yet • 8 Apr 2024

Data analysis and machine learning are of preeminent importance in the legal domain, especially in tasks like clustering and text classification.

Adversarial Attacks and Dimensionality in Text Classifiers

no code yet • 3 Apr 2024

For all of the aforementioned studies, we have run tests on multiple models with varying dimensionality and used a word-vector level adversarial attack to substantiate the findings.

Enhancing Low-Resource LLMs Classification with PEFT and Synthetic Data

no code yet • 3 Apr 2024

Large Language Models (LLMs) operating in 0-shot or few-shot settings achieve competitive results in Text Classification tasks.

Ukrainian Texts Classification: Exploration of Cross-lingual Knowledge Transfer Approaches

no code yet • 2 Apr 2024

Despite the extensive amount of labeled datasets in the NLP text classification field, the persistent imbalance in data availability across various languages remains evident.

AISPACE at SemEval-2024 task 8: A Class-balanced Soft-voting System for Detecting Multi-generator Machine-generated Text

no code yet • 1 Apr 2024

SemEval-2024 Task 8 provides a challenge to detect human-written and machine-generated text.

Shortcuts Arising from Contrast: Effective and Covert Clean-Label Attacks in Prompt-Based Learning

no code yet • 30 Mar 2024

In addressing this issue, we are inspired by the notion that a backdoor acts as a shortcut and posit that this shortcut stems from the contrast between the trigger and the data utilized for poisoning.