Text Classification
1090 papers with code • 150 benchmarks • 147 datasets
Text Classification is the task of assigning an appropriate category to a sentence or document. The categories depend on the chosen dataset and may be, for example, topics.
Text Classification problems include emotion classification, news classification, and citation intent classification, among others. Benchmark datasets for evaluating text classification capabilities include GLUE and AGNews, among others.
In recent years, deep learning models such as XLNet and RoBERTa have produced some of the largest performance gains on text classification problems.
(Image credit: Text Classification Algorithms: A Survey)
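As a concrete illustration of assigning a category to a sentence, the following is a minimal sketch in Python. It does not use the deep learning models named above. It swaps in a much simpler technique, multinomial Naive Bayes with add-one smoothing, so that it runs with the standard library alone. The class name, the whitespace tokenizer, the toy training sentences, and the "sports" and "tech" labels are all made up for illustration and do not come from any benchmark listed on this page.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Hypothetical helper: multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior of the class.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                # Add-one smoothing keeps unseen tokens from zeroing the score.
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data (assumption: two classes, four sentences).
train_texts = [
    "the team won the match",
    "a late goal decided the game",
    "the new phone has a faster chip",
    "the laptop ships with more memory",
]
train_labels = ["sports", "sports", "tech", "tech"]

clf = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = clf.predict("the match ended with a late goal")
print(prediction)
```

The set of categories here is fixed entirely by the training labels, which mirrors the point that the categories depend on the chosen dataset.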
Subtasks
- Topic Models
- Document Classification
- Sentence Classification
- Emotion Classification
- Multi-Label Text Classification
- Text Categorization
- Few-Shot Text Classification
- Semi-Supervised Text Classification
- Coherence Evaluation
- Toxic Comment Classification
- Citation Intent Classification
- Cross-Domain Text Classification
- Unsupervised Text Classification
- Satire Detection
- Hierarchical Text Classification of Blurbs (GermEval 2019)
- Variable Detection
Latest papers with no code
Language Models for Text Classification: Is In-Context Learning Enough?
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
LARA: Linguistic-Adaptive Retrieval-Augmented LLMs for Multi-Turn Intent Classification
Following the significant achievements of large language models (LLMs), researchers have employed in-context learning for text classification tasks.
On the Fragility of Active Learners
The impact of this study is in its insights for a practitioner: (a) the choice of text representation and classifier is as important as that of an AL technique, (b) choice of the right metric is critical in assessment of the latter, and, finally, (c) reported AL results must be holistically interpreted, accounting for variables other than just the query strategy.
VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding
The success of Natural Language Understanding (NLU) benchmarks in various languages, such as GLUE for English, CLUE for Chinese, KLUE for Korean, and IndoNLU for Indonesian, has facilitated the evaluation of new NLU models across a wide range of tasks.
MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection
This paper presents the MasonTigers entry to the SemEval-2024 Task 8 - Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection.
Multi-Level Explanations for Generative Language Models
To address the challenges of text as output and long text inputs, we propose a general framework called MExGen that can be instantiated with different attribution algorithms.
Visual Analytics for Fine-grained Text Classification Models and Datasets
As a consequence, the semantic structures of datasets have become more complex, and model decisions more difficult to explain.
Vi-Mistral-X: Building a Vietnamese Language Model with Advanced Continual Pre-training
To address this issue, this paper presents vi-mistral-x, an innovative Large Language Model designed expressly for the Vietnamese language.
Simple Hack for Transformers against Heavy Long-Text Classification on a Time- and Memory-Limited GPU Service
Using the best hack found, we then compare 512, 256, and 128 tokens length.
CrossTune: Black-Box Few-Shot Classification with Label Enhancement
Training or finetuning large-scale language models (LLMs) requires substantial computation resources, motivating recent efforts to explore parameter-efficient adaptation to downstream tasks.