Text Categorization
41 papers with code • 0 benchmarks • 6 datasets
Text Categorization is the task of automatically assigning predefined categories to documents written in natural language. Several variants have been studied, each dealing with different kinds of documents and categories: topic categorization, which detects the topics discussed (e.g., sports, politics); spam detection; and sentiment classification, which determines the sentiment expressed, typically in product or movie reviews.
Source: Effective Use of Word Order for Text Categorization with Convolutional Neural Networks
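To make the task definition concrete, here is a minimal sketch of topic categorization using a bag-of-words multinomial Naive Bayes classifier with add-one smoothing. It is an illustration only, not the method of the source paper (which uses convolutional neural networks). The function names (tokenize, train, predict), the whitespace tokenizer, and the four toy training documents are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Deliberately crude: lowercase and split on whitespace.
    return text.lower().split()


def train(labeled_docs):
    """Count documents per category and word occurrences per category."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return doc_counts, word_counts, vocab


def predict(text, model):
    """Return the category with the highest log-posterior score."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # Log prior: fraction of training documents in this category.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed word likelihood.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training set (hypothetical data, for illustration only).
training_data = [
    ("the team won the match", "sports"),
    ("a late goal decided the game", "sports"),
    ("the senate passed the budget", "politics"),
    ("voters elected a new president", "politics"),
]
model = train(training_data)

print(predict("the team scored a goal", model))           # sports
print(predict("the president signed the budget", model))  # politics
```

The same structure carries over to spam detection or sentiment classification by changing the category labels and training documents; only the data differs, not the procedure.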
Latest papers
A Model Ensemble Approach with LLM for Chinese Text Classification
Automatic medical text categorization can assist doctors in efficiently managing patient information.
Beyond original Research Articles Categorization via NLP
This work proposes a novel approach to text categorization -- for unknown categories -- in the context of scientific literature, using Natural Language Processing techniques.
Quantum Recurrent Neural Networks for Sequential Learning
Quantum neural network (QNN) is one of the promising directions where the near-term noisy intermediate-scale quantum (NISQ) devices could find advantageous applications against classical resources.
Improving Pre-Trained Weights Through Meta-Heuristics Fine-Tuning
Machine Learning algorithms have been extensively researched throughout the last decade, leading to unprecedented advances in a broad range of applications, such as image classification and reconstruction, object recognition, and text categorization.
Very Large Language Model as a Unified Methodology of Text Mining
Text data mining is the process of deriving essential information from language text.
Text Ranking and Classification using Data Compression
A well-known but rarely used approach to text categorization uses conditional entropy estimates computed using data compression tools.
Clustering Word Embeddings with Self-Organizing Maps. Application on LaRoSeDa -- A Large Romanian Sentiment Data Set
Romanian is one of the understudied languages in computational linguistics, with few resources available for the development of natural language processing tools.
NatCat: Weakly Supervised Text Classification with Naturally Annotated Resources
We describe NatCat, a large-scale resource for text classification constructed from three data sources: Wikipedia, Stack Exchange, and Reddit.
SeMemNN: A Semantic Matrix-Based Memory Neural Network for Text Classification
Text categorization is the task of assigning labels to documents written in a natural language, and it has numerous real-world applications including sentiment analysis as well as traditional topic assignment tasks.
PySS3: A Python package implementing a novel text classifier with visualization tools for Explainable AI
A recently introduced text classifier, called SS3, has obtained state-of-the-art performance on the CLEF's eRisk tasks.
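One entry in the list above describes text categorization based on conditional entropy estimates computed with data compression tools. The general idea can be sketched as follows: estimate how many extra bytes a compressor needs to encode a document when it is appended to each category's training text, and pick the category with the smallest increase. This is a rough sketch under stated assumptions, not the listed paper's implementation: zlib stands in for whatever compression tools the paper uses, and the function names and toy corpora are hypothetical.

```python
import zlib


def compressed_size(text):
    # Size in bytes of the zlib-compressed UTF-8 encoding of the text.
    return len(zlib.compress(text.encode("utf-8"), 9))


def classify(doc, corpora):
    """Pick the category whose training text makes the document cheapest to encode.

    The extra bytes needed to compress the corpus plus the document, compared
    with the corpus alone, serve as a crude conditional entropy estimate.
    """
    costs = {}
    for label, corpus in corpora.items():
        costs[label] = compressed_size(corpus + " " + doc) - compressed_size(corpus)
    return min(costs, key=costs.get)


# Toy per-category training text (hypothetical data, for illustration only).
corpora = {
    "sports": "the team won the match after scoring a late goal "
              "in the final minute of the game",
    "politics": "the senate passed the budget bill after a long debate "
                "on tax policy and spending",
}

print(classify("scoring a late goal in the final minute", corpora))
print(classify("a long debate on tax policy and spending", corpora))
```

With corpora this small the estimates are very noisy; the example documents are chosen to overlap heavily with one category so that the difference in compressed size is unambiguous.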