Text Categorization
41 papers with code • 0 benchmarks • 6 datasets
Text Categorization is the task of automatically assigning pre-defined categories to documents written in natural languages. Several types of Text Categorization have been studied, each of which deals with different types of documents and categories, such as topic categorization to detect discussed topics (e.g., sports, politics), spam detection, and sentiment classification to determine the sentiment typically in product or movie reviews.
Source: Effective Use of Word Order for Text Categorization with Convolutional Neural Networks
Benchmarks
These leaderboards are used to track progress in Text Categorization
Libraries
Use these libraries to find Text Categorization models and implementationsLatest papers
Improving Document Classification with Multi-Sense Embeddings
Through extensive experiments on multiple real-world datasets, we show that SCDV-MS embeddings outperform previous state-of-the-art embeddings on multi-class and multi-label text categorization tasks.
t-SS3: a text classifier with dynamic n-grams for early risk detection over text streams
SS3 was created to deal with ERD problems naturally since: it supports incremental training and classification over text streams, and it can visually explain its rationale.
Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks
LEOPARD is trained with the state-of-the-art transformer architecture and shows better generalization to tasks not seen at all during training, with as few as 4 examples per label.
Ensemble Quantile Classifier
It is also shown that the ensemble quantile classifier is Bayes optimal under suitable assumptions with asymmetric Laplace distribution inputs.
Text Categorization by Learning Predominant Sense of Words as Auxiliary Task
Distributions of the senses of words are often highly skewed and give a strong influence of the domain of a document.
Leap-LSTM: Enhancing Long Short-Term Memory for Text Categorization
Compared to previous models which can also skip words, our model achieves better trade-offs between performance and efficiency.
Rep the Set: Neural Networks for Learning Set Representations
In several domains, data objects can be decomposed into sets of simpler objects.
Learning Graph Pooling and Hybrid Convolutional Operations for Text Representations
Another limitation of GCN when used on graph-based text representation tasks is that, GCNs do not consider the order information of nodes in graph.
Structure-Aware Convolutional Neural Networks
Convolutional neural networks (CNNs) are inherently subject to invariable filters that can only aggregate local inputs with the same topological structures.
HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization
The lower the HS level, the less the categorization performance.