Text Categorization

41 papers with code • 0 benchmarks • 6 datasets

Text Categorization is the task of automatically assigning pre-defined categories to documents written in natural languages. Several types of Text Categorization have been studied, each of which deals with different types of documents and categories, such as topic categorization to detect discussed topics (e.g., sports, politics), spam detection, and sentiment classification to determine the sentiment typically in product or movie reviews.

Source: Effective Use of Word Order for Text Categorization with Convolutional Neural Networks
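
As a rough illustration of the task defined above, the following minimal sketch assigns one of two pre-defined categories to a short document using a multinomial Naive Bayes model with add-one smoothing. The toy documents, the category names, and the helper names (`tokenize`, `train`, `categorize`) are assumptions made up for this example. They do not come from the cited paper, which uses convolutional neural networks instead.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(labeled_docs):
    """Count documents per category and words per category."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, category in labeled_docs:
        doc_counts[category] += 1
        for token in tokenize(text):
            word_counts[category][token] += 1
            vocab.add(token)
    return doc_counts, word_counts, vocab


def categorize(model, text):
    """Return the category with the highest smoothed log-probability."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, float("-inf")
    for category in doc_counts:
        # Log prior of the category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed log likelihood of the token.
            count = word_counts[category][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical toy training set, invented for illustration only.
TOY_DOCS = [
    ("the team won the match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("parliament passed the budget vote", "politics"),
    ("the senator lost the election", "politics"),
]

model = train(TOY_DOCS)
print(categorize(model, "a late goal won the match"))
print(categorize(model, "the senator lost the vote"))
```

The same structure carries over to spam detection or sentiment classification by changing the category labels and the training documents.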

Libraries

Use these libraries to find Text Categorization models and implementations

sergioburdisso/pyss3

2 papers

331

Most implemented papers

Discriminating between Similar Languages using Weighted Subword Features

adbar/vardial-experiments • WS 2017

The present contribution revolves around a contrastive subword n-gram model which has been tested in the Discriminating between Similar Languages shared task.
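
To make the notion of subword n-gram features concrete, here is a minimal sketch that extracts character n-grams from each word and counts them. It only illustrates the general idea. The function names, the `_` boundary marker, and the default `n=3` are assumptions for this example, and the weighting and contrastive scheme used in the paper is not reproduced.

```python
from collections import Counter


def char_ngrams(word, n=3):
    """Return the character n-grams of a word padded with '_' boundary markers."""
    padded = "_" + word + "_"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def subword_counts(text, n=3):
    """Count character n-grams over all whitespace-separated words in a text."""
    counts = Counter()
    for word in text.lower().split():
        counts.update(char_ngrams(word, n))
    return counts


print(char_ngrams("ab", n=2))
print(subword_counts("the then", n=3).most_common(2))
```

Counts of this kind can be fed to any standard classifier as features for telling closely related languages apart.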

An Automated Text Categorization Framework based on Hyperparameter Optimization

INGEOTEC/microTC • 6 Apr 2017

The compared datasets include several problems like topic and polarity classification, spam detection, user profiling and authorship attribution.

Convex Formulation of Multiple Instance Learning from Positive and Unlabeled Bags

levelfour/pumil • 22 Apr 2017

Multiple instance learning (MIL) is a variation of traditional supervised learning problems where data (referred to as bags) are composed of sub-elements (referred to as instances) and only bag labels are available.

Authorship Attribution Using the Chaos Game Representation

catalinstoean/FCGR-LR • 14 Feb 2018

Validation results for the trained classifiers are competitive with the best methods in prior literature.

Multilingual Multi-class Sentiment Classification Using Convolutional Neural Networks

SamihYounes/senti-cnn • LREC 2018

Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification

y3nk0/Graph-Based-TC • WS 2018

Contrary to the traditional Bag-of-Words approach, we consider the Graph-of-Words (GoW) model in which each document is represented by a graph that encodes relationships between the different terms.
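
One common way to build such a graph, shown here as a minimal sketch and not as the authors' implementation, is to connect terms that co-occur within a sliding window. The function name `graph_of_words`, the undirected weighted edges, and the default window size are assumptions chosen for illustration.

```python
from collections import defaultdict


def graph_of_words(text, window=3):
    """Build an undirected co-occurrence graph from a document.

    Nodes are the distinct terms. An edge links two different terms that
    appear within `window` consecutive tokens of each other, and its weight
    counts how often that happens.
    """
    tokens = text.lower().split()
    edges = defaultdict(int)
    for i, source in enumerate(tokens):
        # Look ahead at the next (window - 1) tokens.
        for target in tokens[i + 1:i + window]:
            if source != target:
                edge = tuple(sorted((source, target)))
                edges[edge] += 1
    return dict(edges)


print(graph_of_words("a b c", window=2))
print(graph_of_words("a b c", window=3))
```

Unlike a Bag-of-Words vector, the edge set keeps some information about which terms occur near each other.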

Topic or Style? Exploring the Most Useful Features for Authorship Attribution

yunitata/coling2018 • COLING 2018

Approaches to authorship attribution, the task of identifying the author of a document, are based on analysis of individuals' writing style and/or preferred topics.

Document Informed Neural Autoregressive Topic Models

pgcool/iDocNADE • 11 Aug 2018

Context information around words helps in determining their actual meaning, for example "networks" used in contexts of artificial neural networks or biological neuron networks.

SeVeN: Augmenting Word Embeddings with Unsupervised Relation Vectors

luisespinosa/seven • COLING 2018

For example, by examining clusters of relation vectors, we observe that relational similarities can be identified at a more abstract level than with traditional word vector differences.

Using the Tsetlin Machine to Learn Human-Interpretable Rules for High-Accuracy Text Categorization with Medical Applications

cair/TextUnderstandingTsetlinMachine • 12 Sep 2018

The Tsetlin Machine either performs on par with or outperforms all of the evaluated methods on both the 20 Newsgroups and IMDb datasets, as well as on a non-public clinical dataset.
