Document Classification

206 papers with code • 19 benchmarks • 15 datasets

Document Classification is a procedure of assigning one or more labels to a document from a predetermined set of labels.

Source: Long-length Legal Document Classification

Libraries

Use these libraries to find Document Classification models and implementations

Most implemented papers

Geometric deep learning on graphs and manifolds using mixture model CNNs

dmlc/dgl CVPR 2017

Recently, there has been an increasing interest in geometric deep learning, attempting to generalize deep learning methods to non-Euclidean structured data such as graphs and manifolds, with a variety of applications from the domains of network analysis, computational social science, or computer graphics.

Learning to Skim Text

tsujuifu/pytorch_lstm-shuttle ACL 2017

Recurrent Neural Networks are showing much promise in many sub-areas of natural language processing, ranging from document classification to machine translation to automatic question answering.

MultiFiT: Efficient Multi-lingual Language Model Fine-tuning

n-waves/multifit IJCNLP 2019

Pretrained language models are promising particularly for low-resource languages as they only require unlabelled data.

Pairwise Multi-Class Document Classification for Semantic Relations between Wikipedia Articles

malteos/semantic-document-relations 22 Mar 2020

In this paper, we model the problem of finding the relationship between two documents as a pairwise document classification task.

HDLTex: Hierarchical Deep Learning for Text Classification

kk7nc/HDLTex 24 Sep 2017

This is because along with this growth in the number of documents has come an increase in the number of categories.

Combining Similarity Features and Deep Representation Learning for Stance Detection in the Context of Checking Fake News

LuisPB7/fnc-msc 2 Nov 2018

Specifically, we use bi-directional Recurrent Neural Networks, together with max-pooling over the temporal/sequential dimension and neural attention, for representing (i) the headline, (ii) the first two sentences of the news article, and (iii) the entire news article.

DocBERT: BERT for Document Classification

castorini/hedwig 17 Apr 2019

We present, to our knowledge, the first application of BERT to document classification.

Multimodal deep networks for text and image-based document classification

Quicksign/ocrized-text-dataset 15 Jul 2019

Classification of document images is a critical step for archival of old manuscripts, online subscription and administrative procedures.

Hierarchical Transformers for Long Document Classification

helmy-elrais/RoBERT_Recurrence_over_BERT 23 Oct 2019

BERT, which stands for Bidirectional Encoder Representations from Transformers, is a recently introduced language representation model based upon the transfer learning paradigm.