Document Classification

97 papers with code · Natural Language Processing
Subtask of Text Classification

Document classification is the task of assigning one or more labels to a document from a predetermined set of labels.

Source: Long-length Legal Document Classification
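
The definition above can be illustrated with a deliberately tiny sketch. The label set, the keyword lists, and the `classify` helper are all hypothetical and chosen only for illustration; real systems learn the mapping from labelled data rather than matching keywords. The point is only that the output is a subset (possibly empty, possibly several labels) of a fixed label set.

```python
# Hypothetical label set and keyword lists, for illustration only.
LABEL_KEYWORDS = {
    "legal": {"court", "contract", "plaintiff"},
    "finance": {"loan", "interest", "contract"},
    "sports": {"match", "goal", "league"},
}


def classify(document, min_hits=1):
    """Return every label whose keywords occur at least min_hits times."""
    tokens = document.lower().split()
    labels = []
    for label, keywords in LABEL_KEYWORDS.items():
        hits = sum(1 for tok in tokens if tok in keywords)
        if hits >= min_hits:
            labels.append(label)
    return labels


print(classify("The court reviewed the loan contract"))
```

A document may receive several labels (multi-label classification) or none; restricting the result to exactly one label gives the single-label case.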

Greatest papers with code

Improving Language Understanding by Generative Pre-Training

Preprint 2018 huggingface/transformers

We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task.

DOCUMENT CLASSIFICATION · LANGUAGE MODELLING · NATURAL LANGUAGE INFERENCE · NATURAL LANGUAGE UNDERSTANDING · QUESTION ANSWERING · SEMANTIC SIMILARITY · SEMANTIC TEXTUAL SIMILARITY

Graph Attention Networks

ICLR 2018 aymericdamien/TopDeepLearning

We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations.

DOCUMENT CLASSIFICATION · GRAPH EMBEDDING · GRAPH REGRESSION · LINK PREDICTION · NODE CLASSIFICATION · SKELETON BASED ACTION RECOGNITION
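
As a rough illustration of the masked self-attention mentioned in the Graph Attention Networks abstract, the sketch below computes one attention head in plain Python. It is a simplification under stated assumptions, not the authors' implementation: node features are assumed to be already linearly projected, every node is assumed to have a self-loop, and the function name `gat_layer` and its arguments are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_layer(features, adjacency, attn_vector):
    """One attention head over already-projected node features.

    features:    list of node vectors (the linear projection is assumed done)
    adjacency:   adjacency[i][j] is truthy if j is a neighbour of i;
                 self-loops are assumed to be present
    attn_vector: weights applied to the concatenation [h_i || h_j]
    """
    n = len(features)
    dim = len(features[0])
    outputs = []
    for i in range(n):
        # Masking: only neighbours of i take part in the softmax.
        neighbours = [j for j in range(n) if adjacency[i][j]]
        scores = []
        for j in neighbours:
            concat = features[i] + features[j]
            raw = sum(a * x for a, x in zip(attn_vector, concat))
            scores.append(leaky_relu(raw))
        # Numerically stable softmax over the neighbourhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of neighbour features.
        outputs.append([
            sum(alpha * features[j][d] for alpha, j in zip(alphas, neighbours))
            for d in range(dim)
        ])
    return outputs


feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
print(gat_layer(feats, adj, [0.1, 0.2, 0.3, 0.4]))
```

Because node 2 is not a neighbour of node 0 in this toy graph, changing node 2's features leaves node 0's output untouched, which is the effect of the mask.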

Semi-Supervised Classification with Graph Convolutional Networks

9 Sep 2016 tkipf/gcn

We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs.

DOCUMENT CLASSIFICATION · GRAPH CLASSIFICATION · GRAPH REGRESSION · NODE CLASSIFICATION · SKELETON BASED ACTION RECOGNITION
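
To make "convolutional neural networks which operate directly on graphs" more concrete, here is a minimal sketch of the neighbourhood-averaging step commonly associated with graph convolutional networks: add self-loops, then apply symmetric degree normalisation. The learned weight matrix and the nonlinearity are deliberately left out, and the function name `gcn_propagate` is hypothetical.

```python
def gcn_propagate(adjacency, features):
    """Apply D^(-1/2) (A + I) D^(-1/2) to the node features.

    adjacency: square 0/1 matrix without self-loops
    features:  one feature vector per node
    The trainable weights and activation of a full layer are omitted.
    """
    n = len(adjacency)
    # Add self-loops so each node keeps part of its own signal.
    a_hat = [
        [adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalisation by the square roots of both degrees.
    norm = [
        [a_hat[i][j] / ((degree[i] ** 0.5) * (degree[j] ** 0.5)) for j in range(n)]
        for i in range(n)
    ]
    dim = len(features[0])
    return [
        [sum(norm[i][j] * features[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]


print(gcn_propagate([[0, 1], [1, 0]], [[1.0], [3.0]]))
```

In the semi-supervised setting described in the abstract, a few nodes (for example, documents in a citation graph) carry labels, and stacking such propagation steps lets label information flow to unlabelled neighbours.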

Pre-Training with Whole Word Masking for Chinese BERT

19 Jun 2019 ymcui/Chinese-BERT-wwm

In this technical report, we adapt whole word masking to Chinese text, masking whole words instead of individual Chinese characters, which makes the Masked Language Model (MLM) pre-training task more challenging.

DOCUMENT CLASSIFICATION · LANGUAGE MODELLING · MACHINE READING COMPREHENSION · NAMED ENTITY RECOGNITION · NATURAL LANGUAGE INFERENCE · SENTIMENT ANALYSIS
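
The idea of whole word masking for Chinese can be sketched as follows. This is a simplified illustration, not the repository's code: it assumes the text has already been split into words by some word segmenter, tokenises each word into single characters, and always substitutes a mask token (BERT-style pre-training also sometimes keeps or randomly replaces selected tokens, which is omitted here). The function name and parameters are hypothetical.

```python
import random


def whole_word_mask(words, mask_prob=0.15, mask_token="[MASK]", seed=None):
    """Mask all characters of a selected word together.

    words: a sentence that is already segmented into words (assumption)
    Returns a character-level token list in which each word is either
    fully masked or left fully intact.
    """
    rng = random.Random(seed)
    tokens = []
    for word in words:
        chars = list(word)
        if rng.random() < mask_prob:
            # Every character of the chosen word is masked.
            tokens.extend([mask_token] * len(chars))
        else:
            tokens.extend(chars)
    return tokens


segmented = ["使用", "语言", "模型"]
print(whole_word_mask(segmented, mask_prob=0.5, seed=0))
```

With character-level masking, a model could often recover a masked character from the other character of the same word; masking the whole word removes that shortcut.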

Modular Multimodal Architecture for Document Classification

9 Dec 2019 microsoft/unilm

Page classification is a crucial component of any document analysis system, allowing for complex branching control flows for different components of a given document.

DOCUMENT CLASSIFICATION

Robust Cross-lingual Embeddings from Parallel Sentences

28 Dec 2019 epfml/sent2vec

Recent advances in cross-lingual word embeddings have primarily relied on mapping-based methods, which project pretrained word embeddings from different languages into a shared space through a linear transformation.

CROSS-LINGUAL DOCUMENT CLASSIFICATION · DOCUMENT CLASSIFICATION · WORD EMBEDDINGS · ZERO-SHOT CROSS-LINGUAL DOCUMENT CLASSIFICATION
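
The mapping-based approach that the abstract above contrasts itself with can be pictured with a toy example: a linear transformation projects source-language vectors into the target-language space, where nearest neighbours act as translations. Everything below is made up for illustration, including the 2-dimensional vectors, the vocabulary, and the matrix `W`, which is assumed to be already learned (here it is simply a 90-degree rotation).

```python
import math


def project(matrix, vec):
    """Apply a linear map (list of rows) to a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(vec, vocab):
    """Return the vocabulary word whose vector is most similar to vec."""
    return max(vocab, key=lambda word: cosine(vec, vocab[word]))


# Toy embeddings: the source space is the target space rotated by 90 degrees.
source = {"chien": [1.0, 0.0], "chat": [0.0, 1.0]}
target = {"dog": [0.0, 1.0], "cat": [-1.0, 0.0]}
W = [[0.0, -1.0], [1.0, 0.0]]  # assumed to be learned from a seed dictionary

print(nearest(project(W, source["chien"]), target))
print(nearest(project(W, source["chat"]), target))
```

Once both languages share one space, a document classifier trained on one language's embeddings can in principle be applied to the other, which is the zero-shot cross-lingual setting named in the tags.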