Trending Research

XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 Jun 2019 zihangdai/xlnet

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.
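
As a quick orientation for the contrast drawn above, here is a minimal sketch of the objectives in standard notation (our summary, not quoted from the paper): an autoregressive LM factorizes the likelihood in a fixed order, BERT-style denoising autoencoding reconstructs masked tokens from a corrupted input, and XLNet keeps the autoregressive form but takes an expectation over factorization orders.

% Autoregressive LM: fixed left-to-right factorization
\max_\theta \sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})

% Denoising autoencoding (BERT-style): reconstruct the masked tokens
% (m_t = 1) from the corrupted input \hat{x}
\max_\theta \sum_{t=1}^{T} m_t \log p_\theta(x_t \mid \hat{x})

% Permutation language modeling (XLNet): the same AR form, in expectation
% over factorization orders z of the positions 1..T
\max_\theta \, \mathbb{E}_{z \sim \mathcal{Z}_T}\left[\sum_{t=1}^{T} \log p_\theta(x_{z_t} \mid x_{z_{<t}})\right]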

DOCUMENT RANKING LANGUAGE MODELLING NATURAL LANGUAGE INFERENCE QUESTION ANSWERING READING COMPREHENSION SEMANTIC TEXTUAL SIMILARITY SENTIMENT ANALYSIS TEXT CLASSIFICATION

2,874
3.17 stars / hour

Exploring the Limits of Weakly Supervised Pretraining

ECCV 2018 facebookresearch/WSL-Images

ImageNet classification is the de facto pretraining task for these models.

 SOTA for Image Classification on ImageNet (using extra training data)

IMAGE CLASSIFICATION OBJECT DETECTION TRANSFER LEARNING

103
1.64 stars / hour

Which Encoding is the Best for Text Classification in Chinese, English, Japanese and Korean?

8 Aug 2017 dbiir/UER-py

This article offers an empirical study on the different ways of encoding Chinese, Japanese, Korean (CJK) and English languages for text classification.
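
To make "different ways of encoding" concrete, a minimal Python sketch (the example sentence and variable names are ours, not the paper's) compares a character-level view with a raw UTF-8 byte-level view of the same Japanese text; the paper benchmarks these and several other granularities.

# Character-level vs. UTF-8 byte-level views of the same CJK string.
# Illustrative only; not taken from the paper or its code.
text = "天気がいいから、散歩しましょう。"

chars = list(text)                      # character-level tokens
byte_ids = list(text.encode("utf-8"))   # byte-level ids in the range 0-255

print(len(chars), chars[:4])            # 16 characters
print(len(byte_ids), byte_ids[:4])      # 48 bytes (each character here is 3 UTF-8 bytes)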

TEXT CLASSIFICATION

192
1.38 stars / hour

MASS: Masked Sequence to Sequence Pre-training for Language Generation

7 May 2019 microsoft/MASS

Pre-training and fine-tuning, e.g., BERT, have achieved great success in language understanding by transferring knowledge from a rich-resource pre-training task to low/zero-resource downstream tasks.
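
For context, MASS pre-trains an encoder-decoder pair by hiding a contiguous fragment of the input sentence on the encoder side and training the decoder to generate that fragment. The Python sketch below is a schematic reconstruction of that example-building step under our reading of the paper; the function name and masking ratio are illustrative, not the official implementation.

import random

MASK = "[MASK]"

def mass_example(tokens, mask_ratio=0.5):
    # Hide a contiguous span on the encoder side; the decoder must generate it.
    span_len = max(1, int(len(tokens) * mask_ratio))
    start = random.randrange(0, len(tokens) - span_len + 1)
    enc_in = tokens[:start] + [MASK] * span_len + tokens[start + span_len:]
    dec_target = tokens[start:start + span_len]
    return enc_in, dec_target

enc_in, dec_target = mass_example("the cat sat on the mat".split())
print(enc_in)       # e.g. ['the', '[MASK]', '[MASK]', '[MASK]', 'the', 'mat']
print(dec_target)   # e.g. ['cat', 'sat', 'on']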

 SOTA for Machine Translation on WMT2016 Romanian-English (using extra training data)

TEXT GENERATION TEXT SUMMARIZATION UNSUPERVISED MACHINE TRANSLATION

154
1.05 stars / hour

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

KDD 2019 benedekrozemberczki/ClusterGCN

Furthermore, Cluster-GCN allows us to train much deeper GCNs without much time and memory overhead, which leads to improved prediction accuracy: using a 5-layer Cluster-GCN, we achieve a state-of-the-art test F1 score of 99.36 on the PPI dataset, while the previous best result was 98.71 by [16].
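
The efficiency claim rests on the batching scheme: nodes are partitioned into clusters, and each training step propagates only over the within-cluster subgraph, so the neighborhood expansion of deep GCNs stays bounded. Below is a rough Python sketch of that idea, using a random partition purely for illustration where the paper uses a METIS graph partition; names and shapes are ours.

import numpy as np

def cluster_gcn_batches(adj, features, num_clusters=4, seed=0):
    # Partition the nodes, then yield each cluster's subgraph (between-cluster
    # edges are dropped by taking the adjacency block adj[idx][:, idx]).
    rng = np.random.default_rng(seed)
    assignment = rng.integers(0, num_clusters, size=adj.shape[0])
    for c in range(num_clusters):
        idx = np.where(assignment == c)[0]
        yield idx, adj[np.ix_(idx, idx)], features[idx]

# Toy usage: one ReLU(A_c X_c W) propagation step per cluster batch.
n, d, h = 12, 8, 4
adj = (np.random.rand(n, n) < 0.3).astype(float)
feats = np.random.randn(n, d)
W = np.random.randn(d, h)
for idx, A_c, X_c in cluster_gcn_batches(adj, feats):
    H_c = np.maximum(A_c @ X_c @ W, 0.0)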

 SOTA for Node Classification on Cora (F1 metric)

GRAPH CLUSTERING LINK PREDICTION NODE CLASSIFICATION

62
0.87 stars / hour

Pre-Training with Whole Word Masking for Chinese BERT

19 Jun 2019 ymcui/Chinese-BERT-wwm

In this technical report, we adapt whole word masking to Chinese text, masking the whole word instead of individual Chinese characters, which brings a further challenge to the Masked Language Model (MLM) pre-training task.
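
A minimal Python sketch of the difference between character-level masking and whole word masking, assuming the input has already been segmented into Chinese words (the choice of segmenter is outside this snippet); function names and the masking probability are illustrative.

import random

MASK = "[MASK]"

def whole_word_mask(words, mask_prob=0.15, seed=0):
    # Sample at the word level, then mask every character of a selected word,
    # instead of sampling individual characters independently.
    rng = random.Random(seed)
    out = []
    for word in words:
        if rng.random() < mask_prob:
            out.extend([MASK] * len(word))
        else:
            out.extend(list(word))
    return out

words = ["使用", "语言", "模型", "来", "预测", "下一个", "词", "的", "概率"]
print(whole_word_mask(words, mask_prob=0.3))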

LANGUAGE MODELLING MACHINE READING COMPREHENSION NAMED ENTITY RECOGNITION NATURAL LANGUAGE INFERENCE SENTIMENT ANALYSIS

482
0.83 stars / hour

Cross-lingual Language Model Pretraining

22 Jan 2019 facebookresearch/XLM

On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU.

LANGUAGE MODELLING UNSUPERVISED MACHINE TRANSLATION

992
0.78 stars / hour

XNLI: Evaluating Cross-lingual Sentence Representations

EMNLP 2018 facebookresearch/XLM

State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models.

CROSS-LINGUAL NATURAL LANGUAGE INFERENCE MACHINE TRANSLATION

992
0.78 stars / hour