From Cluster Assumption to Graph Convolution: Graph-based Semi-Supervised Learning Revisited

24 Sep 2023 · Zheng Wang, Hongming Ding, Li Pan, Jianhua Li, Zhiguo Gong, Philip S. Yu

Graph-based semi-supervised learning (GSSL) has long been a hot research topic. Traditional methods are generally shallow learners built on the cluster assumption. Recently, graph convolutional networks (GCNs) have become the predominant techniques owing to their promising performance. In this paper, we theoretically discuss the relationship between these two types of methods within a unified optimization framework. One of the most intriguing findings is that, unlike traditional methods, typical GCNs may not jointly consider the graph structure and label information at each layer. Motivated by this, we further propose three simple but powerful graph convolution methods. The first is a supervised method, OGC, which guides the graph convolution process with labels. The others are two unsupervised methods: GGC and its multi-scale version, GGCM, both aiming to preserve the graph structure information during the convolution process. Finally, we conduct extensive experiments to show the effectiveness of our methods.
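To make the core idea concrete, below is a minimal sketch of a label-guided graph-convolution step in the spirit the abstract describes for OGC: alternating an unsupervised smoothing step over the normalized adjacency with a supervised correction that pulls labeled nodes toward their known labels. The update rule, function names, and hyperparameters here are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def normalize_adj(A):
    """Symmetrically normalize an adjacency matrix with self-loops:
    A_hat = D^{-1/2} (A + I) D^{-1/2}, the propagation matrix used by GCNs."""
    A = A + np.eye(A.shape[0])
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A @ D_inv_sqrt

def label_guided_convolution(A, Z, Y, labeled_mask, steps=10, lr=0.5):
    """Hypothetical sketch: alternate graph smoothing with a supervised
    correction on labeled nodes at every step, so graph structure and label
    information are used jointly (unlike a plain GCN layer).

    A: (n, n) adjacency, Z: (n, c) soft label matrix, Y: (n, c) one-hot
    labels, labeled_mask: boolean (n,) marking the training nodes.
    """
    A_hat = normalize_adj(A)
    S = np.diag(labeled_mask.astype(float))  # selects the labeled rows
    for _ in range(steps):
        Z = A_hat @ Z                 # unsupervised smoothing (graph structure)
        Z = Z - lr * S @ (Z - Y)      # supervised pull toward known labels
    return Z
```

On a toy graph of two cliques joined by one edge, with one labeled node per clique, repeated iterations propagate each label through its own clique, so unlabeled nodes are classified by `argmax` over the rows of `Z`.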

| Task | Dataset | Model | Metric | Value | Global Rank |
| --- | --- | --- | --- | --- | --- |
| Node Classification | CiteSeer (Public Split: fixed 20 nodes per class) | GGCM | Accuracy | 74.2% | #9 |
| Node Classification | CiteSeer (Public Split: fixed 20 nodes per class) | OGC | Accuracy | 77.5% | #1 |
| Node Classification | Cora (Public Split: fixed 20 nodes per class) | GGCM | Accuracy | 83.6% | #14 |
| Node Classification | Cora (Public Split: fixed 20 nodes per class) | OGC | Accuracy | 86.9% | #1 |
| Node Classification | PubMed (Public Split: fixed 20 nodes per class) | OGC | Accuracy | 83.4% | #1 |
| Node Classification | PubMed (Public Split: fixed 20 nodes per class) | GGCM | Accuracy | 80.8% | #10 |
