Graph-Bert: Only Attention is Needed for Learning Graph Representations

15 Jan 2020  ·  Jiawei Zhang, Haopeng Zhang, Congying Xia, Li Sun

The dominant graph neural networks (GNNs) over-rely on the graph links, and several serious performance problems caused by this reliance have already been observed, e.g., the suspended animation problem and the over-smoothing problem. What's more, the inherently inter-connected nature of a graph precludes parallelization within it, which becomes critical for large-sized graphs, as memory constraints limit batching across the nodes. In this paper, we introduce a new graph neural network, namely GRAPH-BERT (Graph-based BERT), based solely on the attention mechanism without any graph convolution or aggregation operators. Instead of feeding GRAPH-BERT the complete large input graph, we propose to train it with sampled linkless subgraphs within their local contexts. GRAPH-BERT can be learned effectively in a standalone mode. Meanwhile, a pre-trained GRAPH-BERT can also be transferred to other application tasks directly, or with necessary fine-tuning whenever supervised label information or a certain application-oriented objective is available. We have tested the effectiveness of GRAPH-BERT on several graph benchmark datasets. Based on the GRAPH-BERT pre-trained with the node attribute reconstruction and structure recovery tasks, we further fine-tune GRAPH-BERT on node classification and graph clustering tasks specifically. The experimental results demonstrate that GRAPH-BERT can outperform the existing GNNs in both learning effectiveness and efficiency.
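A minimal sketch of this setup, assuming a PyTorch implementation: a top-k sampler picks each target node's most "intimate" context nodes (e.g., from a PageRank-style score matrix), the links are then discarded, and a plain Transformer encoder (attention only, no graph convolution) produces the target node's representation. The class names, hyper-parameters, and scoring matrix below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn


class LinklessSubgraphEncoder(nn.Module):
    """Attention-only encoder over a sampled, linkless context subgraph."""

    def __init__(self, feat_dim, hidden_dim=64, num_heads=4, num_layers=2, num_classes=7):
        super().__init__()
        self.input_proj = nn.Linear(feat_dim, hidden_dim)
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, context_feats):
        # context_feats: (batch, k+1, feat_dim); row 0 is the target node,
        # the remaining k rows are its sampled context nodes (links dropped).
        h = self.input_proj(context_feats)
        h = self.encoder(h)                # plain self-attention over the node set
        return self.classifier(h[:, 0])    # predict from the target node's representation


def sample_context(intimacy_row, k=5):
    """Return the indices of the k highest-scoring context nodes for one target node."""
    return torch.topk(intimacy_row, k).indices


# Toy usage: 100 nodes with 16-dim features, one target node per batch.
feats = torch.randn(100, 16)
intimacy = torch.rand(100, 100)           # assumed precomputed intimacy/proximity scores
target = 0
ctx = sample_context(intimacy[target], k=5)
batch = torch.cat([feats[target:target + 1], feats[ctx]]).unsqueeze(0)

model = LinklessSubgraphEncoder(feat_dim=16)
logits = model(batch)                     # shape: (1, num_classes)
```

In the same spirit, pre-training objectives such as node attribute reconstruction or structure recovery would simply swap the classification head for a reconstruction head over the same encoder.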


Results from the Paper


Task                 Dataset   Model       Metric    Value   Global Rank
Node Classification  Citeseer  Graph-Bert  Accuracy  71.2%   # 57
Node Classification  Cora      Graph-Bert  Accuracy  84.3%   # 31
Node Classification  Pubmed    Graph-Bert  Accuracy  79.3%   # 48
