New Benchmarks for Learning on Non-Homophilous Graphs

3 Apr 2021 · Derek Lim, Xiuyu Li, Felix Hohne, Ser-Nam Lim

Much graph-structured data satisfies the principle of homophily, meaning that connected nodes tend to be similar with respect to a specific attribute. As such, ubiquitous datasets for graph machine learning tasks have generally been highly homophilous, rewarding methods that leverage homophily as an inductive bias. Recent work has pointed out this particular focus, as new non-homophilous datasets have been introduced and graph representation learning models better suited for low-homophily settings have been developed. However, these datasets are small and poorly suited to truly testing the effectiveness of new methods in non-homophilous settings. We present a series of improved graph datasets with node label relationships that do not satisfy the homophily principle. Along with this, we introduce a new measure of the presence or absence of homophily that is better suited than existing measures in different regimes. We benchmark a range of simple methods and graph neural networks across our proposed datasets, drawing new insights for further research. Data and code can be found at https://github.com/CUAI/Non-Homophily-Benchmarks.
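To make the notion of homophily concrete, here is a minimal sketch of the commonly used edge homophily ratio: the fraction of edges whose two endpoints carry the same label. This is not the new measure the abstract refers to, and it is not taken from the linked repository; the function name `edge_homophily` and the toy graph are assumptions made purely for illustration.

```python
from typing import Hashable, Iterable, Mapping, Tuple

Edge = Tuple[Hashable, Hashable]


def edge_homophily(edges: Iterable[Edge], labels: Mapping[Hashable, Hashable]) -> float:
    """Return the fraction of edges that connect two nodes with the same label.

    Illustrative helper (hypothetical name), not the paper's proposed measure.
    """
    edge_list = list(edges)
    if not edge_list:
        raise ValueError("edge homophily is undefined for a graph with no edges")
    same_label = sum(1 for u, v in edge_list if labels[u] == labels[v])
    return same_label / len(edge_list)


# Toy 4-cycle: two of the four edges join nodes that share a label.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
toy_labels = {0: "a", 1: "a", 2: "b", 3: "b"}
toy_ratio = edge_homophily(toy_edges, toy_labels)
print(toy_ratio)
```

A ratio near 1 indicates a highly homophilous labeling, while a low ratio indicates that edges mostly join nodes with different labels. Note that this simple ratio depends on the number of classes and their balance, so values are not directly comparable across datasets.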

Results from the Paper


Task | Dataset | Model | Metric Name | Metric Value | Global Rank
--- | --- | --- | --- | --- | ---
Node Classification on Non-Homophilic (Heterophilic) Graphs | Deezer-Europe | GCN+JK | 1:1 Accuracy | 60.99 ± 0.14 | #23
Node Classification on Non-Homophilic (Heterophilic) Graphs | Deezer-Europe | LINK | 1:1 Accuracy | 57.71 ± 0.36 | #26
Node Classification on Non-Homophilic (Heterophilic) Graphs | Deezer-Europe | L Prop (1hop) | 1:1 Accuracy | 56.50 ± 0.41 | #28
Node Classification on Non-Homophilic (Heterophilic) Graphs | Deezer-Europe | LProp (2hop) | 1:1 Accuracy | 56.96 ± 0.26 | #27
Node Classification on Non-Homophilic (Heterophilic) Graphs | Deezer-Europe | MLP-2 | 1:1 Accuracy | 66.55 ± 0.72 | #14
Node Classification on Non-Homophilic (Heterophilic) Graphs | Deezer-Europe | GAT+JK | 1:1 Accuracy | 59.66 ± 0.92 | #25
Node Classification | genius | LINK | Accuracy | 73.56 ± 0.14 | #21
Node Classification | genius | L Prop 1-hop | Accuracy | 66.02 ± 0.16 | #23
Node Classification | genius | L Prop 2-hop | Accuracy | 67.04 ± 0.20 | #22
Node Classification | genius | GATJK | Accuracy | 56.70 ± 2.07 | #24
Node Classification | Penn94 | GATJK | Accuracy | 80.69 ± 0.36 | #18
Node Classification | Penn94 | L Prop 1-hop | Accuracy | 63.21 ± 0.39 | #27
Node Classification | Penn94 | MLP | Accuracy | 73.61 ± 0.40 | #25
Node Classification | Penn94 | L Prop 2-hop | Accuracy | 74.13 ± 0.46 | #24
Node Classification | Penn94 | LINK | Accuracy | 80.79 ± 0.49 | #17
Node Classification | Penn94 | GCNJK | Accuracy | 81.63 ± 0.54 | #13
Node Classification on Non-Homophilic (Heterophilic) Graphs | Wisconsin (60%/20%/20% random splits) | MLP-2 | 1:1 Accuracy | 93.87 ± 3.33 | #14
Node Classification | Yelp-Fraud | GAT+JK | AUC-ROC | 90.04 | #4
Fraud Detection | Yelp-Fraud | GAT+JK | AUC-ROC | 90.04 | #4
