Open Graph Benchmark: Datasets for Machine Learning on Graphs

We present the Open Graph Benchmark (OGB), a diverse set of challenging and realistic benchmark datasets to facilitate scalable, robust, and reproducible graph machine learning (ML) research. OGB datasets are large-scale, encompass multiple important graph ML tasks, and span diverse domains, from social and information networks to biological networks, molecular graphs, source code ASTs, and knowledge graphs. For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits and evaluation metrics. In addition to building the datasets, we perform extensive benchmark experiments for each dataset. Our experiments suggest that OGB datasets present significant challenges of scalability to large-scale graphs and out-of-distribution generalization under realistic data splits, indicating fruitful opportunities for future research. Finally, OGB provides an automated end-to-end graph ML pipeline that simplifies and standardizes the process of graph data loading, experimental setup, and model evaluation. OGB will be regularly updated and welcomes input from the community. OGB datasets as well as data loaders, evaluation scripts, baseline code, and leaderboards are publicly available at https://ogb.stanford.edu.

NeurIPS 2020

Datasets


Introduced in the Paper:

OGB Open Graph Benchmark

Used in the Paper:

FB15k PPI COLLAB

Task                      Dataset          Model                 Metric Name          Metric Value     Global Rank
Link Property Prediction  ogbl-citation2   Matrix Factorization  Test MRR             0.5186 ± 0.0443  # 19
                                                                 Validation MRR       0.5181 ± 0.0436  # 18
                                                                 Number of params     281113505        # 4
                                                                 Ext. data            No               # 1
Link Property Prediction  ogbl-collab      Matrix Factorization  Test Hits@50         0.3886 ± 0.0029  # 30
                                                                 Validation Hits@50   0.4896 ± 0.0029  # 26
                                                                 Number of params     60514049         # 3
                                                                 Ext. data            No               # 1
Link Property Prediction  ogbl-ddi         Matrix Factorization  Test Hits@20         0.1368 ± 0.0475  # 29
                                                                 Validation Hits@20   0.3370 ± 0.0264  # 24
                                                                 Number of params     1224193          # 24
                                                                 Ext. data            No               # 1
Link Property Prediction  ogbl-ppa         Matrix Factorization  Test Hits@100        0.3229 ± 0.0094  # 18
                                                                 Validation Hits@100  0.3228 ± 0.0428  # 18
                                                                 Number of params     147662849        # 3
                                                                 Ext. data            No               # 1
Node Property Prediction  ogbn-arxiv       MLP                   Test Accuracy        0.5550 ± 0.0023  # 78
                                                                 Validation Accuracy  0.5765 ± 0.0012  # 76
                                                                 Number of params     110120           # 65
                                                                 Ext. data            No               # 1
Node Property Prediction  ogbn-mag         MLP                   Test Accuracy        0.2692 ± 0.0026  # 37
                                                                 Validation Accuracy  0.2626 ± 0.0016  # 37
                                                                 Number of params     188509           # 37
                                                                 Ext. data            No               # 1
Node Property Prediction  ogbn-papers100M  MLP                   Test Accuracy        0.4724 ± 0.0031  # 20
                                                                 Validation Accuracy  0.4960 ± 0.0029  # 20
                                                                 Number of params     144044           # 18
                                                                 Ext. data            No               # 1
Node Property Prediction  ogbn-products    MLP                   Test Accuracy        0.6106 ± 0.0008  # 61
                                                                 Validation Accuracy  0.7554 ± 0.0014  # 57
                                                                 Number of params     103727           # 52
                                                                 Ext. data            No               # 1
Node Property Prediction  ogbn-proteins    MLP                   Test ROC-AUC         0.7204 ± 0.0048  # 22
                                                                 Validation ROC-AUC   0.7706 ± 0.0014  # 20
                                                                 Number of params     96880            # 21
                                                                 Ext. data            No               # 1
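
The link property prediction results above are reported as Hits@K and MRR. As a rough illustration of what these ranking metrics measure, the following is a minimal sketch in plain Python. It is not OGB's evaluation code: the function names `hits_at_k` and `mean_reciprocal_rank`, the input layout, and the pessimistic handling of ties are all assumptions made for this example. Use the evaluation scripts distributed with OGB for any leaderboard comparison.

```python
from typing import Sequence


def hits_at_k(pos_scores: Sequence[float],
              neg_scores: Sequence[float],
              k: int) -> float:
    """Fraction of positive edges scored strictly above the k-th best negative.

    Hypothetical helper for illustration; all positives share one pool of
    negative scores.
    """
    if len(neg_scores) < k:
        # Fewer than k negatives: every positive is trivially in the top k.
        return 1.0
    kth_best_negative = sorted(neg_scores, reverse=True)[k - 1]
    hits = sum(score > kth_best_negative for score in pos_scores)
    return hits / len(pos_scores)


def mean_reciprocal_rank(pos_scores: Sequence[float],
                         neg_scores_per_pos: Sequence[Sequence[float]]) -> float:
    """Mean of 1/rank, ranking each positive among its own negatives.

    Assumption: ties count against the positive (pessimistic rank).
    """
    total = 0.0
    for pos, negs in zip(pos_scores, neg_scores_per_pos):
        rank = 1 + sum(neg >= pos for neg in negs)
        total += 1.0 / rank
    return total / len(pos_scores)


# Toy scores, made up for the example.
toy_hits = hits_at_k([0.9, 0.4, 0.7], [0.8, 0.5, 0.3, 0.1], k=2)
toy_mrr = mean_reciprocal_rank([0.9, 0.4], [[0.8, 0.1], [0.5, 0.6, 0.1]])
print(f"Hits@2 = {toy_hits:.4f}, MRR = {toy_mrr:.4f}")
```

In the toy Hits@2 call, the second-best negative scores 0.5, so two of the three positives count as hits. In the toy MRR call, the first positive ranks 1st and the second ranks 3rd among its negatives, giving a mean reciprocal rank of (1 + 1/3) / 2.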
