On Batch-size Selection for Stochastic Training for Graph Neural Networks
In recent years, deep learning has become an important framework for supervised learning. It has been observed that stochastic gradient descent (SGD) performs well in deep networks when the minibatch size is small. In this work, we focus on the importance of batch-size selection in Graph Neural Networks (GNNs). We provide a theoretical analysis based on an estimator that accounts for the randomness arising from two consecutive layers of a GNN, and suggest a guideline for choosing the appropriate scale of the batch size. We complement our theoretical results with empirical experiments on the baseline methods ClusterGCN, FastGCN, and GraphSAINT over the ogbn-products, ogbn-arxiv, Reddit, and PubMed datasets. We demonstrate that, in contrast to conventional deep learning models, GNNs benefit from large batch sizes.
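To make the setting concrete, the sketch below shows what stochastic (minibatch) training of a GNN looks like at the level of a single forward pass: a node batch is sampled, the induced subgraph is extracted (ClusterGCN-style subgraph batching), and a two-layer graph convolution is applied, so the batch size controls how much of the graph each step sees. This is a minimal illustrative sketch in NumPy, not the paper's estimator or any baseline's actual implementation; all names, sizes, and the random toy graph are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(A):
    """Symmetrically normalize an adjacency matrix after adding self-loops."""
    A = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def minibatch_forward(A, X, W1, W2, batch_size):
    """Two-layer GCN forward pass on the subgraph induced by a sampled node batch.

    The randomness of the sampled batch propagates through both layers,
    which is the source of estimator variance discussed in the abstract.
    """
    idx = rng.choice(A.shape[0], size=batch_size, replace=False)
    A_hat = normalize(A[np.ix_(idx, idx)])      # normalized induced subgraph
    H = np.maximum(A_hat @ X[idx] @ W1, 0.0)    # layer 1 + ReLU
    return A_hat @ H @ W2                       # layer 2 (per-node logits)

# Toy graph (assumed for illustration): 100 nodes, ~5% edge density,
# 8 input features, 3 output classes.
n, f, c = 100, 8, 3
A = (rng.random((n, n)) < 0.05).astype(float)
A = np.maximum(A, A.T)                          # symmetrize
X = rng.standard_normal((n, f))
W1 = rng.standard_normal((f, 16))
W2 = rng.standard_normal((16, c))

small = minibatch_forward(A, X, W1, W2, batch_size=8)
large = minibatch_forward(A, X, W1, W2, batch_size=64)
print(small.shape, large.shape)  # (8, 3) (64, 3)
```

A larger `batch_size` gives each step a denser induced subgraph (fewer neighbors are dropped at the batch boundary), which is one intuition for why GNNs can behave differently from conventional deep models under large batches.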