Efficient Ensembles of Graph Neural Networks

29 Sep 2021 · Amrit Nagarajan, Jacob R. Stevens, Anand Raghunathan

Graph Neural Networks (GNNs) have enabled the power of deep learning to be applied to inputs beyond the Euclidean domain, with applications ranging from social networks and product recommendation engines to the life sciences. GNNs, like other classes of machine learning models, benefit from ensemble learning, wherein multiple models are combined to provide higher accuracy and robustness than single models. However, ensembles incur significantly higher compute and storage requirements at inference, limiting their use in practical applications. In this work, we leverage the unique characteristics of GNNs to overcome these overheads, creating efficient ensemble GNNs that are faster than even single models at inference time. We observe that during message passing, nodes that are incorrectly classified (error nodes) adversely affect the representations of other nodes in their neighborhood. This error propagation also makes GNNs more difficult to approximate (e.g., through pruning) for efficient inference. We propose a technique to create ensembles of diverse models, and further propose Error Node Isolation (ENI), which prevents error nodes from sending messages to (and thereby influencing) other nodes. In addition to improving accuracy, ENI also leads to a significant reduction in the memory footprint and the number of arithmetic operations required to evaluate the computational graphs of all neighbors of error nodes. Remarkably, these savings outweigh even the overheads of using multiple models in the ensemble. A second key benefit of ENI is that it enhances the resilience of GNNs to approximations. Consequently, we propose Edge Pruning and Network Pruning techniques that target both the input graph and the neural networks used to process it. Our experiments on GNNs for transductive and inductive node classification demonstrate that ensembles with ENI are simultaneously more accurate (by up to 4.6% and 3.8%) and faster (by up to 2.8$\times$ and 5.7$\times$) than the best-performing single models and ensembles without ENI, respectively. In addition, GNN ensembles with ENI are consistently more accurate than single models and ensembles without ENI when subjected to pruning, enabling additional speedups of up to 5$\times$ with no loss in accuracy.
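
To make the mechanism concrete, the following is a minimal PyTorch sketch of Error Node Isolation, written against the abstract alone. The helper names (`flag_error_nodes`, `isolate_error_nodes`, `mean_aggregate`) are hypothetical, and the use of ensemble disagreement to flag error nodes is our assumption; the abstract does not specify how error nodes are detected at inference time. Isolating a node here simply means dropping its outgoing edges, which is also where the savings come from: pruned edges never participate in aggregation.

```python
# Minimal ENI sketch in plain PyTorch (assumptions flagged in the text above).
import torch

def flag_error_nodes(ensemble_logits: torch.Tensor) -> torch.Tensor:
    """ensemble_logits: [M, N, C] logits from M ensemble members over N nodes.
    Heuristic (our assumption, not the paper's stated criterion): flag nodes
    on which the ensemble members disagree."""
    preds = ensemble_logits.argmax(dim=-1)   # [M, N] per-member predictions
    return (preds != preds[0]).any(dim=0)    # [N] True where members disagree

def isolate_error_nodes(edge_index: torch.Tensor,
                        error_mask: torch.Tensor) -> torch.Tensor:
    """Drop every edge whose source is a flagged node, so error nodes can
    still receive messages but no longer send (and propagate) them."""
    keep = ~error_mask[edge_index[0]]
    return edge_index[:, keep]

def mean_aggregate(x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
    """One round of mean-neighbor message passing on the (pruned) graph."""
    src, dst = edge_index
    out = torch.zeros_like(x)
    out.index_add_(0, dst, x[src])           # sum incoming messages
    deg = torch.zeros(x.size(0)).index_add_(0, dst, torch.ones(dst.numel()))
    return out / deg.clamp(min=1).unsqueeze(1)

# Toy usage: 5 nodes, 2 ensemble members, a small directed edge list.
x = torch.randn(5, 8)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
mask = flag_error_nodes(torch.randn(2, 5, 3))
h = mean_aggregate(x, isolate_error_nodes(edge_index, mask))
```

Note that dropping the outgoing edges of error nodes shrinks the computational graphs of all their neighbors, which is the abstract's stated source of the memory and arithmetic savings; the proposed Edge Pruning goes further and prunes the input graph itself.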
