Discriminative Graph Autoencoder

With the abundance of graph-structured data in various applications, graph representation learning has become an effective computational tool for finding informative vector representations of graphs. Traditional graph kernel approaches are usually frequency-based: each dimension of the learned vector representation of a graph is the frequency of a certain type of substructure. These approaches incur a high computational cost for counting the occurrences of predefined substructures. The learned vector representations are very sparse, which prohibits the use of inner products. Moreover, the learned vector representations do not lie in a smooth space, since their values can only be integers. State-of-the-art approaches tackle these challenges by changing the kernel functions instead of producing better vector representations. They can only produce kernel matrices for kernel-based methods and are not compatible with methods that require vector representations. Effectively learning smooth vector representations for graphs of various structures and sizes remains a challenging task. Motivated by recent advances in deep autoencoders, in this paper we explore the capability of autoencoders to learn representations for graphs. Unlike videos or images, graphs are usually of various sizes and are not readily prepared for an autoencoder. Therefore, a novel framework, namely the discriminative graph autoencoder (DGA), is proposed to learn low-dimensional vector representations for graphs. The algorithm decomposes large graphs into small subgraphs, from which the structural information is sampled. DGA efficiently produces smooth and informative vector representations of graphs while preserving the discriminative information given by their labels. Extensive experiments have been conducted to evaluate DGA. The experimental results demonstrate the efficiency and effectiveness of DGA compared with traditional and state-of-the-art approaches on various real-world datasets and applications, e.g.,...
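
To make the frequency-based representation described in the abstract concrete, here is a minimal Python sketch. It is not the DGA method and not any specific published kernel: it assumes, purely for illustration, that the counted "substructures" are unordered pairs of endpoint labels on each edge. The function names and the two toy graphs are hypothetical.

```python
from collections import Counter


def edge_label_counts(node_labels, edges):
    """Count each unordered pair of endpoint labels over the graph's edges."""
    counts = Counter()
    for u, v in edges:
        pair = tuple(sorted((node_labels[u], node_labels[v])))
        counts[pair] += 1
    return counts


def frequency_vectors(graphs):
    """Map graphs to integer count vectors over a shared substructure vocabulary."""
    all_counts = [edge_label_counts(labels, edges) for labels, edges in graphs]
    vocab = sorted(set().union(*all_counts))
    # A Counter returns 0 for substructures that a graph does not contain.
    vectors = [[c[s] for s in vocab] for c in all_counts]
    return vocab, vectors


# Two toy labeled graphs (hypothetical data): node labels, then an edge list.
g1 = ({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)])
g2 = ({0: "N", 1: "N", 2: "N", 3: "H"}, [(0, 1), (1, 2), (2, 3)])

vocab, vectors = frequency_vectors([g1, g2])

# Inner product of the two count vectors, as a linear kernel would compute it.
kernel_value = sum(a * b for a, b in zip(vectors[0], vectors[1]))
```

In this toy case the two graphs share no substructure, so every dimension is zero in at least one vector and the inner product carries no similarity signal. The entries are also integer counts, which is the sparsity and non-smoothness issue the abstract points to.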


Results from the Paper


Task                  Dataset   Model  Metric Name  Metric Value  Global Rank
Graph Classification  PROTEINS  DGA    Accuracy     77.71%        # 23
Graph Classification  PTC       DGA    Accuracy     71.24%        # 12

Methods