Distributed Representation of Subgraphs

22 Feb 2017 · Bijaya Adhikari, Yao Zhang, Naren Ramakrishnan, B. Aditya Prakash

Network embeddings have become a popular way to learn effective feature representations of networks. Motivated by the recent successes of embeddings in natural language processing, researchers have tried to find network embeddings in order to exploit machine learning algorithms for mining tasks like node classification and edge prediction. However, most of this work focuses on finding distributed representations of nodes, which are inherently ill-suited to tasks such as community detection that intuitively depend on subgraphs. Here, we propose sub2vec, an unsupervised, scalable algorithm to learn feature representations of arbitrary subgraphs. We provide means to characterize similarities between subgraphs, give a theoretical analysis of sub2vec, and demonstrate that it preserves the so-called local proximity. We also highlight the usability of sub2vec by leveraging it for network mining tasks such as community detection. We show that sub2vec achieves significant gains over state-of-the-art methods and node-embedding methods. In particular, sub2vec offers an approach to generate a richer vocabulary of features of subgraphs to support representation and reasoning.
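
To make the idea of a subgraph-level feature vector concrete, here is a minimal sketch in Python. It is not the sub2vec algorithm, whose details the abstract does not give. It assumes, purely for illustration, that a subgraph can be summarized by how often short random walks confined to it visit each node, and that two subgraphs can be compared by cosine similarity. The names random_walks, subgraph_features, and cosine, and the toy graph, are all hypothetical.

```python
import math
import random
from collections import Counter


def random_walks(adj, nodes, num_walks, walk_len, rng):
    """Sample random walks that never leave the subgraph induced by `nodes`."""
    nodes = set(nodes)
    starts = sorted(nodes)  # sorted so results are reproducible for a given seed
    walks = []
    for _ in range(num_walks):
        current = rng.choice(starts)
        walk = [current]
        for _ in range(walk_len - 1):
            neighbors = sorted(n for n in adj[current] if n in nodes)
            if not neighbors:  # isolated node inside the subgraph
                break
            current = rng.choice(neighbors)
            walk.append(current)
        walks.append(walk)
    return walks


def subgraph_features(adj, nodes, num_walks=200, walk_len=5, seed=0):
    """Illustrative feature vector: normalized node-visit frequencies.

    Assumption for this sketch only; sub2vec learns its representations
    differently.
    """
    rng = random.Random(seed)
    walks = random_walks(adj, nodes, num_walks, walk_len, rng)
    counts = Counter(node for walk in walks for node in walk)
    total = sum(counts.values())
    return {node: count / total for node, count in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * v.get(key, 0.0) for key, value in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

a = subgraph_features(adj, {0, 1, 2})
b = subgraph_features(adj, {1, 2, 3})  # shares nodes 1 and 2 with a
c = subgraph_features(adj, {3, 4, 5})  # shares no nodes with a

sim_ab = cosine(a, b)
sim_ac = cosine(a, c)
print(f"overlapping subgraphs: {sim_ab:.3f}, disjoint subgraphs: {sim_ac:.3f}")
```

In this toy setting, subgraphs that share nodes score higher than disjoint ones, which is one simple notion of subgraph similarity. A learned embedding such as the one the abstract proposes would replace the visit-frequency vector with dense features.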


Results from Other Papers

Task                Dataset                  Model    Metric    Value  Rank
Malware Clustering  Android Malware Dataset  sub2vec  ARI       14.55  #4
Malware Detection   Android Malware Dataset  sub2vec  Accuracy  76.83  #4
