Thus graph sampling is essential. The natural questions to ask are (a) which sampling method to use, (b) how small the sample size can be, and (c) how to scale up the measurements of the sample (e.g., the diameter) to get estimates for the large graph.
Studying real-world networks such as social networks or web networks is a challenge.
We provide a new graph generator, based on a "forest fire" spreading process, that has a simple, intuitive justification, requires very few parameters (like the "flammability" of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study.
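A minimal sketch of a "forest fire" style generator may make the spreading process concrete. It is a simplified, undirected illustration and not the generator from the text: the function name `forest_fire_graph`, the single burning probability `p` (standing in for "flammability"), and the uniform choice of the starting node are all assumptions made for this example.

```python
import random

def forest_fire_graph(n, p=0.35, seed=0):
    """Grow an undirected graph on n nodes with a simplified forest-fire process.

    p is a hypothetical burning probability: the chance that the fire
    spreads to one more unburned neighbour of the node currently burning.
    """
    rng = random.Random(seed)
    adj = {0: set()}
    for v in range(1, n):
        # The new node v picks a uniformly random existing node to start the fire.
        start = rng.randrange(v)
        burned = {start}
        frontier = [start]
        while frontier:
            w = frontier.pop()
            candidates = [u for u in adj[w] if u not in burned]
            rng.shuffle(candidates)
            # Burn a geometrically distributed number of w's unburned neighbours.
            k = 0
            while k < len(candidates) and rng.random() < p:
                k += 1
            for u in candidates[:k]:
                burned.add(u)
                frontier.append(u)
        # The new node links to every node the fire reached.
        adj[v] = set()
        for u in burned:
            adj[v].add(u)
            adj[u].add(v)
    return adj

graph = forest_fire_graph(200, p=0.35, seed=1)
```

In this sketch a larger `p` lets each fire spread further, so new nodes attach to more existing nodes and the graph becomes denser.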
While data mining in chemoinformatics studied graph data with dozens of nodes, systems biology and the Internet are now generating graph data with thousands and millions of nodes.
Our goal in this paper is to obtain a representative (unbiased) sample of Facebook users by crawling its social graph.
In this paper, we propose to use the concept of shortest path for sampling social networks.
In this paper, we develop methods to “sample” a small realistic graph from a large real network.
Network sampling is integral to the analysis of social, information, and biological networks.
Random walks are widely used to sample large-scale graphs because they are simple to implement and have solid theoretical foundations for bias analysis.
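As an illustration of both the simplicity and the bias analysis, the sketch below runs a simple random walk on an undirected graph and corrects for the walk's known tendency to visit nodes in proportion to their degree. The adjacency-list representation and the function names `random_walk_sample` and `reweighted_mean_degree` are assumptions made for this example.

```python
import random

def random_walk_sample(adj, start, steps, seed=0):
    """Return the list of nodes visited by a simple random walk."""
    rng = random.Random(seed)
    node = start
    visited = [node]
    for _ in range(steps):
        # Move to a uniformly random neighbour of the current node.
        node = rng.choice(adj[node])
        visited.append(node)
    return visited

def reweighted_mean_degree(adj, visited):
    """Estimate the mean degree from a degree-biased walk.

    Each visit is weighted by 1/degree, which reduces to the harmonic
    mean of the degrees of the visited nodes.
    """
    return len(visited) / sum(1.0 / len(adj[v]) for v in visited)

# Toy graph: a star with one hub and four leaves (true mean degree 8/5).
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
walk = random_walk_sample(star, start=0, steps=1000)
naive = sum(len(star[v]) for v in walk) / len(walk)
corrected = reweighted_mean_degree(star, walk)
```

On the star graph the walk spends about half its time at the hub, so the naive average overstates the mean degree, while the reweighted estimate stays close to the true value.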
We show that the proposed sampling method, which we call Frontier sampling, exhibits all of the nice sampling properties of a regular random walk.
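The sketch below shows one common way Frontier sampling is described: several walkers are kept at once, the walker to advance is chosen with probability proportional to its current node's degree, and the traversed edge is recorded. This reading of the method is an assumption, not something stated in the text, as are the function name `frontier_sample` and the parameters `m` (number of walkers) and `steps`.

```python
import random

def frontier_sample(adj, m, steps, seed=0):
    """Sample edges with m dependent random walkers (assumed formulation).

    adj maps each node to a list of neighbours; every node must have
    at least one neighbour.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    # Start the m walkers at uniformly random nodes.
    walkers = [rng.choice(nodes) for _ in range(m)]
    edges = []
    for _ in range(steps):
        # Pick the walker to advance, weighted by the degree of its node.
        weights = [len(adj[u]) for u in walkers]
        i = rng.choices(range(m), weights=weights, k=1)[0]
        u = walkers[i]
        # Advance that walker along a uniformly random incident edge.
        v = rng.choice(adj[u])
        edges.append((u, v))
        walkers[i] = v
    return edges

# Toy graph: a ring of six nodes.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
sampled_edges = frontier_sample(ring, m=3, steps=20)
```

With `m=1` this reduces to a single random walk that records the edges it traverses.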