×

Subsampling bootstrap of count features of networks. (English) Zbl 1326.62067

Summary: Analysis of stochastic models of networks is quite important in light of the huge influx of network data in social, information and bio sciences, but a proper statistical analysis of features of different stochastic models of networks is still underway. We propose bootstrap subsampling methods for finding empirical distribution of count features or “moments” [the second author et al., Ann. Stat. 39, No. 5, 2280–2301 (2011; Zbl 1232.91577)] and smooth functions of these features for the networks. Using these methods, we cannot only estimate the variance of count features but also get good estimates of such feature counts, which are usually expensive to compute numerically in large networks. In our paper, we prove theoretical properties of the bootstrap estimates of variance of the count features as well as show their efficacy through simulation. We also use the method on some real network data for estimation of variance and expectation of some count features.

MSC:

62F40 Bootstrap, jackknife and other resampling methods
62G09 Nonparametric statistical resampling methods
62D05 Sampling theory, sample surveys

Citations:

Zbl 1232.91577

References:

[1] Aldous, D. J. (1981). Representations for partially exchangeable arrays of random variables. J. Multivariate Anal. 11 581-598. · Zbl 0474.60044 · doi:10.1016/0047-259X(81)90099-3
[2] Barabási, A.-L. and Albert, R. (1999). Emergence of scaling in random networks. Science 286 509-512. · Zbl 1226.05223 · doi:10.1126/science.286.5439.509
[3] Bearman, P. S., Moody, J. and Stovel, K. (2004). Chains of affection: The structure of adolescent romantic and sexual Networks1. American Journal of Sociology 110 44-91.
[4] Bhattacharyya, S. and Bickel, P. J. (2015). Supplement to “Subsampling bootstrap of count features of networks.” . · Zbl 1326.62067 · doi:10.1214/15-AOS1338
[5] Bickel, P. J. and Chen, A. (2009). A nonparametric view of network models and Newman-Girvan and other modularities. Proc. Natl. Acad. Sci. USA 106 21068-21073. · Zbl 1359.62411 · doi:10.1073/pnas.0907096106
[6] Bickel, P. J., Chen, A. and Levina, E. (2011). The method of moments and degree distributions for network models. Ann. Statist. 39 2280-2301. · Zbl 1232.91577 · doi:10.1214/11-AOS904
[7] Bickel, P. J., Götze, F. and van Zwet, W. R. (1997). Resampling fewer than \(n\) observations: Gains, losses, and remedies for losses. Statist. Sinica 7 1-31. · Zbl 0927.62043
[8] Bollobás, B., Janson, S. and Riordan, O. (2007). The phase transition in inhomogeneous random graphs. Random Structures Algorithms 31 3-122. · Zbl 1123.05083 · doi:10.1002/rsa.20168
[9] Chung, F. and Lu, L. (2002). Connected components in random graphs with given expected degree sequences. Ann. Comb. 6 125-145. · Zbl 1009.05124 · doi:10.1007/PL00012580
[10] Davis, J. A. and Leinhardt, S. (1972). The structure of positive interpersonal relations in small groups. In Sociological Theories in Progress , Vol. 2 (J. Berger, M. Zelditch and B. Anderson, eds.) 218-251. Houghton-Mifflin, New York.
[11] Diaconis, P. and Janson, S. (2008). Graph limits and exchangeable random graphs. Rend. Mat. Appl. (7) 28 33-61. · Zbl 1162.60009
[12] Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Ann. Statist. 7 1-26. · Zbl 0406.62024 · doi:10.1214/aos/1176344552
[13] Frank, O. (2005). Models and methods in social network analysis. In Network sampling and model fitting (P. J. Carrington, J. Scott and S. S. Wasserman, eds.) 31-56. Cambridge Univ. Press, Cambridge.
[14] Handcock, M. S. and Gile, K. J. (2010). Modeling social networks from sampled data. Ann. Appl. Stat. 4 5-25. · Zbl 1189.62187 · doi:10.1214/08-AOAS221
[15] Hoff, P. D., Raftery, A. E. and Handcock, M. S. (2002). Latent space approaches to social network analysis. J. Amer. Statist. Assoc. 97 1090-1098. · Zbl 1041.62098 · doi:10.1198/016214502388618906
[16] Holland, P. W., Laskey, K. B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Social Networks 5 109-137. · doi:10.1016/0378-8733(83)90021-7
[17] Hoover, D. N. (1979). Relations on probability spaces and arrays of random variables. Institute for Advanced Study, Princeton, NJ.
[18] Kallenberg, O. (2005). Probabilistic Symmetries and Invariance Principles . Springer, New York. · Zbl 1084.60003 · doi:10.1007/0-387-28861-9
[19] Kolaczyk, E. D. (2009). Statistical Analysis of Network Data : Methods and Models . Springer, New York. · Zbl 1277.62021 · doi:10.1007/978-0-387-88146-1
[20] Leskovec, J., Kleinberg, J. and Faloutsos, C. (2005). Graphs over time: Densification laws, shrinking diameters and possible explanations. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining 177-187. ACM, New York.
[21] Lovász, L. (2012). Large Networks and Graph Limits. American Mathematical Society Colloquium Publications 60 . Amer. Math. Soc., Providence, RI. · Zbl 1292.05001
[22] Middendorf, M., Ziv, E. and Wiggins, C. H. (2005). Inferring network mechanisms: The Drosophila melanogaster protein interaction network. Proc. Natl. Acad. Sci. USA 102 3192-3197.
[23] Milo, R., Itzkovitz, S., Kashtan, N., Levitt, R., Shen-Orr, S., Ayzenshtat, I., Sheffer, M. and Alon, U. (2004). Superfamilies of evolved and designed networks. Science 303 1538-1542.
[24] Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D. and Alon, U. (2002). Network motifs: Simple building blocks of complex networks. Science 298 824-827.
[25] Nowicki, K. and Snijders, T. A. B. (2001). Estimation and prediction for stochastic blockstructures. J. Amer. Statist. Assoc. 96 1077-1087. · Zbl 1072.62542 · doi:10.1198/016214501753208735
[26] Picard, F., Daudin, J.-J., Koskas, M., Schbath, S. and Robin, S. (2008). Assessing the exceptionality of network motifs. J. Comput. Biol. 15 1-20. · doi:10.1089/cmb.2007.0137
[27] Przytycka, T. M. (2006). An important connection between network motifs and parsimony models. In Research in Computational Molecular Biology 321-335. Springer, Berlin. · Zbl 1302.92044 · doi:10.1007/11732990_27
[28] Shen-Orr, S. S., Milo, R., Mangan, S. and Alon, U. (2002). Network motifs in the transcriptional regulation network of Escherichia coli. Nat. Genet. 31 64-68.
[29] Thompson, S. K. (2012). Sampling , 3rd ed. Wiley, Hoboken, NJ. · Zbl 1267.62020 · doi:10.1002/9781118162934
[30] Thompson, S. K. and Frank, O. (2000). Model-based estimation with link-tracing sampling designs. Survey Methodology 26 87-98.
[31] Traud, A. L., Kelsic, E. D., Mucha, P. J. and Porter, M. A. (2011). Comparing community structure to characteristics in online collegiate social networks. SIAM Rev. 53 526-543. · doi:10.1137/080734315
[32] Wasserman, S. and Faust, K. (1994). Social Network Analysis : Methods and Applications 8 . Cambridge University Press, Cambridge. · Zbl 0926.91066
[33] Wernicke, S. (2006). Efficient detection of network motifs. IEEE/ACM Transactions on Computational Biology and Bioinformatics 3 347-359.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.