×

Classification and estimation in the stochastic blockmodel based on the empirical degrees. (English) Zbl 1295.62065

Summary: The Stochastic Blockmodel is a mixture model for heterogeneous network data. Unlike the usual statistical framework, new nodes give additional information about the previous ones in this model. Thereby the distribution of the degrees concentrates in points conditionally on the node class. We show under a mild assumption that classification, estimation and model selection can actually be achieved with no more than the empirical degree data. We provide an algorithm able to process very large networks and consistent estimators based on it. In particular, we prove a bound of the probability of misclassification of at least one node, including when the number of classes grows.

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H12 Estimation in multivariate analysis

Software:

Mixnet
PDF BibTeX XML Cite
Full Text: DOI arXiv Euclid

References:

[1] R. Albert and A.L. Barabási. Statistical mechanics of complex networks., Reviews of modern physics , 74(1):47, 2002. · Zbl 1205.82086
[2] C. Ambroise and C. Matias. New consistent and asymptotically normal parameter estimates for random-graph mixture models., Journal of the Royal Statistical Society: Series B (Statistical Methodology) , 2011.
[3] P.J. Bickel and A. Chen. A nonparametric view of network models and Newman-Girvan and other modularities., Proceedings of the National Academy of Sciences , 106 (50) :21068, 2009. · Zbl 1359.62411
[4] P.J. Bickel, A. Chen, and E. Levina. The method of moments and degree distributions for network models., The Annals of Statistics , 39(5):38-59, 2011. · Zbl 1232.91577
[5] B. Bollobás, S. Janson, and O. Riordan. The phase transition in inhomogeneous random graphs., Random Structures & Algorithms , 31(1): 3-122, 2007. ISSN 1098-2418. · Zbl 1123.05083
[6] A. Celisse, J.-J. Daudin, and L. Pierre. Consistency of maximum-likelihood and variational estimators in the stochastic block model., Electron. J. Statist. , 6 :1847-1899, 2012. ISSN 1935-7524. 10.1214/12-EJS729 · Zbl 1295.62028
[7] D.S. Choi, P.J. Wolfe, and E.M. Airoldi. Stochastic blockmodels with growing number of classes., Arxiv preprint arXiv :1011.4644 , 2010.
[8] A. Condon and R.M. Karp. Algorithms for graph partitioning on the planted partition model., Random Structures and Algorithms , 18(2): 116-140, 2001. · Zbl 0972.68129
[9] J.J. Daudin, F. Picard, and S. Robin. A mixture model for random graphs., Statistics and computing , 18(2):173-183, 2008.
[10] A. Decelle, F. Krzakala, C. Moore, and L. Zdeborová. Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications., Physical Review E , 84(6) :066106, 2011.
[11] P. Erdős and A. Rényi. On random graphs, I., Publicationes Mathematicae (Debrecen) , 6:290-297, 1959. · Zbl 0092.15705
[12] K. Faust and S. Wasserman., Social network analysis: Methods and applications . Cambridge University Press, 1994. · Zbl 0926.91066
[13] S.E. Fienberg and S.S. Wasserman. Categorical data analysis of single sociometric relations., Sociological methodology , 12:156-192, 1981. ISSN 0081-1750.
[14] E.N. Gilbert. Random graphs., The Annals of Mathematical Statistics , 30 (4) :1141-1144, 1959. ISSN 0003-4851. · Zbl 0168.40801
[15] M. Girvan and M.E.J. Newman. Community structure in social and biological networks., Proceedings of the National Academy of Sciences of the United States of America , 99(12) :7821, 2002. · Zbl 1032.91716
[16] P.W. Holland, K.B. Laskey, and S. Leinhardt. Stochastic blockmodels: First steps., Social Networks , 5(2):109-137, 1983.
[17] P.W. Holland and S. Leinhardt. An exponential family of probability distributions for directed graphs., Journal of the American Statistical Association , 76 (373):33-50, 1981. ISSN 0162-1459. · Zbl 0457.62090
[18] T.S. Jaakkola. Tutorial on variational approximation methods., Advanced mean field methods: theory and practice , pages 129-159, 2000.
[19] A. Lancichinetti, S. Fortunato, and J. Kertész. Detecting the overlapping and hierarchical community structure in complex networks., New Journal of Physics , 11 :033015, 2009.
[20] F. Lorrain and H.C. White. Structural equivalence of individuals in social networks., The Journal of mathematical sociology , 1 (1):49-80, 1971.
[21] M.E.J. Newman. Modularity and community structure in networks., Proceedings of the National Academy of Sciences , 103 (23) :8577, 2006.
[22] K. Nowicki and T.A.B. Snijders. Estimation and prediction for stochastic blockstructures., Journal of the American Statistical Association , 96 (455) :1077-1087, 2001. · Zbl 1072.62542
[23] F. Picard, V. Miele, J.J. Daudin, L. Cottret, and S. Robin. Deciphering the connectivity structure of biological networks using MixNet., BMC bioinformatics , 10(Suppl 6):S17, 2009.
[24] K. Rohe, S. Chatterjee, and B. Yu. Spectral clustering and the high-dimensional Stochastic Block Model., Arxiv preprint arXiv :1007.1684 , 2010. · Zbl 1227.62042
[25] T.A.B. Snijders and K. Nowicki. Estimation and prediction for stochastic blockmodels for graphs with latent block structure., Journal of Classification , 14(1):75-100, 1997. · Zbl 0896.62063
[26] R. Van Der Hofstad. Random graphs and complex networks., Available on http://www.win.tue.nl/ rhofstad/NotesRGCN.pdf, 2009. · Zbl 1193.82017
[27] M.J. Wainwright and M.I. Jordan. Graphical models, exponential families, and variational inference., Foundations and Trends\textregistered in Machine Learning , 1(1-2):1-305, 2008. · Zbl 1193.62107
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.