Xu, Min; Jog, Varun; Loh, Po-Ling Optimal rates for community estimation in the weighted stochastic block model. (English) Zbl 1440.62126 Ann. Stat. 48, No. 1, 183-204 (2020). An important property of stochastic block methods (SBM) is that all edges are assumed to be binary. In contrast, the edges appearing in many real-world networks are weighted. It justified to study weighted SBM. Each edge is generated from a Bernoulli(\(p\)) or Bernoulli(\(q\)) distribution depending on whether its endpoints lie in the same community. The main theoretical contribution is to characterize the optimal rate of misclustering error in the weighted SBM. The results show that the optimal rate for community estimator in a weighted SBM is governed by the Renyi-divergence of order \(\frac{1}{2}\) between two mixed distributions capturing the discrepancy between the edge probabilities and edge weight densities for between-community and within-community connections. Reviewer: Rózsa Horváth-Bokor (Budakalász) Cited in 11 Documents MSC: 62G07 Density estimation 62H30 Classification and discrimination; cluster analysis (statistical aspects) 91D30 Social networks; opinion dynamics 62C20 Minimax procedures in statistical decision theory 90B15 Stochastic network models in operations research Keywords:nonparametric estimation; network analysis; optimal estimation rates; Renyi divergence; stochastic block models Software:Brain Connectivity Toolbox; STRUCTURE × Cite Format Result Cite Review PDF Full Text: DOI arXiv Euclid References: [1] Abbe, E. (2017). Community detection and stochastic block models: Recent developments. J. Mach. Learn. Res. 18 Paper No. 177, 86. · Zbl 1403.62110 [2] Abbe, E., Bandeira, A. S., Bracher, A. and Singer, A. (2014). Decoding binary node labels from censored edge measurements: Phase transition and efficient recovery. IEEE Trans. Netw. Sci. Eng. 1 10-22. [3] Abbe, E., Bandeira, A. S. and Hall, G. (2016). Exact recovery in the stochastic block model. IEEE Trans. Inform. Theory 62 471-487. · Zbl 1359.94047 · doi:10.1109/TIT.2015.2490670 [4] Abbe, E. and Sandon, C. (2015). Community detection in general stochastic block models: Fundamental limits and efficient algorithms for recovery. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science—FOCS 2015 670-688. IEEE Computer Soc., Los Alamitos, CA. [5] Abbe, E. and Sandon, C. (2015). Recovering communities in the general stochastic block model without knowing the parameters. In Advances in Neural Information Processing Systems 676-684. [6] Aicher, C., Jacobs, A. Z. and Clauset, A. (2015). Learning latent block structure in weighted networks. J. Complex Netw. 3 221-248. · Zbl 1397.68151 · doi:10.1093/comnet/cnu026 [7] Balakrishnan, S., Xu, M., Krishnamurthy, A. and Singh, A. (2011). Noise thresholds for spectral clustering. In Advances in Neural Information Processing Systems 954-962. [8] Barrat, A., Barthelemy, M., Pastor-Satorras, R. and Vespignani, A. (2004). The architecture of complex weighted networks. Proc. Natl. Acad. Sci. USA 101 3747-3752. [9] Blondel, V. D., Guillaume, J.-L., Lambiotte, R. and Lefebvre, E. (2008). Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 10. · Zbl 1459.91130 · doi:10.1088/1742-5468/2008/10/P10008 [10] Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. and Hwang, D.-U. (2006). Complex networks: Structure and dynamics. Phys. Rep. 424 175-308. · Zbl 1371.82002 · doi:10.1016/j.physrep.2005.10.009 [11] Chin, P., Rao, A. and Vu, V. (2015). Stochastic block model and community detection in sparse graphs: A spectral algorithm with optimal rate of recovery. In Proceedings of the 28th Conference on Learning Theory 391-423. [12] Decelle, A., Krzakala, F., Moore, C. and Zdeborová, L. (2011). Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Phys. Rev. E 84. [13] Easley, D. and Kleinberg, J. (2010). Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge Univ. Press, Cambridge. · Zbl 1205.91007 [14] Fienberg, S. E., Meyer, M. M. and Wasserman, S. S. (1985). Statistical analysis of multiple sociometric relations. J. Amer. Statist. Assoc. 80 51-67. [15] Gao, C., Ma, Z., Zhang, A. Y. and Zhou, H. H. (2017). Achieving optimal misclassification proportion in stochastic block models. J. Mach. Learn. Res. 18 Paper No. 60, 45. · Zbl 1440.62244 [16] Goldenberg, A., Zheng, A. X., Fienberg, S. E. and Airoldi, E. M. (2010). A survey of statistical network models. Found. Trends Mach. Learn. 2 129-233. · Zbl 1184.68030 · doi:10.1561/2200000005 [17] Hajek, B., Wu, Y. and Xu, J. (2016). Achieving exact cluster recovery threshold via semidefinite programming. IEEE Trans. Inform. Theory 62 2788-2797. · Zbl 1359.94222 · doi:10.1109/TIT.2016.2546280 [18] Hajek, B., Wu, Y. and Xu, J. (2016). Achieving exact cluster recovery threshold via semidefinite programming: Extensions. IEEE Trans. Inform. Theory 62 5918-5937. · Zbl 1359.94951 · doi:10.1109/TIT.2016.2594812 [19] Hajek, B., Wu, Y. and Xu, J. (2017). Information limits for recovering a hidden community. IEEE Trans. Inform. Theory 63 4729-4745. · Zbl 1372.94364 · doi:10.1109/TIT.2017.2653804 [20] Hajek, B., Wu, Y. and Xu, J. (2017). Submatrix localization via message passing. J. Mach. Learn. Res. 18 Paper No. 186, 52. · Zbl 1468.68155 [21] Hartuv, E. and Shamir, R. (2000). A clustering algorithm based on graph connectivity. Inform. Process. Lett. 76 175-181. · Zbl 0996.68525 · doi:10.1016/S0020-0190(00)00142-3 [22] Heimlicher, S., Lelarge, M. and Massoulié, L. (2012). Community detection in the labelled stochastic block model. Preprint. Available at arXiv:1209.2910. [23] Holland, P. W., Laskey, K. B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Soc. Netw. 5 109-137. [24] Jackson, M. O. (2008). Social and Economic Networks. Princeton Univ. Press, Princeton, NJ. · Zbl 1149.91051 [25] Jog, V. and Loh, P. (2015). Information-theoretic bounds for exact recovery in weighted stochastic block models using the Renyi divergence. Preprint. Available at arXiv:1509.06418. [26] Lei, J. and Rinaldo, A. (2015). Consistency of spectral clustering in stochastic block models. Ann. Statist. 43 215-237. · Zbl 1308.62041 · doi:10.1214/14-AOS1274 [27] Lelarge, M., Massoulié, L. and Xu, J. (2015). Reconstruction in the labelled stochastic block model. IEEE Trans. Netw. Sci. Eng. 2 152-163. [28] Massoulié, L. (2014). Community detection thresholds and the weak Ramanujan property. In STOC’14—Proceedings of the 2014 ACM Symposium on Theory of Computing 694-703. ACM, New York. · Zbl 1315.68210 [29] McSherry, F. (2001). Spectral partitioning of random graphs. In 42nd IEEE Symposium on Foundations of Computer Science (las Vegas, NV, 2001) 529-537. IEEE Computer Soc., Los Alamitos, CA. [30] Mossel, E., Neeman, J. and Sly, A. (2012). Stochastic block models and reconstruction. Preprint. Available at arXiv:1202.1499. · Zbl 1350.05154 · doi:10.1214/15-AAP1145 [31] Mossel, E., Neeman, J. and Sly, A. (2014). Consistency thresholds for binary symmetric block models. Preprint. Available at arXiv:1407.1591. · Zbl 1321.05242 [32] Mossel, E., Neeman, J. and Sly, A. (2018). A proof of the block model threshold conjecture. Combinatorica 38 665-708. · Zbl 1424.05272 · doi:10.1007/s00493-016-3238-8 [33] Newman, M., Barabási, A.-L. and Watts, D. J., eds. (2006). The Structure and Dynamics of Networks. Princeton Studies in Complexity. Princeton Univ. Press, Princeton, NJ. · Zbl 1362.00042 [34] Newman, M. E. J. (2004). Analysis of weighted networks. Phys. Rev. E 70. [35] Newman, M. E. J. and Girvan, M. (2004). Finding and evaluating community structure in networks. Phys. Rev. E 69. [36] Pritchard, J. K., Stephens, M. and Donnelly, P. (2000). Inference of population structure using multilocus genotype data. Genetics 155 945-959. [37] Rubinov, M. and Sporns, O. (2010). Complex network measures of brain connectivity: Uses and interpretations. NeuroImage 52 1059-1069. [38] Sade, D. S. (1972). Sociometrics of Macaca mulatta: I. Linkages and cliques in grooming matrices. Folia Primatologica 18 196-223. [39] Shi, J. and Malik, J. (2000). Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 888-905. [40] Xu, M., Jog, V. and Loh, P.-L (2020). Supplement to “Optimal rates for community estimation in the weighted stochastic block model.” https://doi.org/10.1214/18-AOS1797SUPP. · Zbl 1440.62126 [41] Yun, S. and Proutiere, A. (2016). Optimal cluster recovery in the labeled stochastic block model. In Advances in Neural Information Processing Systems 965-973. [42] Zhang, A. Y. and Zhou, H. H. (2016). Minimax rates of community detection in stochastic block models. Ann. Statist. 44 2252-2280. · Zbl 1355.60125 · doi:10.1214/15-AOS1428 [43] Zhang, B. and Horvath, S. (2005). A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 4 Art. 17, 45. · Zbl 1077.92042 · doi:10.2202/1544-6115.1128 This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.