×

Limit theorems for eigenvectors of the normalized Laplacian for random graphs. (English) Zbl 1408.62120

Distributional limit results are established for the eigenvectors corresponding to the largest eigenvalues of the adjacency or Laplacian matrix. It is shown that the Chernoff information between the multivariate normals approximation of the embedding characterizes the minimum error rate achievable by any clustering procedure that operates only on the spectral embedding. The contribution reinforces results concerning the prediction accuracy’s improvements using the normalization from [P. Sarkar and P. J. Bickel, Ann. Stat. 43, No. 3, 962–990 (2015; Zbl 1320.62150)], ref. also [U. von Luxburg, “A tutorial on spectral clustering”, Stat. Comput. 17, No. 4, 395–416 (2007; doi:10.1007/s11222-007-9033-z)].

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
05C80 Random graphs (graph-theoretic aspects)
62H99 Multivariate analysis
62M15 Inference from stochastic processes and spectral analysis

Citations:

Zbl 1320.62150

Software:

mclust
PDF BibTeX XML Cite
Full Text: DOI arXiv Euclid

References:

[1] Ali, S. M. and Silvey, S. D. (1966). A general class of coefficients of divergence of one distribution from another. J. Roy. Statist. Soc. Ser. B28 131–142. · Zbl 0203.19902
[2] Amini, A., Chen, A., Bickel, P. and Levina, E. (2013). Pseudo-likelihood methods for community detection in large sparse networks. Ann. Statist.41 2097–2122. · Zbl 1277.62166
[3] Asta, D. and Shalizi, C. (2014). Geometric network comparison. arXiv preprint at arXiv:1411.1350.
[4] Athreya, A., Lyzinski, V., Marchette, D. J., Priebe, C. E., Sussman, D. L. and Tang, M. (2016). A limit theorem for scaled eigenvectors of random dot product graphs. Sankhya A78 1–18. · Zbl 1338.62061
[5] Belkin, M. and Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput.15 1373–1396. · Zbl 1085.68119
[6] Bickel, P. J. and Chen, A. (2009). A nonparametric view of network models and Newman–Girvan and other modularities. Proc. Natl. Acad. Sci. USA106 21068–21073. · Zbl 1359.62411
[7] Bollobás, B., Janson, S. and Riordan, O. (2007). The phase transition in inhomogeneous random graphs. Random Structures Algorithms31 3–122. · Zbl 1123.05083
[8] Boucheron, S., Lugosi, G. and Massart, P. (2003). Concentration inequalities using the entropy method. Ann. Probab.31 1583–1614. · Zbl 1051.60020
[9] Chaudhuri, K., Chung, F. and Tsiatas, A. (2012). Spectral partitioning of graphs with general degrees and the extended planted partition model. In Proceedings of the 25th Conference on Learning Theory.
[10] Chernoff, H. (1952). A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Stat.23 493–507. · Zbl 0048.11804
[11] Chernoff, H. (1956). Large sample theory: Parametric case. Ann. Math. Stat.27 1–22. · Zbl 0072.35703
[12] Choi, D. S., Wolfe, P. J. and Airoldi, E. M. (2012). Stochastic blockmodels with a growing number of classes. Biometrika99 273–284. · Zbl 1318.62207
[13] Chung, F. R. K. (1997). Spectral Graph Theory92. Amer. Math. Soc.
[14] Coifman, R. and Lafon, S. (2006). Diffusion maps. Appl. Comput. Harmon. Anal.21 5–30. · Zbl 1095.68094
[15] Csizár, I. (1967). Information-type measures of difference of probability distributions and indirect observations. Studia Sci. Math. Hungar.2 229–318.
[16] Diaconis, P. and Janson, S. (2008). Graph limits and exchangeable random graphs. Rend. Mat. Appl. (7) 28 33–61. · Zbl 1162.60009
[17] Fraley, C. and Raftery, A. E. (1999). MCLUST: Software for model-based cluster analysis. J. Classification16 297–306. · Zbl 0951.91500
[18] Hoff, P. D., Raftery, A. E. and Handcock, M. S. (2002). Latent space approaches to social network analysis. J. Amer. Statist. Assoc.97 1090–1098. · Zbl 1041.62098
[19] Holland, P. W., Laskey, K. B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Soc. Netw.5 109–137.
[20] Joseph, A. and Yu, B. (2016). Impact of regularization on spectral clustering. Ann. Statist.44 1765–1791. · Zbl 1357.62229
[21] Karrer, B. and Newman, M. E. J. (2011). Stochastic blockmodels and community structure in networks. Phys. Rev. E (3) 83 016107.
[22] Leang, C. C. and Johnson, D. H. (1997). On the asymptotics of M-hypothesis Bayesian detection. IEEE Trans. Inform. Theory43 280–282. · Zbl 0868.94013
[23] Lei, J. and Rinaldo, A. (2015). Consistency of spectral clustering in stochastic block models. Ann. Statist.43 215–237. · Zbl 1308.62041
[24] Liese, F. and Vajda, I. (2006). On divergences and informations in statistics and information theory. IEEE Trans. Inform. Theory52 4394–4412. · Zbl 1287.94025
[25] Lu, L. and Peng, X. (2013). Spectra of edge-independent random graphs. Electron. J. Combin.20. · Zbl 1295.05211
[26] Luxburg, U. V. (2007). A tutorial on spectral clustering. Stat. Comput.17 395–416.
[27] Lyzinski, V., Sussman, D. L., Tang, M., Athreya, A. and Priebe, C. E. (2014). Perfect clustering for stochastic blockmodel graphs via adjacency spectral embedding. Electron. J. Stat.8 2905–2922. · Zbl 1308.62131
[28] McSherry, F. (2001). Spectral partitioning of random graphs. In Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science 529–537.
[29] Merris, R. (1994). Laplacian matrices of graphs: A survey. Linear Algebra Appl.197 143–176. · Zbl 0802.05053
[30] Mossel, E., Neeman, J. and Sly, A. (2015). Reconstruction and estimation in the planted partition model. Probab. Theory Related Fields162 431–461. · Zbl 1320.05113
[31] Oliveira, R. I. (2009). Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. arXiv preprint at arXiv:0911.0600.
[32] Qin, T. and Rohe, K. (2013). Regularized spectral clustering under the degree-corrected stochastic blockmodel. In NIPS.
[33] Rohe, K., Chatterjee, S. and Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist.39 1878–1915. · Zbl 1227.62042
[34] Sarkar, P. and Bickel, P. J. (2015). Role of normalization for spectral clustering in stochastic blockmodels. Ann. Statist.43 962–990. · Zbl 1320.62150
[35] Shi, J. and Malik, J. (2000). Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell.22 888–905.
[36] Snijders, T. A. B. and Nowicki, K. (1997). Estimation and prediction for stochastic blockmodels for graphs with latent block structure. J. Classification14 75–100. · Zbl 0896.62063
[37] Stewart, G. W. and Sun, J. (1990). Matrix Pertubation Theory. Academic Press, New York.
[38] Sussman, D. L., Tang, M., Fishkind, D. E. and Priebe, C. E. (2012). A consistent adjacency spectral embedding for stochastic blockmodel graphs. J. Amer. Statist. Assoc.107 1119–1128. · Zbl 1443.62188
[39] Tang, M., Athreya, A., Sussman, D. L., Lyzinski, V., Park, Y. and Priebe, C. E. (2017). A semiparametric two-sample hypothesis testing problem for random dot product graphs. J. Comput. Graph. Statist.26 344–354.
[40] Tang, M., Athreya, A., Sussman, D. L., Lyzinski, V. and Priebe, C. E. (2017). A nonparametric two-sample hypothesis testing problem for random graphs. Bernoulli23 1599–1630. · Zbl 1450.62040
[41] von Luxburg, U., Belkin, M. and Bousquet, O. (2008). Consistency of spectral clustering. Ann. Statist.36 555–586. · Zbl 1133.62045
[42] Wolfe, P. J. and Olhede, S. C. (2013). Nonparametric graphon estimation. arXiv preprint at arXiv:1309.5936.
[43] Yang, J. J., Han, Q. and Airoldi, E. M. (2014). Nonparametric estimation and testing of exchangeable graph models.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.