×

zbMATH — the first resource for mathematics

Universally consistent vertex classification for latent positions graphs. (English) Zbl 1273.62147
Summary: In this work we show that, using the eigen-decomposition of the adjacency matrix, we can consistently estimate feature maps for latent position graphs with positive definite link function \(\kappa\), provided that the latent positions are i.i.d. from some distribution \(F\). We then consider the exploitation task of vertex classification where the link function \(\kappa\) belongs to the class of universal kernels and class labels are observed for a number of vertices tending to infinity and that the remaining vertices are to be classified. We show that minimization of the empirical \(\varphi\)-risk for some convex surrogate \(\varphi\) of 0-1 loss over a class of linear classifiers with increasing complexities yields a universally consistent classifier, that is, a classification rule with error converging to Bayes optimal for any distribution \(F\).

MSC:
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62C99 Statistical decision theory
62C12 Empirical decision procedures; empirical Bayes procedures
62G20 Asymptotic properties of nonparametric inference
PDF BibTeX XML Cite
Full Text: DOI Euclid arXiv
References:
[1] Airoldi, E. M., Blei, D. M., Fienberg, S. E. and Xing, E. P. (2008). Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 9 1981-2014. · Zbl 1225.68143 · www.jmlr.org
[2] Bartlett, P. L., Jordan, M. I. and McAuliffe, J. D. (2006). Convexity, classification, and risk bounds. J. Amer. Statist. Assoc. 101 138-156. · Zbl 1118.62330 · doi:10.1198/016214505000000907 · miranda.asa.catchword.org
[3] Belkin, M. and Niyogi, P. (2005). Towards a theoretical foundation for Laplacian-based manifold methods. In Proceedings of the 18 th Conference on Learning Theory ( COLT 2005) 486-500. Springer, Berlin. · Zbl 1137.68521 · doi:10.1007/b137542
[4] Bengio, Y., Vincent, P., Paiement, J. F., Delalleau, O., Ouimet, M. and Roux, N. L. (2004). Learning eigenfunctions links spectral embedding and kernel PCA. Neural Comput. 16 2197-2219. · Zbl 1085.68120
[5] Biau, G., Bunea, F. and Wegkamp, M. H. (2005). Functional classification in Hilbert spaces. IEEE Trans. Inform. Theory 51 2163-2172. · Zbl 1285.94015 · doi:10.1109/TIT.2005.847705
[6] Blanchard, G., Bousquet, O. and Massart, P. (2008). Statistical performance of support vector machines. Ann. Statist. 36 489-531. · Zbl 1133.62044 · doi:10.1214/009053607000000839
[7] Blanchard, G., Bousquet, O. and Zwald, L. (2007). Statistical properties of kernel principal component analysis. Machine Learning 66 259-294. · Zbl 1078.68133
[8] Bollobás, B., Janson, S. and Riordan, O. (2007). The phase transition in inhomogeneous random graphs. Random Structures Algorithms 31 3-122. · Zbl 1123.05083 · doi:10.1002/rsa.20168
[9] Braun, M. L. (2006). Accurate error bounds for the eigenvalues of the kernel matrix. J. Mach. Learn. Res. 7 2303-2328. · Zbl 1222.62064 · www.jmlr.org
[10] Chatterjee, S. (2012). Matrix estimation by universal singular value thresholding. Preprint. Available at . · Zbl 1308.62038 · arxiv.org
[11] Chaudhuri, K., Chung, F. and Tsiatas, A. (2012). Spectral partitioning of graphs with general degrees and the extended planted partition model. In Proceedings of the 25 th Conference on Learning Theory . Springer, Berlin.
[12] Cucker, F. and Smale, S. (2002). On the mathematical foundations of learning. Bull. Amer. Math. Soc. ( N.S. ) 39 1-49 (electronic). · Zbl 0983.68162 · doi:10.1090/S0273-0979-01-00923-5
[13] Davis, C. and Kahan, W. M. (1970). The rotation of eigenvectors by a perturbation. III. SIAM J. Numer. Anal. 7 1-46. · Zbl 0198.47201 · doi:10.1137/0707001
[14] Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Applications of Mathematics ( New York ) 31 . Springer, New York. · Zbl 0853.68150
[15] Diaconis, P. and Janson, S. (2008). Graph limits and exchangeable random graphs. Rend. Mat. Appl. (7) 28 33-61. · Zbl 1162.60009
[16] Fishkind, D. E., Sussman, D. L., Tang, M., Vogelstein, J. T. and Priebe, C. E. (2013). Consistent adjacency-spectral partitioning for the stochastic block model when the model parameters are unknown. SIAM J. Matrix Anal. Appl. 34 23-39. · Zbl 1314.05186 · doi:10.1137/120875600
[17] Hein, M., Audibert, J.-Y. and von Luxburg, U. (2005). From graphs to manifolds-Weak and strong pointwise consistency of graph Laplacians. In Proceedings of the 18 th Conference on Learning Theory ( COLT 2005) 470-485. Springer, Berlin. · Zbl 1095.68097 · doi:10.1007/b137542
[18] Hein, M., Audibert, J. Y. and von Luxburg, U. (2007). Convergence of graph Laplacians on random neighbourhood graphs. J. Mach. Learn. Res. 8 1325-1370. · Zbl 1222.68215
[19] Hoff, P. D., Raftery, A. E. and Handcock, M. S. (2002). Latent space approaches to social network analysis. J. Amer. Statist. Assoc. 97 1090-1098. · Zbl 1041.62098 · doi:10.1198/016214502388618906
[20] Holland, P. W., Laskey, K. B. and Leinhardt, S. (1983). Stochastic blockmodels: First steps. Social Networks 5 109-137. · doi:10.1016/0378-8733(83)90021-7
[21] Karrer, B. and Newman, M. E. J. (2011). Stochastic blockmodels and community structure in networks. Phys. Rev. E (3) 83 016107, 10. · doi:10.1103/PhysRevE.83.016107
[22] Kato, T. (1987). Variation of discrete spectra. Comm. Math. Phys. 111 501-504. · Zbl 0632.47002 · doi:10.1007/BF01238911
[23] Koltchinskii, V. and Giné, E. (2000). Random matrix approximation of spectra of integral operators. Bernoulli 6 113-167. · Zbl 0949.60078 · doi:10.2307/3318636
[24] Lugosi, G. and Vayatis, N. (2004). On the Bayes-risk consistency of regularized boosting methods. Ann. Statist. 32 30-55. · Zbl 1105.62319 · doi:10.1214/aos/1079120129 · euclid:aos/1079120129
[25] Micchelli, C. A., Xu, Y. and Zhang, H. (2006). Universal kernels. J. Mach. Learn. Res. 7 2651-2667. · Zbl 1222.68266 · www.jmlr.org
[26] Oliveira, R. I. (2010). Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. Preprint. Available at . · arxiv.org
[27] Rohe, K., Chatterjee, S. and Yu, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist. 39 1878-1915. · Zbl 1227.62042 · doi:10.1214/11-AOS887
[28] Rosasco, L., Belkin, M. and De Vito, E. (2010). On learning with integral operators. J. Mach. Learn. Res. 11 905-934. · Zbl 1242.62059
[29] Shawe-Taylor, J., Williams, C. K. I., Cristianini, N. and Kandola, J. (2005). On the eigenspectrum of the Gram matrix and the generalization error of kernel-PCA. IEEE Trans. Inform. Theory 51 2510-2522. · Zbl 1310.15076 · doi:10.1109/TIT.2005.850052
[30] Sriperumbudur, B. K., Fukumizu, K. and Lanckriet, G. R. G. (2011). Universality, characteristic kernels and RKHS embedding of measures. J. Mach. Learn. Res. 12 2389-2410. · Zbl 1280.68198
[31] Steinwart, I. (2002). On the influence of the kernel on the consistency of support vector machines. J. Mach. Learn. Res. 2 67-93. · Zbl 1009.68143 · doi:10.1162/153244302760185252
[32] Steinwart, I. (2002). Support vector machines are universally consistent. J. Complexity 18 768-791. · Zbl 1030.68074 · doi:10.1006/jcom.2002.0642
[33] Steinwart, I. and Scovel, C. (2012). Mercer’s theorem on general domains: On the interaction between measures, kernels, and RKHSs. Constr. Approx. 35 363-417. · Zbl 1252.46018 · doi:10.1007/s00365-012-9153-3
[34] Stone, C. J. (1977). Consistent nonparametric regression. Ann. Statist. 5 595-620. · Zbl 0366.62051 · doi:10.1214/aos/1176343886
[35] Sussman, D. L., Tang, M., Fishkind, D. E. and Priebe, C. E. (2012). A consistent adjacency spectral embedding for stochastic blockmodel graphs. J. Amer. Statist. Assoc. 107 1119-1128. · Zbl 1443.62188 · doi:10.1080/01621459.2012.699795
[36] Sussman, D. L., Tang, M. and Priebe, C. E. (2012). Consistent latent position estimation and vertex classification for random dot product graphs. Preprint. Available at . · arxiv.org
[37] Tropp, J. A. (2011). Freedman’s inequality for matrix martingales. Electron. Commun. Probab. 16 262-270. · Zbl 1225.60017 · doi:10.1214/ECP.v16-1624 · emis:journals/EJP-ECP/_ejpecp/ECP/viewarticle77e0.html
[38] von Luxburg, U., Belkin, M. and Bousquet, O. (2008). Consistency of spectral clustering. Ann. Statist. 36 555-586. · Zbl 1133.62045 · doi:10.1214/009053607000000640
[39] Young, S. J. and Scheinerman, E. R. (2007). Random dot product graph models for social networks. In Algorithms and Models for the Web-Graph. Lecture Notes in Computer Science 4863 138-149. Springer, Berlin. · Zbl 1136.05322 · doi:10.1007/978-3-540-77004-6_11
[40] Zhang, T. (2004). Statistical behavior and consistency of classification methods based on convex risk minimization. Ann. Statist. 32 56-85. · Zbl 1105.62323 · doi:10.1214/aos/1079120130 · euclid:aos/1079120130
[41] Zwald, L. and Blanchard, G. (2006). On the convergence of eigenspaces in kernel principal components analysis. In Advances in Neural Information Processing Systems ( NIPS 05) 18 1649-1656. MIT Press, Cambridge.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.