
Spectral analysis of the Gram matrix of mixture models. (English) Zbl 1384.60022

Summary: This text is devoted to the asymptotic study of some spectral properties of the Gram matrix \(W^{\mathrm T}W\) built upon a collection \(w_{1}, \dots,w_{n} \in \mathbb R^{p}\) of random vectors (the columns of \(W\)), as both the number \(n\) of observations and the dimension \(p\) of the observations tend to infinity and are of similar order of magnitude. The random vectors \(w_{1}, \dots,w_{n}\) are independent observations, each of them belonging to one of \(k\) classes \(\mathcal C_1, \dots, \mathcal C_k\). The observations of each class \(\mathcal C_a\) (\(1 \leq a \leq k\)) are characterized by their distribution \(\mathcal N(0,p^{-1}C_a)\), where \(C_1, \dots, C_k\) are nonnegative definite \(p \times p\) matrices. The cardinality \(n_{a}\) of class \(\mathcal C_a\) and the dimension \(p\) of the observations are such that \(n_{a}/n\) (\(1 \leq a \leq k\)) and \(p/n\) stay bounded away from 0 and \(+\infty\). We provide deterministic equivalents to the empirical spectral distribution of \(W^{\mathrm T}W\) and to the matrix entries of its resolvent (as well as of the resolvent of \(WW^{\mathrm T}\)). These deterministic equivalents are defined through the solutions of a fixed-point system. In addition, we prove that \(W^{\mathrm T}W\) asymptotically has no eigenvalues outside the bulk of its spectrum, where the bulk is defined through these deterministic equivalents. These results are directly used in our companion paper [Electron. J. Stat. 10, No. 1, 1393–1454 (2016; Zbl 1398.62160)], which is devoted to the analysis of the spectral clustering algorithm in large dimensions. They also find applications in various other fields such as wireless communications, where functionals of the aforementioned resolvents allow one to assess the communication performance across multi-user multi-antenna channels.
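
The model in the summary can be simulated numerically. The following is a minimal sketch, not taken from the paper: it draws the columns of \(W\) class by class from \(\mathcal N(0,p^{-1}C_a)\), forms \(W^{\mathrm T}W\), and computes its eigenvalues. The function name `sample_mixture_gram`, the dimensions, and the two covariance matrices are illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def sample_mixture_gram(p, class_sizes, covariances, rng):
    """Return the n x n Gram matrix W^T W, where the columns of W
    belonging to class a are independent N(0, C_a / p) vectors."""
    blocks = []
    for n_a, C_a in zip(class_sizes, covariances):
        # Symmetric square root of the nonnegative definite matrix C_a.
        vals, vecs = np.linalg.eigh(C_a)
        root = (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T
        # root @ Z has covariance C_a; dividing by sqrt(p) gives C_a / p.
        Z = rng.standard_normal((p, n_a))
        blocks.append(root @ Z / np.sqrt(p))
    W = np.hstack(blocks)  # p x n, with n = sum of the class sizes
    return W.T @ W


# Hypothetical setting: k = 2 classes, p and n of comparable size.
rng = np.random.default_rng(0)
p = 200
class_sizes = [150, 250]
covariances = [np.eye(p), np.diag(np.linspace(0.5, 2.0, p))]

G = sample_mixture_gram(p, class_sizes, covariances, rng)
eigenvalues = np.linalg.eigvalsh(G)  # real eigenvalues, ascending order
```

A histogram of `eigenvalues` gives the empirical spectral distribution that the deterministic equivalents are meant to approximate. In this particular setting \(n > p\), so \(W^{\mathrm T}W\) has rank at most \(p\) and at least \(n - p\) of its eigenvalues are numerically zero.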

MSC:

60B20 Random matrices (probabilistic aspects)
15B52 Random matrices (algebraic aspects)
62H30 Classification and discrimination; cluster analysis (statistical aspects)

Citations:

Zbl 1398.62160

Software:

ElemStatLearn; ISLR

References:

[1] O. Ajanki, L. Erdős and T. Krüger, Quadratic vector equations on complex upper half-plane (2015).
[2] G. Anderson, A. Guionnet and O. Zeitouni, An Introduction to Random Matrices. Vol. 118 of Cambridge Studies in Advanced Math. (2009).
[3] Z.D. Bai and J.W. Silverstein, No eigenvalues outside the support of the limiting spectral distribution of large dimensional sample covariance matrices. Ann. Probab.26 (1998) 316-345. · Zbl 0937.60017 · doi:10.1214/aop/1022855421
[4] J. Baik, G. Ben Arous and S. Péché, Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. Ann. Probab.33 (2005) 1643-1697. · Zbl 1086.15022 · doi:10.1214/009117905000000233
[5] F. Benaych-Georges and R.N. Rao, The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Adv. Math.227 (2011) 494-521. · Zbl 1226.15023 · doi:10.1016/j.aim.2011.02.007
[6] F. Benaych-Georges and R.N. Rao, The singular values and vectors of low rank perturbations of large rectangular random matrices. J. Multivariate Anal.111 (2012) 120-135. · Zbl 1252.15039 · doi:10.1016/j.jmva.2012.04.019
[7] M. Capitaine, Additive/multiplicative free subordination property and limiting eigenvectors of spiked additive deformations of Wigner matrices and spiked sample covariance matrices. J. Theor. Probab.26 (2013) 595-648. · Zbl 1279.15026 · doi:10.1007/s10959-012-0416-5
[8] F. Chapon, R. Couillet, W. Hachem and X. Mestre, The outliers among the singular values of large rectangular random matrices with additive fixed rank deformation. Markov Process. Relat. Fields20 (2014) 183-228. · Zbl 1303.15042
[9] R. Couillet and W. Hachem, Analysis of the limiting spectral measure of large random matrices of the separable covariance type. Random Matrices: Theory Appl.3 (2014) 1450016. · Zbl 1305.15078 · doi:10.1142/S2010326314500166
[10] R. Couillet and F. Benaych-Georges, Kernel spectral clustering of large dimensional data. Electron. J. Stat.10 (2016) 1393-1454. · Zbl 1398.62160 · doi:10.1214/16-EJS1144
[11] R. Couillet, M. Debbah and J.W. Silverstein, A deterministic equivalent for the analysis of correlated MIMO multiple access channels. IEEE Trans. Inform. Theory57 (2011) 3493-3514. · Zbl 1365.94123 · doi:10.1109/TIT.2011.2133151
[12] L. Erdős, B. Schlein and H.-T. Yau, Semicircle law on short scales and delocalization of eigenvectors for Wigner random matrices. Ann. Probab.37 (2009). · Zbl 1175.15028
[13] T. Hastie, R. Tibshirani and J. Friedman, The elements of statistical learning. Data mining, inference, and prediction. Springer Series in Statistics, 2nd edition. Springer, New York (2009). · Zbl 1273.62005
[14] R.A. Horn and C.R. Johnson, Matrix Analysis. Cambridge University Press (2013). · Zbl 1267.15001
[15] R.A. Horn and C.R. Johnson, Topics in Matrix Analysis. Cambridge University Press (1991). · Zbl 0729.15001
[16] G. James, D. Witten, T. Hastie and R. Tibshirani, An introduction to statistical learning. With applications in R. Vol. 103 of Springer Texts in Statistics. Springer, New York (2013). · Zbl 1281.62147
[17] I.M. Johnstone, On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist.29 (2001) 295-327. · Zbl 1016.62078 · doi:10.1214/aos/1009210544
[18] A. Kammoun, M. Kharouf, W. Hachem and J. Najim, A central limit theorem for the SINR at the LMMSE estimator output for large-dimensional signals. IEEE Trans. Inform. Theory55 (2009) 5048-5063. · Zbl 1367.94096 · doi:10.1109/TIT.2009.2030463
[19] R. Kannan and S. Vempala, Spectral algorithms. Found. Trends Theoret. Comput. Sci.4 (2009) 157-288. · Zbl 1191.68852 · doi:10.1561/0400000025
[20] V. Kargin, A concentration inequality and a local law for the sum of two random matrices. Probab. Theory Related Fields154 (2012) 677-702. · Zbl 1260.60015 · doi:10.1007/s00440-011-0381-4
[21] P. Loubaton and P. Vallet, Almost sure localization of the eigenvalues in a Gaussian information plus noise model. Applications to the spiked models. Electron. J. Probab.16 (2011) 1934-1959. · Zbl 1245.15039 · doi:10.1214/EJP.v16-943
[22] U. von Luxburg, A tutorial on spectral clustering. Stat. Comput.17 (2007) 395-416. · doi:10.1007/s11222-007-9033-z
[23] V.A. Marčenko and L.A. Pastur, Distribution of eigenvalues for some sets of random matrices. Sb. Math.1 (1967) 457-483. · Zbl 0162.22501 · doi:10.1070/SM1967v001n04ABEH001994
[24] J.W. Silverstein and S. Choi, Analysis of the limiting spectral distribution of large dimensional random matrices. J. Multivariate Anal.54 (1995) 295-309. · Zbl 0872.60013 · doi:10.1006/jmva.1995.1058
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.