On information plus noise kernel random matrices. (English) Zbl 1200.62056

Summary: Kernel random matrices have attracted a lot of interest in recent years, from both practical and theoretical standpoints. Most of the theoretical work so far has focused on the case where the data is sampled from a low-dimensional structure. Very recently, the first results concerning kernel random matrices with high-dimensional input data were obtained, in a setting where the data was sampled from a genuinely high-dimensional structure, similar to standard assumptions in random matrix theory. We consider the case where the data is of the type “information + noise.” In other words, each observation is the sum of two independent elements: one sampled from a “low-dimensional” structure, the signal part of the data, the other being high-dimensional noise, normalized to not overwhelm but still affect the signal. We consider two types of noise, spherical and elliptical.
In the spherical setting, we show that the spectral properties of kernel random matrices can be understood from a new kernel matrix, computed only from the signal part of the data, but using (in general) a slightly different kernel. The Gaussian kernel has some special properties in this setting. The elliptical setting, which is important from a robustness standpoint, is less prone to easy interpretation.
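The spherical-noise statement can be illustrated numerically for the Gaussian kernel. The following is a minimal sketch under stated assumptions, not the paper's construction: the signal is taken to be points on a unit circle embedded in R^p, the noise is Gaussian with covariance (sigma^2/p) I_p, and the function names `gaussian_kernel` and `simulate`, the bandwidth, and all sizes are hypothetical choices made for illustration. Under these assumptions the squared noise distance between two observations concentrates around 2 sigma^2, so the off-diagonal entries of the noisy kernel matrix should be close to a constant multiple of the signal-only kernel matrix.

```python
import numpy as np


def gaussian_kernel(X, bandwidth):
    """Gaussian kernel matrix exp(-||x_i - x_j||^2 / (2 * bandwidth^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)  # clip tiny negatives caused by rounding
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def simulate(n=200, p=4000, sigma=1.0, bandwidth=1.5, seed=0):
    """Compare the noisy kernel matrix with a rescaled signal-only one.

    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Signal: a one-dimensional structure (unit circle) embedded in R^p.
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    Y = np.zeros((n, p))
    Y[:, 0] = np.cos(theta)
    Y[:, 1] = np.sin(theta)

    # Spherical noise, normalized so that ||z_i||^2 is close to sigma^2.
    Z = rng.normal(scale=sigma / np.sqrt(p), size=(n, p))

    K_noisy = gaussian_kernel(Y + Z, bandwidth)
    K_signal = gaussian_kernel(Y, bandwidth)

    # ||z_i - z_j||^2 concentrates near 2 * sigma^2, which for the Gaussian
    # kernel factors out as a constant multiplier on off-diagonal entries.
    scale = np.exp(-sigma ** 2 / bandwidth ** 2)

    # The diagonal of K_noisy is exactly 1, so compare off-diagonal only.
    off_diag = ~np.eye(n, dtype=bool)
    max_err = np.max(np.abs(K_noisy[off_diag] - scale * K_signal[off_diag]))
    return max_err, scale


if __name__ == "__main__":
    max_err, scale = simulate()
    print(f"rescaling factor: {scale:.4f}")
    print(f"max off-diagonal discrepancy: {max_err:.4f}")
```

For other kernels the noise contribution does not factor out as a multiplicative constant, which is consistent with the summary's remark that the limiting kernel is in general slightly different from the original one.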


62H10 Multivariate distribution of statistics
15B52 Random matrices (algebraic aspects)
60F99 Limit theorems in probability theory

