
Fundamental limits of detection in the spiked Wigner model. (English) Zbl 1450.62073

A fundamental task in machine learning is extracting low-rank information from a noise-corrupted data matrix. This paper studies the limits of detecting the spike in the rank-one spiked Wigner model. The authors prove that the logarithm of the likelihood ratio has Gaussian fluctuations below the reconstruction threshold.
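
For orientation, the setting of the review can be written out in a common notation. This is a sketch under assumptions: the symbols $\lambda$, $N$, $x$, $W$, $P_\lambda$ and $\sigma^2$, and the $\sqrt{\lambda/N}$ normalization, are introduced here for illustration and do not appear in the review. The variance $\sigma^2 = \sigma^2(\lambda)$ is deliberately left unspecified. The relation between mean and variance in the last display is the standard one for a Gaussian limit of a log-likelihood ratio under contiguity (Le Cam's lemmas), not a formula quoted from the paper.

```latex
% Sketch only: notation and normalization are assumed, not taken from the review.
% Rank-one spiked Wigner observation: a symmetric Gaussian noise matrix W
% plus a rank-one signal of strength \lambda \ge 0.
\[
  Y \;=\; \sqrt{\frac{\lambda}{N}}\; x x^{\top} + W ,
  \qquad W = W^{\top} \in \mathbb{R}^{N \times N} .
\]
% Detection problem: test H_0 : \lambda = 0 against H_1 : \lambda > 0
% through the likelihood ratio of the two laws of Y.
\[
  L(Y) \;=\; \frac{\mathrm{d}P_{\lambda}}{\mathrm{d}P_{0}}(Y) .
\]
% "Gaussian fluctuations" of \log L, with \sigma^2 = \sigma^2(\lambda) unspecified:
\[
  \log L(Y) \;\xrightarrow{\;d\;}\;
  \mathcal{N}\!\left(-\tfrac{1}{2}\sigma^{2},\, \sigma^{2}\right)
  \ \text{under } P_{0},
  \qquad
  \log L(Y) \;\xrightarrow{\;d\;}\;
  \mathcal{N}\!\left(+\tfrac{1}{2}\sigma^{2},\, \sigma^{2}\right)
  \ \text{under } P_{\lambda}.
\]
```

When a limit of this form holds, no test can separate the two hypotheses with vanishing total error, which is why such a result describes a limit of detection.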

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H25 Factor analysis and principal components; correspondence analysis
62H15 Hypothesis testing in multivariate analysis
60G15 Gaussian processes
60F05 Central limit and other weak theorems
