
Estimation of functionals of sparse covariance matrices. (English) Zbl 1327.62338

Summary: High-dimensional statistical tests often ignore correlations to gain simplicity and stability, leading to null distributions that depend on functionals of correlation matrices such as their Frobenius norm and other \(\ell_{r}\) norms. Motivated by the computation of critical values of such tests, we investigate the difficulty of estimating the functionals of sparse correlation matrices. Specifically, we show that simple plug-in procedures based on thresholded estimators of correlation matrices are sparsity-adaptive and minimax optimal over a large class of correlation matrices. Akin to previous results on functional estimation, the minimax rates exhibit an elbow phenomenon. Our results are further illustrated in simulated data as well as an empirical study of data arising in financial econometrics.
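
To make the idea of a thresholded plug-in procedure concrete, the following is a minimal sketch, not the authors' exact estimator. It hard-thresholds the off-diagonal entries of the sample correlation matrix and plugs the result into the squared Frobenius norm. The function name `thresholded_frobenius_plugin`, the constant `c`, and the threshold form \(c\sqrt{\log p / n}\) are assumptions chosen for illustration; the paper may use a different threshold or functional.

```python
import numpy as np


def thresholded_frobenius_plugin(X, c=2.0):
    """Plug-in estimate of the squared Frobenius norm of the off-diagonal
    part of a correlation matrix, after hard thresholding.

    X : (n, p) data array, rows are observations.
    c : threshold constant (an illustrative assumption, not from the paper).
    """
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)      # p x p sample correlation matrix
    tau = c * np.sqrt(np.log(p) / n)      # assumed threshold level
    off = R - np.diag(np.diag(R))         # keep only off-diagonal entries
    off[np.abs(off) < tau] = 0.0          # hard thresholding of small entries
    return float(np.sum(off ** 2))        # plug into the squared Frobenius norm
```

With `c=0` the sketch reduces to the plain (unthresholded) plug-in estimate, while larger `c` zeroes out more of the small sample correlations that are likely to be noise under sparsity.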

MSC:

62H12 Estimation in multivariate analysis
62H15 Hypothesis testing in multivariate analysis
62C20 Minimax procedures in statistical decision theory
62H25 Factor analysis and principal components; correspondence analysis
