Bickel, Peter J.; Levina, Elizaveta
Covariance regularization by thresholding. (English) Zbl 1196.62062
Ann. Stat. 36, No. 6, 2577-2604 (2008).

Summary: This paper considers regularizing a covariance matrix of \(p\) variables, estimated from \(n\) observations, by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Gaussian, and \((\log p)/n\rightarrow 0\), and we obtain explicit rates. The results are uniform over families of covariance matrices which satisfy a fairly natural notion of sparsity. We discuss an intuitive resampling scheme for threshold selection and prove a general cross-validation result that justifies this approach. We also compare thresholding to other covariance estimators in simulations and on an example from climate data.

Cited in 5 Reviews; cited in 415 Documents.

MSC:
62H12 Estimation in multivariate analysis
62F12 Asymptotic properties of parametric estimators
62F40 Bootstrap, jackknife and other resampling methods
65C60 Computational problems in statistics (MSC2010)
62P12 Applications of statistics to environmental and related topics

Keywords: covariance estimation; regularization; sparsity; thresholding; large \(p\) small \(n\); high dimension low sample size; climate data

Software: glasso; DSPCA; EBayesThresh

Full Text: DOI; arXiv

References:
[1] Abramovich, F., Benjamini, Y., Donoho, D. and Johnstone, I. (2006). Adapting to unknown sparsity by controlling the false discovery rate. Ann. Statist. 34 584-653. · Zbl 1092.62005 · doi:10.1214/009053606000000074
[2] Bickel, P. J. and Levina, E. (2004). Some theory for Fisher's linear discriminant function, "naive Bayes," and some alternatives when there are many more variables than observations. Bernoulli 10 989-1010. · Zbl 1064.62073 · doi:10.3150/bj/1106314847
[3] Bickel, P. J. and Levina, E. (2008). Regularized estimation of large covariance matrices. Ann. Statist. 36 199-227. · Zbl 1132.62040 · doi:10.1214/009053607000000758
[4] Bickel, P. J., Ritov, Y. and Zakai, A. (2006). Some theory for generalized boosting algorithms. J. Mach. Learn. Res. 7 705-732. · Zbl 1222.68148
[5] d'Aspremont, A., Banerjee, O. and El Ghaoui, L. (2007). First-order methods for sparse covariance selection. SIAM J. Matrix Anal. Appl. 30 56-66. · Zbl 1156.90423 · doi:10.1137/060670985
[6] d'Aspremont, A., El Ghaoui, L., Jordan, M. I. and Lanckriet, G. R. G. (2007). A direct formulation for sparse PCA using semidefinite programming. SIAM Rev. 49 434-448. · Zbl 1128.90050 · doi:10.1137/050645506
[7] Dey, D. K. and Srinivasan, C. (1985). Estimation of a covariance matrix under Stein's loss. Ann. Statist. 13 1581-1591. · Zbl 0582.62042 · doi:10.1214/aos/1176349756
[8] Donoho, D. L. and Johnstone, I. M. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika 81 425-455. · Zbl 0815.62019 · doi:10.1093/biomet/81.3.425
[9] Dudoit, S. and van der Laan, M. J. (2005). Asymptotics of cross-validated risk estimation in estimator selection and performance assessment. Statist. Methodol. 2 131-154. · Zbl 1248.62004 · doi:10.1016/j.stamet.2005.02.003
[10] El Karoui, N. (2007a). Operator norm consistent estimation of large-dimensional sparse covariance matrices. Ann. Statist. · Zbl 1196.62064 · doi:10.1214/07-AOS559
[11] El Karoui, N. (2007b). Spectrum estimation for large-dimensional covariance matrices using random matrix theory. Ann. Statist. · Zbl 1168.62052 · doi:10.1214/07-AOS581
[12] El Karoui, N. (2007c). Tracy-Widom limit for the largest eigenvalue of a large class of complex sample covariance matrices. Ann. Probab. 35 663-714. · Zbl 1117.60020 · doi:10.1214/009117906000000917
[13] Fan, J., Fan, Y. and Lv, J. (2008). High-dimensional covariance matrix estimation using a factor model. J. Econometrics. · Zbl 1429.62185 · doi:10.1016/j.jeconom.2008.09.017
[14] Fan, J., Feng, Y. and Wu, Y. (2007). Network exploration via the adaptive LASSO and SCAD penalties. Unpublished manuscript. · Zbl 1166.62040
[15] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348-1360. · Zbl 1073.62547 · doi:10.1198/016214501753382273
[16] Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9 432-441. · Zbl 1143.62076 · doi:10.1093/biostatistics/kxm045
[17] Furrer, R. and Bengtsson, T. (2007). Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. J. Multivariate Anal. 98 227-255. · Zbl 1105.62091 · doi:10.1016/j.jmva.2006.08.003
[18] Golub, G. H. and Van Loan, C. F. (1989). Matrix Computations, 2nd ed. Johns Hopkins Univ. Press, Baltimore, MD. · Zbl 0733.65016
[19] Györfi, L., Kohler, M., Krzyżak, A. and Walk, H. (2002). A Distribution-Free Theory of Nonparametric Regression. Springer, New York.
[20] Haff, L. R. (1980). Empirical Bayes estimation of the multivariate normal covariance matrix. Ann. Statist. 8 586-597. · Zbl 0441.62045 · doi:10.1214/aos/1176345010
[21] Huang, J., Liu, N., Pourahmadi, M. and Liu, L. (2006). Covariance matrix selection and estimation via penalised normal likelihood. Biometrika 93 85-98. · Zbl 1152.62346 · doi:10.1093/biomet/93.1.85
[22] Johnstone, I. and Silverman, B. (2005). Empirical Bayes selection of wavelet thresholds. Ann. Statist. 33 1700-1752. · Zbl 1078.62005 · doi:10.1214/009053605000000345
[23] Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29 295-327. · Zbl 1016.62078 · doi:10.1214/aos/1009210544
[24] Johnstone, I. M. and Lu, A. Y. (2004). Sparse principal components analysis. Unpublished manuscript.
[25] Lam, C. and Fan, J. (2007). Sparsistency and rates of convergence in large covariance matrices estimation. Manuscript. · Zbl 1191.62101
[26] Ledoit, O. and Wolf, M. (2003). A well-conditioned estimator for large-dimensional covariance matrices. J. Multivariate Anal. 88 365-411. · Zbl 1032.62050 · doi:10.1016/S0047-259X(03)00096-4
[27] Ledoux, M. (2001). The Concentration of Measure Phenomenon. Amer. Math. Soc., Providence, RI. · Zbl 0995.60002
[28] Levina, E., Rothman, A. J. and Zhu, J. (2008). Sparse estimation of large covariance matrices via a nested Lasso penalty. Ann. Appl. Statist. 2 245-263. · Zbl 1137.62338 · doi:10.1214/07-AOAS139
[29] Marčenko, V. A. and Pastur, L. A. (1967). Distributions of eigenvalues of some sets of random matrices. Math. USSR-Sb. 1 507-536. · Zbl 0162.22501 · doi:10.1070/SM1967v001n04ABEH001994
[30] Paul, D. (2007). Asymptotics of the leading sample eigenvalues for a spiked covariance model. Statist. Sinica 17 1617-1642. · Zbl 1134.62029
[31] Rothman, A. J., Bickel, P. J., Levina, E. and Zhu, J. (2008). Sparse permutation invariant covariance estimation. Electron. J. Stat. 2 494-515. · Zbl 1320.62135 · doi:10.1214/08-EJS176
[32] Saulis, L. and Statulevičius, V. A. (1991). Limit Theorems for Large Deviations. Kluwer, Dordrecht. · Zbl 0744.60028
[33] van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer, New York. · Zbl 0862.60002
[34] Wu, W. B. and Pourahmadi, M. (2003). Nonparametric estimation of large covariance matrices of longitudinal data. Biometrika 90 831-844. · Zbl 1436.62347 · doi:10.1093/biomet/90.4.831
[35] Yuan, M. and Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika 94 19-35. · Zbl 1142.62408 · doi:10.1093/biomet/asm018
[36] Zou, H., Hastie, T. and Tibshirani, R. (2006). Sparse principal components analysis. J. Comput. Graph. Statist. 15 265-286. · doi:10.1198/106186006X113430
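The estimator the summary describes is simple to state: given the sample covariance \(\hat\Sigma = [\hat\sigma_{ij}]\), hard thresholding at level \(t\) sets \(T_t(\hat\Sigma) = [\hat\sigma_{ij}\,\mathbf{1}(|\hat\sigma_{ij}| \geq t)]\). The following is a minimal numerical sketch, not the authors' code: the function names, the random-split scheme, and the Frobenius-norm comparison score are illustrative assumptions standing in for the paper's resampling-based threshold selection.

```python
import numpy as np

def hard_threshold(S, t):
    """Hard-threshold a covariance estimate: zero every entry with
    |s_ij| < t. Diagonal variances typically exceed any reasonable t
    and so survive thresholding."""
    return np.where(np.abs(S) >= t, S, 0.0)

def select_threshold(X, thresholds, n_splits=10, seed=None):
    """Illustrative resampling scheme for picking t: repeatedly split
    the n observations in half, threshold the covariance of one half,
    and score it against the raw covariance of the other half in
    Frobenius norm; return the threshold with the smallest total score."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    risks = np.zeros(len(thresholds))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        a, b = perm[: n // 2], perm[n // 2 :]
        S1 = np.cov(X[a], rowvar=False)  # estimate on first half
        S2 = np.cov(X[b], rowvar=False)  # validation target
        for k, t in enumerate(thresholds):
            risks[k] += np.linalg.norm(hard_threshold(S1, t) - S2, "fro")
    return thresholds[int(np.argmin(risks))]
```

Under the paper's theory a natural grid of candidate thresholds scales like multiples of \(\sqrt{(\log p)/n}\), since that is the order of the noise in the individual entries \(\hat\sigma_{ij}\) under sub-Gaussian tails.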