Regularized estimation of large covariance matrices. (English) Zbl 1132.62040

Summary: This paper considers estimating a covariance matrix of \(p\) variables from \(n\) observations by either banding or tapering the sample covariance matrix, or estimating a banded version of the inverse of the covariance. We show that these estimates are consistent in the operator norm as long as \((\log p)/n\rightarrow 0\), and obtain explicit rates. The results are uniform over some fairly natural well-conditioned families of covariance matrices. We also introduce an analogue of the Gaussian white noise model and show that if the population covariance is embeddable in that model and well-conditioned, then the banded approximations produce consistent estimates of the eigenvalues and associated eigenvectors of the covariance matrix. The results can be extended to smooth versions of banding and to non-Gaussian distributions with sufficiently short tails. A resampling approach is proposed for choosing the banding parameter in practice. This approach is illustrated numerically on both simulated and real data.
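As an illustration of the banding estimator and of the resampling idea mentioned in the summary, the following is a minimal sketch, not the authors' implementation. It assumes the banding operator \(B_k(M) = [m_{ij}\,\mathbf{1}(|i-j|\le k)]\) applied to the sample covariance, and it picks \(k\) by repeated random splits of the data, comparing the banded estimate from one part with the unbanded sample covariance from the other in the matrix \(\ell_1\) norm. The function names, the split proportion, the number of splits, and the choice of norm are assumptions made for this example.

```python
import numpy as np

def sample_cov(X):
    """Sample covariance (divisor n) of an (n, p) data array."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / X.shape[0]

def band(S, k):
    """Keep entries with |i - j| <= k and set all others to zero."""
    idx = np.arange(S.shape[0])
    mask = np.abs(idx[:, None] - idx[None, :]) <= k
    return S * mask

def choose_band(X, ks, n_splits=20, seed=0):
    """Pick k from ks by random splits (hypothetical helper).

    Assumption: one third of the rows gives the banded estimate and the
    remaining rows give the reference sample covariance; the risk is the
    matrix l1 norm (maximum absolute column sum) summed over splits.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n1 = n // 3
    risk = np.zeros(len(ks))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        S1 = sample_cov(X[perm[:n1]])
        S2 = sample_cov(X[perm[n1:]])
        for a, k in enumerate(ks):
            risk[a] += np.linalg.norm(band(S1, k) - S2, ord=1)
    return ks[int(np.argmin(risk))]
```

A typical call would be `k = choose_band(X, list(range(p)))` followed by `band(sample_cov(X), k)`. Banding only makes sense when the variables have a natural ordering, so that entries far from the diagonal are expected to be small.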


62H12 Estimation in multivariate analysis
62F12 Asymptotic properties of parametric estimators
15A18 Eigenvalues, singular values, and eigenvectors
62F40 Bootstrap, jackknife and other resampling methods

