Adaptive estimation of covariance matrices via Cholesky decomposition. (English) Zbl 1329.62265

Summary: This paper studies the estimation of a large covariance matrix. We introduce a novel procedure called ChoSelect, based on the Cholesky factor of the inverse covariance. This method uses a dimension reduction strategy by selecting the pattern of zeros of the Cholesky factor. Alternatively, ChoSelect can be interpreted as a graph estimation procedure for directed Gaussian graphical models. Our approach is particularly relevant when the variables under study have a natural ordering (e.g., time series) or, more generally, when the Cholesky factor is approximately sparse. ChoSelect achieves non-asymptotic oracle inequalities with respect to the Kullback-Leibler entropy. Moreover, it satisfies various adaptive properties from a minimax point of view. We also introduce and study a two-stage procedure that combines ChoSelect with the Lasso. This latter method enables the practitioner to choose a trade-off between statistical efficiency and computational complexity, and it is consistent under weaker assumptions than the Lasso. The practical performance of the different procedures is assessed on numerical examples.
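
The central object of the summary can be made concrete with a short sketch. The modified Cholesky decomposition writes the precision matrix of ordered, centered variables X_1, ..., X_p as Sigma^{-1} = T' D^{-1} T, where row j of the unit lower-triangular factor T holds the negated coefficients of the regression of X_j on X_1, ..., X_{j-1}, and D collects the residual variances; a zero in T means the corresponding predecessor is not a parent in the directed graphical model. The Python sketch below illustrates this correspondence only: the per-row Lasso with a fixed tuning parameter lasso_alpha and the function name cholesky_factor_lasso are illustrative stand-ins, not the paper's ChoSelect criterion.

    # A minimal sketch, NOT the paper's ChoSelect procedure: it only illustrates
    # the regression view of the Cholesky factor of the precision matrix.
    # For ordered, centered variables, regressing X_j on its predecessors gives
    # row j of a unit lower-triangular T and a residual variance d_j, with
    # Sigma^{-1} = T' diag(1/d) T. A per-row Lasso (lasso_alpha is an
    # illustrative tuning parameter) selects the zero pattern of T.
    import numpy as np
    from sklearn.linear_model import Lasso

    def cholesky_factor_lasso(X, lasso_alpha=0.1):
        # X: (n, p) data matrix; columns are assumed to carry a natural ordering.
        p = X.shape[1]
        X = X - X.mean(axis=0)                  # center so no intercepts are needed
        T = np.eye(p)                           # unit lower-triangular factor
        d = np.empty(p)
        d[0] = X[:, 0].var()                    # X_1 has no predecessors
        for j in range(1, p):
            reg = Lasso(alpha=lasso_alpha, fit_intercept=False)
            reg.fit(X[:, :j], X[:, j])          # regress X_j on X_1, ..., X_{j-1}
            T[j, :j] = -reg.coef_               # zeros here = absent parents in the DAG
            d[j] = np.var(X[:, j] - X[:, :j] @ reg.coef_)   # residual variance
        precision = T.T @ np.diag(1.0 / d) @ T  # estimate of Sigma^{-1}
        return T, d, precision

In the paper itself, the zero pattern is chosen by a penalized likelihood criterion rather than a fixed Lasso penalty; the sketch only shows why selecting zeros of T amounts to selecting parents under a fixed variable ordering.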

MSC:

62H12 Estimation in multivariate analysis
62F35 Robustness and adaptive procedures (parametric inference)
62J05 Linear regression; mixed models

Software:

pcalg; glasso

References:

[1] Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory (Tsahkadsor, 1971). Akadémiai Kiadó, Budapest, 267-281. · Zbl 0283.62006
[2] Bach, F. (2008). Bolasso: model consistent Lasso estimation through the bootstrap. In Twenty-fifth International Conference on Machine Learning (ICML).
[3] Banerjee, O., El Ghaoui, L., and d’Aspremont, A. (2008). Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data., J. Mach. Learn. Res. 9 , 485-516. · Zbl 1225.68149
[4] Baraud, Y., Giraud, C., and Huet, S. (2009). Gaussian model selection with an unknown variance., Ann. Statist. 37 , 2, 630-672. · Zbl 1162.62051 · doi:10.1214/07-AOS573
[5] Bickel, P. J. and Levina, E. (2008a). Covariance regularization by thresholding., Ann. Statist. 36 , 6, 2577-2604. · Zbl 1196.62062 · doi:10.1214/08-AOS600
[6] Bickel, P. J. and Levina, E. (2008b). Regularized estimation of large covariance matrices., Ann. Statist. 36 , 1, 199-227. · Zbl 1132.62040 · doi:10.1214/009053607000000758
[7] Birgé, L. (2005). A new lower bound for multiple hypothesis testing., IEEE Trans. Inf. Theory 51 , 4, 1611-1615. · Zbl 1283.62030 · doi:10.1109/TIT.2005.844101
[8] Birgé, L. and Massart, P. (1998). Minimum contrast estimators on sieves: exponential bounds and rates of convergence., Bernoulli 4 , 3, 329-375. · Zbl 0954.62033 · doi:10.2307/3318720
[9] Birgé, L. and Massart, P. (2007). Minimal penalties for Gaussian model selection., Probab. Theory Related Fields 138 , 1-2, 33-73. · Zbl 1112.62082 · doi:10.1007/s00440-006-0011-8
[10] Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle regression., Ann. Statist. 32 , 2, 407-499. · Zbl 1091.62054 · doi:10.1214/009053604000000067
[11] El Karoui, N. (2008). Operator norm consistent estimation of large-dimensional sparse covariance matrices., Ann. Statist. 36 , 6, 2717-2756. · Zbl 1196.62064 · doi:10.1214/07-AOS559
[12] Fan, J., Feng, Y., and Wu, Y. (2009). Network exploration via the adaptive Lasso and SCAD penalties., Ann. Appl. Stat. 3 , 2, 521-541. · Zbl 1166.62040 · doi:10.1214/08-AOAS215
[13] Friedman, J., Hastie, T., and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso., Biostatistics 9 , 3, 432-441. · Zbl 1143.62076 · doi:10.1093/biostatistics/kxm045
[14] Furrer, R. and Bengtsson, T. (2007). Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants., J. Multivariate Anal. 98 , 2, 227-255. · Zbl 1105.62091 · doi:10.1016/j.jmva.2006.08.003
[15] Huang, J., Liu, N., Pourahmadi, M., and Liu, L. (2006). Covariance matrix selection and estimation via penalised normal likelihood., Biometrika 93 , 1, 85-98. · Zbl 1152.62346 · doi:10.1093/biomet/93.1.85
[16] Johnstone, I. (2001). On the distribution of the largest eigenvalue in principal components analysis., Ann. Statist. 29 , 2, 295-327. · Zbl 1016.62078 · doi:10.1214/aos/1009210544
[17] Johnstone, I. and Lu, A. (2004). Sparse principal components analysis. Tech. rep., Stanford University.
[18] Kalisch, M. and Bühlmann, P. (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm., J. Mach. Learn. Res. 8 , 613-636. · Zbl 1222.68229
[19] Lam, C. and Fan, J. (2009). Sparsistency and rates of convergence in large covariance matrix estimation., Ann. Statist. 37 , 6B, 4254-4278. · Zbl 1191.62101 · doi:10.1214/09-AOS720
[20] Lauritzen, S. L. (1996)., Graphical Models. Oxford University Press, New York. · Zbl 0907.62001
[21] Ledoit, O. and Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices., J. Multivariate Anal. 88 , 2, 365-411. · Zbl 1032.62050 · doi:10.1016/S0047-259X(03)00096-4
[22] Levina, E., Rothman, A., and Zhu, J. (2008). Sparse estimation of large covariance matrices via a nested lasso penalty., Ann. Appl. Stat. 2 , 1, 245-263. · Zbl 1137.62338 · doi:10.1214/07-AOAS139
[23] Massart, P. (2007)., Concentration Inequalities and Model Selection, École d’été de probabilités de Saint-Flour XXXIII. Lecture Notes in Mathematics, Vol. 1896. Springer-Verlag. · Zbl 1170.60006
[24] McQuarrie, A. D. R. and Tsai, C.-L. (1998)., Regression and Time Series Model Selection. World Scientific. · Zbl 0907.62095
[25] Meinshausen, N. and Bühlmann, P. (2010). Stability selection., J. R. Stat. Soc. Ser. B Stat. Methodol. 72 , 4, 417-473. · doi:10.1111/j.1467-9868.2010.00740.x
[26] von Rosen, D. (1988). Moments for the inverted Wishart distribution., Scand. J. Statist. 15 , 2, 97-109. · Zbl 0663.62063
[27] Rothman, A., Bickel, P., Levina, E., and Zhu, J. (2008). Sparse permutation invariant covariance estimation., Electron. J. Stat. 2 , 494-515. · Zbl 1320.62135 · doi:10.1214/08-EJS176
[28] Verzelen, N. (2009). Technical Appendix to “Adaptive estimation of covariance matrices via Cholesky decomposition”.
[29] Verzelen, N. (2010). High-dimensional Gaussian model selection on a Gaussian design., Ann. Inst. H. Poincaré Probab. Statist. 46 , 2, 480-524. · Zbl 1191.62076 · doi:10.1214/09-AIHP321
[30] Wagaman, A. and Levina, E. (2009). Discovering sparse covariance structures with the Isomap., Journal of Computational and Graphical Statistics 18 , 3, 551-572. · doi:10.1198/jcgs.2009.08021
[31] Wu, W. B. and Pourahmadi, M. (2003). Nonparametric estimation of large covariance matrices of longitudinal data., Biometrika 90 , 4, 831-844. · Zbl 1436.62347 · doi:10.1093/biomet/90.4.831
[32] Yu, B. (1997). Assouad, Fano, and Le Cam. In Festschrift for Lucien Le Cam. Springer, New York, 423-435. · Zbl 0896.62032 · doi:10.1007/978-1-4612-1880-7_29
[33] Yuan, M. and Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model., Biometrika 94 , 19-35. · Zbl 1142.62408 · doi:10.1093/biomet/asm018
[34] Zhang, C.-H. and Huang, J. (2008). The sparsity and bias of the LASSO selection in high-dimensional linear regression., Ann. Statist. 36 , 4, 1567-1594. · Zbl 1142.62044 · doi:10.1214/07-AOS520
[35] Zhao, P. and Yu, B. (2006). On model selection consistency of Lasso., J. Mach. Learn. Res. 7 , 2541-2563. · Zbl 1222.62008