Honest confidence regions and optimality in high-dimensional precision matrix estimation. (English) Zbl 1368.62204

Summary: We propose methodology for estimation of sparse precision matrices and statistical inference for their low-dimensional parameters in a high-dimensional setting where the number of parameters \(p\) can be much larger than the sample size. We show that the novel estimator achieves minimax rates in supremum norm and the low-dimensional components of the estimator have a Gaussian limiting distribution. These results hold uniformly over the class of precision matrices with row sparsity of small order \(\sqrt{n}/\log p\) and spectrum uniformly bounded, under a sub-Gaussian tail assumption on the margins of the true underlying distribution. Consequently, our results lead to uniformly valid confidence regions for low-dimensional parameters of the precision matrix. Thresholding the estimator leads to variable selection without imposing irrepresentability conditions. The performance of the method is demonstrated in a simulation study and on real data.
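The pipeline described in the summary (sparse estimate, de-biasing step, confidence intervals, thresholding) can be sketched roughly as follows. This is a minimal illustration and not necessarily the paper's estimator: the summary gives no formulas, so the nodewise-Lasso construction, the de-biasing form `Theta + Theta.T - Theta @ Sigma_hat @ Theta.T`, the Gaussian variance plug-in, and all function names and tuning values below are assumptions made for the example.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n))*||y - X b||^2 + lam*||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()          # current residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * b[j]         # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]         # add the updated coordinate back
    return b


def nodewise_precision(X, lam):
    """Assumed initial estimator: regress each column on the others."""
    n, p = X.shape
    Theta = np.zeros((p, p))
    for j in range(p):
        others = [k for k in range(p) if k != j]
        gamma = lasso_cd(X[:, others], X[:, j], lam)
        resid = X[:, j] - X[:, others] @ gamma
        tau2 = resid @ resid / n + lam * np.abs(gamma).sum()
        Theta[j, j] = 1.0 / tau2
        Theta[j, others] = -gamma / tau2
    return Theta


def desparsified_precision(X, lam, z=1.96):
    """Return the de-biased estimate T and entrywise CI half-widths.

    z = 1.96 gives nominal 95% intervals; the variance formula
    Theta_ii * Theta_jj + Theta_ij^2 is the Gaussian-data case (assumption).
    """
    n, _ = X.shape
    Xc = X - X.mean(axis=0)             # center the columns
    Sigma_hat = Xc.T @ Xc / n           # sample covariance
    Theta = nodewise_precision(Xc, lam)
    T = Theta + Theta.T - Theta @ Sigma_hat @ Theta.T
    d = np.diag(Theta)
    var = np.outer(d, d) + Theta ** 2
    half = z * np.sqrt(var / n)
    return T, half


def threshold_support(T, tau):
    """Variable selection by hard-thresholding the off-diagonal entries."""
    S = np.abs(T) > tau
    np.fill_diagonal(S, False)
    return S
```

A call such as `T, half = desparsified_precision(X, lam=np.sqrt(np.log(p) / n))` yields intervals `T[i, j] ± half[i, j]`. The choice of `lam` and of the threshold `tau` here is illustrative only.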

MSC:

62J07 Ridge regression; shrinkage estimators (Lasso)
62H12 Estimation in multivariate analysis
62F12 Asymptotic properties of parametric estimators

Software:

glasso

References:

[1] Belloni A, Chernozhukov V, Hansen C (2014) Inference on treatment effects after selection amongst high-dimensional controls. Rev Econ Stud 81(2):608-650 · Zbl 1409.62142 · doi:10.1093/restud/rdt044
[2] Belloni A, Chernozhukov V, Wang L (2011) Square-root Lasso: Pivotal recovery of sparse signals via conic programming. Biometrika 98(4):791-806 · Zbl 1228.62083 · doi:10.1093/biomet/asr043
[3] Bickel PJ, Klaassen CA, Ritov Y, Wellner JA (1993) Efficient and adaptive estimation for semiparametric models. Springer, New York · Zbl 0786.62001
[4] Bickel PJ, Levina E (2008) Covariance regularization by thresholding. Ann Statist 36(6):2577-2604 · Zbl 1196.62062 · doi:10.1214/08-AOS600
[5] Bühlmann P, van de Geer S (2011) Statistics for high-dimensional data. Springer, New York · Zbl 1273.62015 · doi:10.1007/978-3-642-20192-9
[6] Cai T, Liu W, Luo X (2011) A constrained \(\ell_1\) minimization approach to sparse precision matrix estimation. J Am Statist Assoc 106:594-607 · Zbl 1232.62087 · doi:10.1198/jasa.2011.tm10155
[7] Candes E, Tao T (2007) The Dantzig selector: statistical estimation when p is much larger than n. Ann Statist 35(6):2313-2351 · Zbl 1139.62019 · doi:10.1214/009053606000001523
[8] Chatterjee A, Lahiri SN (2011) Bootstrapping lasso estimators. J Am Statist Assoc 106(494):608-625 · Zbl 1232.62088 · doi:10.1198/jasa.2011.tm10159
[9] Chatterjee A, Lahiri SN (2013) Rates of convergence of the adaptive LASSO estimators to the oracle distribution and higher order refinements by the bootstrap. Ann Statist 41(3) · Zbl 1293.62153
[10] Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. Ann Statist 32(2):407-451 · Zbl 1091.62054 · doi:10.1214/009053604000000067
[11] Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9:432-441 · Zbl 1143.62076 · doi:10.1093/biostatistics/kxm045
[12] Janková J, van de Geer S (2015) Confidence intervals for high-dimensional inverse covariance estimation. Electron J Statist 9:1205-1229 · Zbl 1328.62458 · doi:10.1214/15-EJS1031
[13] Javanmard A, Montanari A (2013) Model selection for high-dimensional regression under the generalized irrepresentability condition. In: Burges C, Bottou L, Welling M, Ghahramani Z, Weinberger K (eds) Advances in neural information processing systems 26:3012-3020
[14] Javanmard A, Montanari A (2014) Confidence intervals and hypothesis testing for high-dimensional regression. J Mach Learn Res 15(1):2869-2909 · Zbl 1319.62145
[15] Knight K, Fu W (2000) Asymptotics for lasso-type estimators. Ann Statist 28(5):1356-1378 · Zbl 1105.62357 · doi:10.1214/aos/1015957397
[16] Lauritzen SL (1996) Graphical models. Clarendon Press, Oxford · Zbl 0907.62001
[17] Li KC (1989) Honest confidence regions for nonparametric regression. Ann Statist 17(3):1001-1008 · Zbl 0681.62047 · doi:10.1214/aos/1176347253
[18] Mazumder R, Hastie T (2012) The Graphical Lasso: New Insights and Alternatives. Electron J Statist, pp 2125-2149 · Zbl 1295.62066
[19] Meinshausen N, Bühlmann P (2006) High-dimensional graphs and variable selection with the lasso. Ann Statist 34(3):1436-1462 · Zbl 1113.62082 · doi:10.1214/009053606000000281
[20] Ng B, Varoquaux G, Poline J-B, Thirion B (2013) A novel sparse group Gaussian graphical model for functional connectivity estimation. Information Processing in Medical Imaging
[21] Ravikumar P, Raskutti G, Wainwright MJ, Yu B (2008) High-dimensional covariance estimation by minimizing \(\ell_1\)-penalized log-determinant divergence. Electron J Statist 5:935-980 · Zbl 1274.62190 · doi:10.1214/11-EJS631
[22] Ren Z, Sun T, Zhang C-H, Zhou HH (2015) Asymptotic normality and optimalities in estimation of large Gaussian graphical models. Ann Statist 43(3):991-1026 · Zbl 1328.62342 · doi:10.1214/14-AOS1286
[23] Rothman AJ, Bickel PJ, Levina E, Zhu J (2008) Sparse permutation invariant covariance estimation. Electron J Statist 2:494-515 · Zbl 1320.62135 · doi:10.1214/08-EJS176
[24] Sun T, Zhang C-H (2012) Sparse matrix inversion with scaled Lasso. J Mach Learn Res 14:3385-3418 · Zbl 1318.62184
[25] Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B Stat Methodol 58:267-288 · Zbl 0850.62538
[26] van de Geer S (2016) Worst possible sub-directions in high-dimensional models. J Multi Anal 146:248-260 · Zbl 1334.62133 · doi:10.1016/j.jmva.2015.09.018
[27] van de Geer S, Bühlmann P, Ritov Y, Dezeure R (2013) On asymptotically optimal confidence regions and tests for high-dimensional models. Ann Statist 42(3):1166-1202 · Zbl 1305.62259 · doi:10.1214/14-AOS1221
[28] van der Vaart A (2000) Asymptotic statistics. Cambridge University Press, Cambridge · Zbl 0910.62001
[29] Yuan M (2010) High dimensional inverse covariance matrix estimation via linear programming. J Mach Learn Res 11:2261-2286 · Zbl 1242.62043
[30] Yuan M, Lin Y (2007) Model selection and estimation in the Gaussian graphical model. Biometrika, page 117 · Zbl 1142.62408
[31] Zhang C-H, Zhang SS (2014) Confidence intervals for low-dimensional parameters in high-dimensional linear models. J R Stat Soc Ser B Stat Methodol 76:217-242 · Zbl 1411.62196 · doi:10.1111/rssb.12026
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.