High-dimensional simultaneous inference with the bootstrap. (English) Zbl 06833591

Summary: We propose a residual and wild bootstrap methodology for individual and simultaneous inference in high-dimensional linear models with possibly non-Gaussian and heteroscedastic errors. We establish asymptotic consistency for simultaneous inference for parameters in groups \(G\), where \(p \gg n\), \(s_0 = o(n^{1/2}/\{\log (p) \log (|G|)^{1/2}\})\) and \(\log (|G|) = o(n^{1/7})\), with \(p\) the number of variables, \(n\) the sample size and \(s_0\) the sparsity. The theory is complemented by many empirical results. Our proposed procedures are implemented in the R-package hdi [L. Meier et al., hdi: high-dimensional inference. R package version 0.1-6 (2016)].
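As an illustration of the wild bootstrap idea named in the summary, the following is a minimal sketch in Python with NumPy. It is not the paper's procedure and not the hdi implementation: it assumes a low-dimensional design (n > p) fitted by ordinary least squares instead of a de-sparsified Lasso, uses Rademacher multipliers, and does not studentize. The function names `wild_bootstrap_max_stat` and `simultaneous_intervals` are hypothetical. Simultaneous intervals are formed from the bootstrap distribution of the maximum absolute deviation over all coefficients.

```python
import numpy as np


def wild_bootstrap_max_stat(X, y, n_boot=500, seed=0):
    """Wild bootstrap for OLS (illustrative assumption: n > p, no Lasso).

    Returns the OLS estimate and n_boot bootstrap draws of
    max_j |beta*_j - beta_hat_j|.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    resid = resid - resid.mean()  # centring is a common choice, assumed here
    max_dev = np.empty(n_boot)
    for b in range(n_boot):
        # Rademacher multipliers keep each residual's magnitude, so
        # heteroscedasticity in the errors is preserved in the resamples.
        w = rng.choice([-1.0, 1.0], size=n)
        y_star = X @ beta_hat + w * resid
        beta_star, *_ = np.linalg.lstsq(X, y_star, rcond=None)
        max_dev[b] = np.max(np.abs(beta_star - beta_hat))
    return beta_hat, max_dev


def simultaneous_intervals(beta_hat, max_dev, level=0.95):
    """Equal-width simultaneous intervals from the max-deviation quantile."""
    c = np.quantile(max_dev, level)
    return np.column_stack([beta_hat - c, beta_hat + c])


if __name__ == "__main__":
    # Simulated data with heteroscedastic errors (all values are made up).
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5))
    beta = np.array([1.0, 0.0, -2.0, 0.0, 0.5])
    y = X @ beta + rng.normal(size=200) * (1.0 + np.abs(X[:, 0]))
    beta_hat, max_dev = wild_bootstrap_max_stat(X, y)
    print(simultaneous_intervals(beta_hat, max_dev))
```

A residual bootstrap variant would replace the multiplier step by resampling the centred residuals with replacement, which is appropriate only under homoscedastic errors.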

MSC:

62J07 Ridge regression; shrinkage estimators (Lasso)
62F40 Bootstrap, jackknife and other resampling methods

References:

[1] Belloni A, Chernozhukov V, Chetverikov D, Wei Y (2015a) Uniformly valid post-regularization confidence regions for many functional parameters in Z-estimation. Preprint arXiv:1512.07619 · Zbl 1173.62054
[2] Belloni, A; Chernozhukov, V; Kato, K, Uniform post-selection inference for least absolute deviation regression and other Z-estimation problems, Biometrika, 102, 77-94, (2015) · Zbl 1345.62049
[3] Bickel P, Klaassen C, Ritov Y, Wellner J (1998) Efficient and adaptive estimation for semiparametric models. Springer, Berlin · Zbl 0894.62005
[4] Breiman, L, Heuristics of instability and stabilization in model selection, Ann Stat, 24, 2350-2383, (1996) · Zbl 0867.62055
[5] Bühlmann, P, Statistical significance in high-dimensional linear models, Bernoulli, 19, 1212-1242, (2013) · Zbl 1273.62173
[6] Bühlmann P, van de Geer S (2011) Statistics for high-dimensional data: methods, theory and applications. Springer, Berlin · Zbl 1273.62015
[7] Bühlmann, P; van de Geer, S, High-dimensional inference in misspecified linear models, Electron J Stat, 9, 1449-1473, (2015) · Zbl 1327.62420
[8] Bühlmann, P; Kalisch, M; Meier, L, High-dimensional statistics with a view towards applications in biology, Annu Rev Stat Appl, 1, 255-278, (2014)
[9] Chatterjee, A; Lahiri, S, Bootstrapping lasso estimators, J Am Stat Assoc, 106, 608-625, (2011) · Zbl 1232.62088
[10] Chatterjee, A; Lahiri, S, Rates of convergence of the adaptive LASSO estimators to the oracle distribution and higher order refinements by the bootstrap, Ann Stat, 41, 1232-1259, (2013) · Zbl 1293.62153
[11] Chernozhukov, V; Chetverikov, D; Kato, K, Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors, Ann Stat, 41, 2786-2819, (2013) · Zbl 1292.62030
[12] Chernozhukov V, Chetverikov D, Kato K (2014) Central limit theorems and bootstrap in high dimensions. The Annals of Probability, to appear. Preprint arXiv:1412.3661 · Zbl 1377.60040
[13] Chernozhukov V, Hansen C, Spindler M (2016) hdm: high-dimensional metrics. Preprint arXiv:1608.00354
[14] Deng H, Zhang C-H (2017) Beyond Gaussian approximation: bootstrap in large scale simultaneous inference. unpublished work in progress · Zbl 1426.62183
[15] Dezeure, R; Bühlmann, P; Meier, L; Meinshausen, N, High-dimensional inference: confidence intervals, \(p\)-values and R-software hdi, Stat Sci, 30, 533-558, (2015) · Zbl 1426.62183
[16] Efron, B, Bootstrap methods: another look at the jackknife, Ann Stat, 7, 1-26, (1979) · Zbl 0406.62024
[17] Eicker F (1967) Limit theorems for regressions with unequal and dependent errors. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol 1, pp 59-82 · Zbl 0217.51201
[18] Foygel Barber, R; Candès, EJ, Controlling the false discovery rate via knockoffs, Ann Stat, 43, 2055-2085, (2015) · Zbl 1327.62082
[19] Freedman, DA, Bootstrapping regression models, Ann Stat, 9, 1218-1228, (1981) · Zbl 0449.62046
[20] Giné, E; Zinn, J, Necessary conditions for the bootstrap of the mean, Ann Stat, 17, 684-691, (1989) · Zbl 0672.62026
[21] Giné, E; Zinn, J, Bootstrapping general empirical measures, Ann Probab, 18, 851-869, (1990) · Zbl 0706.62017
[22] Hall, P; Wilson, SR, Two guidelines for bootstrap hypothesis testing, Biometrics, 47, 757-762, (1991)
[23] Huber PJ (1967) The behavior of maximum likelihood estimates under nonstandard conditions. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, vol 1, pp 221-233
[24] Javanmard, A; Montanari, A, Confidence intervals and hypothesis testing for high-dimensional regression, J Mach Learn Res, 15, 2869-2909, (2014) · Zbl 1319.62145
[25] Liu, RY; Singh, K, Efficiency and robustness in resampling, Ann Stat, 20, 370-384, (1992) · Zbl 0755.62038
[26] Liu, H; Yu, B, Asymptotic properties of Lasso+mLS and Lasso+Ridge in sparse high-dimensional linear regression, Electron J Stat, 7, 3124-3169, (2013) · Zbl 1281.62158
[27] Mammen, E, Bootstrap and wild bootstrap for high dimensional linear models, Ann Stat, 21, 255-285, (1993) · Zbl 0771.62032
[28] McKeague, IW; Qian, M, An adaptive resampling test for detecting the presence of significant predictors, J Am Stat Assoc, 110, 1422-1433, (2015) · Zbl 1373.62181
[29] Meier L, Dezeure R, Meinshausen N, Mächler M, Bühlmann P (2016) hdi: high-dimensional inference. R package version 0.1-6 · Zbl 1372.62023
[30] Meinshausen, N, Group bound: confidence intervals for groups of variables in sparse high dimensional regression without assumptions on the design, J R Stat Soc B, 77, 923-945, (2015)
[31] Meinshausen, N; Bühlmann, P, High-dimensional graphs and variable selection with the lasso, Ann Stat, 34, 1436-1462, (2006) · Zbl 1113.62082
[32] Meinshausen, N; Bühlmann, P, Stability selection (with discussion), J R Stat Soc B, 72, 417-473, (2010)
[33] Meinshausen, N; Meier, L; Bühlmann, P, P-values for high-dimensional regression, J Am Stat Assoc, 104, 1671-1681, (2009) · Zbl 1205.62089
[34] Meinshausen, N; Maathuis, MH; Bühlmann, P, Asymptotic optimality of the Westfall-Young permutation procedure for multiple testing under dependence, Ann Stat, 39, 3369-3391, (2011) · Zbl 1246.62124
[35] Reid, S; Tibshirani, R; Friedman, J, A study of error variance estimation in lasso regression, Stat Sinica, 26, 35-67, (2016) · Zbl 1372.62023
[36] Rudelson, M; Zhou, S, Reconstruction from anisotropic random measurements, IEEE Trans Inf Theory, 59, 3434-3447, (2013) · Zbl 1364.94158
[37] Shah, R; Samworth, R, Variable selection with error control: another look at stability selection, J R Stat Soc B, 75, 55-80, (2013)
[38] Shah R, Bühlmann P (2015) Goodness of fit tests for high-dimensional linear models. J R Stat Soc B. doi:10.1111/rssb.12234
[39] van de Geer, S; Bühlmann, P; Zhou, S, The adaptive and the thresholded lasso for potentially misspecified models (and a lower bound for the lasso), Electron J Stat, 5, 688-749, (2011) · Zbl 1274.62471
[40] van de Geer, S; Bühlmann, P; Ritov, Y; Dezeure, R, On asymptotically optimal confidence regions and tests for high-dimensional models, Ann Stat, 42, 1166-1202, (2014) · Zbl 1305.62259
[41] Wasserman, L; Roeder, K, High dimensional variable selection, Ann Stat, 37, 2178-2201, (2009) · Zbl 1173.62054
[42] Westfall P, Young S (1993) Resampling-based multiple testing: examples and methods for P-value adjustment. Wiley, Hoboken · Zbl 0850.62368
[43] White, H, A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity, Econometrica, 48, 817-838, (1980) · Zbl 0459.62051
[44] Wu, C-FJ, Jackknife, bootstrap and other resampling methods in regression analysis, Ann Stat, 14, 1261-1295, (1986) · Zbl 0618.62072
[45] Ye, F; Zhang, C-H, Rate minimaxity of the lasso and Dantzig selector for the \(\ell_q\) loss in \(\ell_r\) balls, J Mach Learn Res, 11, 3481-3502, (2010)
[46] Zhang, C-H; Huang, J, The sparsity and bias of the lasso selection in high-dimensional linear regression, Ann Stat, 36, 1567-1594, (2008) · Zbl 1142.62044
[47] Zhang, C-H; Zhang, SS, Confidence intervals for low dimensional parameters in high dimensional linear models, J R Stat Soc B, 76, 217-242, (2014)
[48] Zhang, X; Cheng, G, Simultaneous inference for high-dimensional linear models, J Am Stat Assoc, (2016)
[49] Zhou, Q, Monte Carlo simulation for lasso-type problems by estimator augmentation, J Am Stat Assoc, 109, 1495-1516, (2014) · Zbl 1368.62214
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.