
Perturbation bootstrap in adaptive Lasso. (English) Zbl 1420.62305

Summary: The adaptive Lasso (Alasso) was proposed by H. Zou [J. Am. Stat. Assoc. 101, No. 476, 1418–1429 (2006; Zbl 1171.62326)] as a modification of the Lasso for simultaneous variable selection and estimation of the parameters in a linear regression model. He established that, in certain fixed-dimensional settings, the Alasso estimator is variable-selection consistent as well as asymptotically normal in the components corresponding to the nonzero regression coefficients. In an influential paper [J. Am. Stat. Assoc. 106, No. 496, 1371–1382 (2011; Zbl 1323.62076)], J. Minnier et al. proposed a perturbation bootstrap method and established its distributional consistency for the Alasso estimator in the fixed-dimensional setting. In this paper, however, we show that this (naive) perturbation bootstrap fails to achieve second-order correctness in approximating the distribution of the Alasso estimator. We propose a modification of the perturbation bootstrap objective function and show that a suitably Studentized version of our modified perturbation bootstrap Alasso estimator achieves second-order correctness even when the dimension of the model is allowed to grow to infinity with the sample size. As a consequence, inferences based on the modified perturbation bootstrap are more accurate than those based on the oracle normal approximation. We present simulation studies demonstrating the good finite-sample properties of our modified perturbation bootstrap method, as well as an illustration of the method on a real data set.
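To fix ideas, the adaptive Lasso estimator of Zou (2006) is defined, for a tuning parameter $\lambda_n > 0$, an exponent $\gamma > 0$ and an initial (e.g., least-squares) estimator $\tilde{\beta}_n$, by
\[
\hat{\beta}_n = \operatorname*{arg\,min}_{t \in \mathbb{R}^p} \sum_{i=1}^{n} \bigl(y_i - x_i' t\bigr)^2 + \lambda_n \sum_{j=1}^{p} \frac{|t_j|}{|\tilde{\beta}_{j,n}|^{\gamma}}.
\]
A generic perturbation bootstrap replaces the least-squares loss above by the randomly reweighted loss $\sum_{i=1}^{n} G_i^{*}\,(y_i - x_i' t)^2$, where $G_1^{*}, \dots, G_n^{*}$ are i.i.d. non-negative random quantities with unit mean drawn independently of the data; the naive proposal of Minnier et al. and the authors' modified objective differ in how the penalty and the centering are handled, and the exact modified criterion and its Studentization are given in the paper.

The following sketch is illustrative only and is not the authors' algorithm: it fits the adaptive Lasso via the standard column-rescaling reduction to an ordinary Lasso and runs a naive perturbation-bootstrap loop. The use of scikit-learn, the fixed penalty level, the Exponential(1) weights and the re-use of the same tuning parameter across bootstrap replicates are assumptions for illustration; the paper's modified objective and Studentized pivot are not implemented here.

import numpy as np
from sklearn.linear_model import LinearRegression, Lasso

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.standard_normal((n, p))
beta = np.array([2.0, 0.0, -1.5, 0.0, 1.0])
y = X @ beta + rng.standard_normal(n)

def adaptive_lasso(X, y, lam, gamma=1.0):
    # Initial estimator (ordinary least squares here, purely for illustration).
    beta_init = LinearRegression(fit_intercept=False).fit(X, y).coef_
    w = np.abs(beta_init) ** gamma          # adaptive weights |beta_init_j|^gamma
    Xw = X * w                              # rescaling columns turns the weighted L1
    fit = Lasso(alpha=lam, fit_intercept=False).fit(Xw, y)  # penalty into a plain Lasso
    return fit.coef_ * w                    # map coefficients back to the original scale

beta_hat = adaptive_lasso(X, y, lam=0.1)

# Naive perturbation bootstrap: reweight the squared-error loss with i.i.d. mean-one
# Exponential(1) multipliers and re-fit.  Reweighting the loss is equivalent to
# rescaling the rows of (X, y) by sqrt(g); note that the initial estimator inside
# adaptive_lasso is recomputed on the perturbed data in this sketch.
B = 500
boot = np.empty((B, p))
for b in range(B):
    g = rng.exponential(scale=1.0, size=n)
    Xg = X * np.sqrt(g)[:, None]
    yg = y * np.sqrt(g)
    boot[b] = adaptive_lasso(Xg, yg, lam=0.1)

boot_sd = boot.std(axis=0)                  # crude bootstrap spread of each coefficient

The paper shows that such a naive perturbation scheme is not second-order correct for the Alasso and replaces it by a modified, suitably Studentized bootstrap criterion.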

MSC:

62J07 Ridge regression; shrinkage estimators (Lasso)
62G09 Nonparametric statistical resampling methods
62E20 Asymptotic distribution theory in statistics

Software:

hdi; glmnet

References:

[1] Bhattacharya, R. N. and Ghosh, J. K. (1978). On the validity of the formal Edgeworth expansion. Ann. Statist. 6 434-451. · Zbl 0396.62010 · doi:10.1214/aos/1176344134
[2] Bhattacharya, R. N. and Ranga Rao, R. (1986). Normal Approximation and Asymptotic Expansions. John Wiley & Sons, New York. · Zbl 0657.41001
[3] Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge Univ. Press, Cambridge. · Zbl 1058.90049
[4] Bühlmann, P., Kalisch, M. and Meier, L. (2014). High-dimensional statistics with a view towards applications in biology. Annu. Rev. Statist. Appl. 1 255-278.
[5] Camponovo, L. (2015). On the validity of the pairs bootstrap for Lasso estimators. Biometrika 102 981-987. · Zbl 1372.62021 · doi:10.1093/biomet/asv039
[6] Chatterjee, S. and Bose, A. (2005). Generalized bootstrap for estimating equations. Ann. Statist. 33 414-436. · Zbl 1065.62073 · doi:10.1214/009053604000000904
[7] Chatterjee, A. and Lahiri, S. N. (2010). Asymptotic properties of the residual bootstrap for Lasso estimators. Proc. Amer. Math. Soc. 138 4497-4509. · Zbl 1203.62014 · doi:10.1090/S0002-9939-2010-10474-4
[8] Chatterjee, A. and Lahiri, S. N. (2011). Bootstrapping Lasso estimators. J. Amer. Statist. Assoc. 106 608-625. · Zbl 1232.62088 · doi:10.1198/jasa.2011.tm10159
[9] Chatterjee, A. and Lahiri, S. N. (2013). Rates of convergence of the adaptive LASSO estimators to the oracle distribution and higher order refinements by the bootstrap. Ann. Statist. 41 1232-1259. · Zbl 1293.62153 · doi:10.1214/13-AOS1106
[10] Das, D., Gregory, K. and Lahiri, S. N. (2019). Supplement to “Perturbation bootstrap in adaptive Lasso.” DOI:10.1214/18-AOS1741SUPP. · Zbl 1420.62305
[11] Das, D. and Lahiri, S. N. (2019). Second order correctness of perturbation bootstrap M-estimator of multiple linear regression parameter. Bernoulli 25 654-682. · Zbl 1442.62091 · doi:10.3150/17-BEJ1001
[12] Dezeure, R., Bühlmann, P. and Zhang, C.-H. (2017). High-dimensional simultaneous inference with the bootstrap. TEST 26 685-719. Available at arXiv:1606.03940. · Zbl 06833591 · doi:10.1007/s11749-017-0554-2
[13] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348-1360. · Zbl 1073.62547 · doi:10.1198/016214501753382273
[14] Friedman, J., Hastie, T. and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33 1-22.
[15] Fuk, D. H. and Nagaev, S. V. (1971). Probabilistic inequalities for sums of independent random variables. Teor. Veroyatn. Primen. 16 660-675. · Zbl 0259.60024 · doi:10.1137/1116071
[16] Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer, New York. · Zbl 0744.62026
[17] Hall, P., Lee, Y. K., Park, B. U. and Paul, D. (2009). Tie-respecting bootstrap methods for estimating distributions of sets and functions of eigenvalues. Bernoulli 15 380-401. · Zbl 1200.62043 · doi:10.3150/08-BEJ154
[18] Jin, Z., Ying, Z. and Wei, L. J. (2001). A simple resampling method by perturbing the minimand. Biometrika 88 381-390. · Zbl 0984.62033 · doi:10.1093/biomet/88.2.381
[19] Knight, K. and Fu, W. (2000). Asymptotics for Lasso-type estimators. Ann. Statist. 28 1356-1378. · Zbl 1105.62357 · doi:10.1214/aos/1015957397
[20] Minnier, J., Tian, L. and Cai, T. (2011). A perturbation method for inference on regularized regression estimates. J. Amer. Statist. Assoc. 106 1371-1382. · Zbl 1323.62076 · doi:10.1198/jasa.2011.tm10382
[21] Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. Roy. Statist. Soc. Ser. B 58 267-288. · Zbl 0850.62538 · doi:10.1111/j.2517-6161.1996.tb02080.x
[22] Turnbull, H. W. (1930). A matrix form of Taylor’s theorem. Proc. Edinb. Math. Soc. (2) 33 33-54. · doi:10.1017/S0013091500007537
[23] van de Geer, S., Bühlmann, P., Ritov, Y. and Dezeure, R. (2014). On asymptotically optimal confidence regions and tests for high-dimensional models. Ann. Statist. 42 1166-1202. · Zbl 1305.62259 · doi:10.1214/14-AOS1221
[24] Wang, X. and Song, L. (2011). Adaptive Lasso variable selection for the accelerated failure models. Comm. Statist. Theory Methods 40 4372-4386. · Zbl 1239.62129 · doi:10.1080/03610926.2010.513785
[25] Zhang, C.-H. and Zhang, S. S. (2014). Confidence intervals for low dimensional parameters in high dimensional linear models. J. Roy. Statist. Soc. Ser. B 76 217-242. · Zbl 1411.62196
[26] Zhou, Q. M., Song, P. X.-K. and Thompson, M. E. (2012). Information ratio test for model misspecification in quasi-likelihood inference. J. Amer. Statist. Assoc. 107 205-213. · Zbl 1261.62052 · doi:10.1080/01621459.2011.645785
[27] Zou, H. (2006). The adaptive Lasso and its oracle properties. J. Amer. Statist. Assoc. 101 1418-1429. · Zbl 1171.62326 · doi:10.1198/016214506000000735