
Inference without compatibility: using exponential weighting for inference on a parameter of a linear model. (English) Zbl 1473.62251

The authors consider a partially linear model \(Y = X\beta + \mu + \epsilon\) with an \(n\)-dimensional response \(Y\) and a \(q\)-dimensional parameter \(\beta\). The parameter \(\beta\) is assumed to be low-dimensional, \(q < n\), while the random nuisance \(\mu\) is modeled as \(\mu = Z\gamma\) with a sparse \(p\)-dimensional vector \(\gamma\), where \(p\) is high-dimensional, \(p > n\). The aim is inference on \(\beta\), on the signal strength \(\sigma^2_\mu\), and on the noise variance \(\sigma^2_\epsilon\). Since the Lasso (least absolute shrinkage and selection operator) usually forms the basis for such procedures, certain assumptions must be made to ensure good properties; in particular, the so-called compatibility condition has seemed indispensable. The main contribution of this paper is a proof that the compatibility condition is not necessary for this statistical problem: the authors construct \(\sqrt{n}\)-consistent estimators for \(\beta\), \(\sigma^2_\mu\) and \(\sigma^2_\epsilon\). Their approach uses exponential weighting to aggregate over all models of a given size. Exact aggregation is computationally hard but can be approximated well; to this end, the authors present an algorithm together with simulation results that allow comparison with other estimators.
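To make the aggregation step concrete, the following is a minimal Python sketch, not the authors' actual procedure: it enumerates every size-\(s\) submodel of \(Z\), fits ordinary least squares on the combined design \([X, Z_m]\), and averages the resulting estimates of \(\beta\) with weights exponential in the residual sum of squares. The function name, the temperature parameter `alpha`, and the plug-in value of \(\sigma^2_\epsilon\) are illustrative assumptions, and exact enumeration is feasible only for very small \(p\), which is precisely why the paper's algorithm approximates this aggregation.

```python
# A minimal sketch of exponential weighting over submodels -- not the
# authors' algorithm. Exact enumeration is only feasible for small p;
# the paper approximates this aggregation. The temperature `alpha` and
# the plug-in noise variance `sigma2` are illustrative assumptions.
import itertools
import numpy as np

def exp_weight_beta(Y, X, Z, s, alpha=4.0, sigma2=1.0):
    """Aggregate OLS estimates of beta over all size-s submodels of Z,
    weighting each submodel m by exp(-RSS_m / (alpha * sigma2))."""
    n, q = X.shape
    betas, log_w = [], []
    for m in itertools.combinations(range(Z.shape[1]), s):
        D = np.hstack([X, Z[:, list(m)]])          # design [X, Z_m]
        coef, *_ = np.linalg.lstsq(D, Y, rcond=None)
        resid = Y - D @ coef
        betas.append(coef[:q])                     # first q entries: beta-hat
        log_w.append(-(resid @ resid) / (alpha * sigma2))
    log_w = np.asarray(log_w)
    w = np.exp(log_w - log_w.max())                # log-space softmax, avoids overflow
    return np.array(betas).T @ (w / w.sum())       # weighted average of beta-hats
```

For instance, with \(q = 1\), \(p = 10\) and \(s = 2\) there are only \(\binom{10}{2} = 45\) submodels, so the sum can be evaluated exactly; computing the weights in log space before normalizing keeps the softmax numerically stable when residual sums of squares differ by orders of magnitude.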

MSC:

62J07 Ridge regression; shrinkage estimators (Lasso)
62G05 Nonparametric estimation
62-08 Computational methods for problems pertaining to statistics
