×

Estimation in high-dimensional linear models with deterministic design matrices. (English) Zbl 1273.62177

Summary: Because of the advance in technologies, modern statistical studies often encounter linear models with the number of explanatory variables much larger than the sample size. Estimation and variable selection in these high-dimensional problems with deterministic design points is very different from those in the case of random covariates, due to the identifiability of the high-dimensional regression parameter vector. We show that a reasonable approach is to focus on the projection of the regression parameter vector onto the linear space generated by the design matrix. In this work, we consider the ridge regression estimator of the projection vector and propose to threshold the ridge regression estimator when the projection vector is sparse in the sense that many of its components are small. The proposed estimator has an explicit form and is easy to use in application. Asymptotic properties such as the consistency of variable selection and estimation and the convergence rate of the prediction mean squared error are established under some sparsity conditions on the projection vector. A simulation study is also conducted to examine the performance of the proposed estimator.

MSC:

62J07 Ridge regression; shrinkage estimators (Lasso)
62F12 Asymptotic properties of parametric estimators
62G20 Asymptotic properties of nonparametric inference
62J05 Linear regression; mixed models
PDF BibTeX XML Cite
Full Text: DOI arXiv Euclid

References:

[1] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348-1360. · Zbl 1073.62547
[2] Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B Stat. Methodol. 70 849-911.
[3] Fan, J. and Lv, J. (2010). A selective overview of variable selection in high dimensional feature space. Statist. Sinica 20 101-148. · Zbl 1180.62080
[4] Fan, J. and Peng, H. (2004). Nonconcave penalized likelihood with a diverging number of parameters. Ann. Statist. 32 928-961. · Zbl 1092.62031
[5] Hoerl, A. E. and Kennard, R. W. (1970). Ridge regression, biased estimation for nonorthogonal problems. Technometrics 12 55-67. · Zbl 0202.17205
[6] Hunter, D. R. and Li, R. (2005). Variable selection using MM algorithms. Ann. Statist. 33 1617-1642. · Zbl 1078.62028
[7] Lin, C. D., Mukerjee, R. and Tang, B. (2009). Construction of orthogonal and nearly orthogonal Latin hypercubes. Biometrika 96 243-247. · Zbl 1161.62044
[8] McKay, M. D., Beckman, R. J. and Conover, W. J. (1979). A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21 239-245. · Zbl 0415.62011
[9] Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. Ann. Statist. 34 1436-1462. · Zbl 1113.62082
[10] Meinshausen, N. and Yu, B. (2009). Lasso-type recovery of sparse representations for high-dimensional data. Ann. Statist. 37 246-270. · Zbl 1155.62050
[11] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267-288. · Zbl 0850.62538
[12] Wang, H. (2009). Forward regression for ultra-high dimensional variable screening. J. Amer. Statist. Assoc. 104 1512-1524. · Zbl 1205.62103
[13] Wang, H., Li, R. and Tsai, C.-L. (2007). Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika 94 553-568. · Zbl 1135.62058
[14] Whittle, P. (1960). Bounds for the moments of linear and quadratic forms in independent variables. Theory Probab. Appl. 5 302-305. · Zbl 0101.12003
[15] Zhang, C.-H. and Huang, J. (2008). The sparsity and bias of the LASSO selection in high-dimensional linear regression. Ann. Statist. 36 1567-1594. · Zbl 1142.62044
[16] Zhao, P. and Yu, B. (2006). On model selection consistency of Lasso. J. Mach. Learn. Res. 7 2541-2563. · Zbl 1222.62008
[17] Zou, H. (2006). The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101 1418-1429. · Zbl 1171.62326
[18] Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 301-320. · Zbl 1069.62054
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.