
Improved variable selection with forward-lasso adaptive shrinkage. (English) Zbl 1220.62089

Summary: Recently, considerable interest has focused on variable selection methods in regression settings where the number of predictors, \(p\), is large relative to the number of observations, \(n\). Two commonly applied variable selection approaches are the Lasso, which computes highly shrunk regression coefficients, and Forward Selection, which uses no shrinkage. We propose a new approach, “Forward-Lasso Adaptive SHrinkage” (FLASH), which includes the Lasso and Forward Selection as special cases, and can be used in both the linear regression and the generalized linear model (GLM) settings. As with the Lasso and Forward Selection, FLASH iteratively adds one variable to the model in a hierarchical fashion but, unlike these methods, at each step adjusts the level of shrinkage so as to optimize the selection of the next variable. We first present FLASH in the linear regression setting and show that it can be fitted using a variant of the computationally efficient LARS algorithm. Then we extend FLASH to the GLM domain and demonstrate, through numerous simulations and real-world data sets, as well as some theoretical analysis, that FLASH generally outperforms many competing approaches.
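The key mechanism described above is a forward path whose step size is tuned at every stage, rather than fixed at the Lasso's heavy shrinkage or Forward Selection's none. The numpy sketch below illustrates that idea only; it is not the authors' LARS-based algorithm. Each step moves the coefficients a fraction delta of the way toward the active-set least-squares fit, so delta = 1 recovers Forward Selection and small delta mimics Lasso-style shrinkage. The function flash_sketch, the grid deltas, and the validation-set tuning are illustrative assumptions, not the paper's procedure.

```python
import numpy as np

def flash_sketch(X, y, X_val, y_val, n_steps=10,
                 deltas=np.linspace(0.1, 1.0, 10)):
    """FLASH-style forward path with a tunable shrinkage level per step.

    delta = 1 takes the full least-squares step (Forward Selection);
    small delta takes a heavily shrunk step (Lasso-like). Here delta is
    chosen per step on a validation set; the paper instead adjusts the
    shrinkage inside a LARS-type path algorithm.
    """
    n, p = X.shape
    beta = np.zeros(p)
    active = []
    for _ in range(min(n_steps, p)):
        resid = y - X @ beta
        corr = X.T @ resid
        corr[active] = 0.0                 # only inactive predictors compete
        j = int(np.argmax(np.abs(corr)))   # next variable to enter the model
        active.append(j)
        # full least-squares coefficients on the current active set
        coef = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
        beta_ls = np.zeros(p)
        beta_ls[active] = coef
        # choose this step's shrinkage level on held-out data
        def val_rss(d):
            b = beta + d * (beta_ls - beta)
            return np.sum((y_val - X_val @ b) ** 2)
        beta = beta + min(deltas, key=val_rss) * (beta_ls - beta)
    return beta, active

# Toy example: sparse truth with three active predictors.
rng = np.random.default_rng(0)
n, p = 100, 40
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
X = rng.standard_normal((n, p))
y = X @ beta_true + rng.standard_normal(n)
X_val = rng.standard_normal((n, p))
y_val = X_val @ beta_true + rng.standard_normal(n)

beta_hat, selected = flash_sketch(X, y, X_val, y_val, n_steps=5)
print(sorted(selected), np.round(beta_hat[selected], 2))
```

On a toy problem like this, the sketch typically picks up the three true predictors in its first steps, which illustrates why choosing the shrinkage level per step can improve on a single global level.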

MSC:

62J05 Linear regression; mixed models
62J12 Generalized linear models (logistic models)
65C60 Computational problems in statistics (MSC2010)
62J07 Ridge regression; shrinkage estimators (Lasso)

Software:

glmnet
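glmnet (Friedman, Hastie and Tibshirani, 2010; reference [6] below) fits the entire lasso regularization path for linear and generalized linear models by coordinate descent, and serves as a natural baseline for FLASH. As a hedged illustration of such a path, the sketch below uses scikit-learn's lasso_path in Python as a stand-in for the R package; the simulated data and all names here are illustrative and do not reflect glmnet's own API.

```python
import numpy as np
from sklearn.linear_model import lasso_path

# Simulated sparse regression problem.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 40))
beta_true = np.zeros(40)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + rng.standard_normal(100)

# Entire lasso path by coordinate descent (the glmnet strategy).
alphas, coefs, _ = lasso_path(X, y)

# Model size at a few penalty levels: shrinkage trades sparsity for fit.
for a, c in list(zip(alphas, coefs.T))[::25]:
    print(f"alpha={a:.3f}: {int((c != 0).sum())} nonzero coefficients")
```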

References:

[1] Candes, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when \(p\) is much larger than \(n\) (with discussion). Ann. Statist. 35 2313-2351. · Zbl 1139.62019 · doi:10.1214/009053606000001523
[2] Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression (with discussion). Ann. Statist. 32 407-451. · Zbl 1091.62054 · doi:10.1214/009053604000000067
[3] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348-1360. · Zbl 1073.62547 · doi:10.1198/016214501753382273
[4] Frank, I. E. and Friedman, J. H. (1993). A statistical view of some chemometrics regression tools. Technometrics 35 109-135. · Zbl 0775.62288 · doi:10.2307/1269656
[5] Friedman, J., Hastie, T., Hoefling, H. and Tibshirani, R. (2007). Pathwise coordinate optimization. Ann. Appl. Statist. 1 302-332. · Zbl 1378.90064 · doi:10.1214/07-AOAS131
[6] Friedman, J., Hastie, T. and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. J. Statist. Software 33 1.
[7] Huang, J., Ma, S. and Zhang, C.-H. (2008). Adaptive lasso for sparse high-dimensional regression models. Statist. Sinica 18 1603-1618. · Zbl 1255.62198
[8] Hwang, W., Zhang, H. and Ghosal, S. (2009). FIRST: Combining forward iterative selection and shrinkage in high dimensional sparse linear regression. Stat. Interface 2 341-348. · Zbl 1245.62086
[9] James, G. M. and Radchenko, P. (2009). A generalized Dantzig selector with shrinkage tuning. Biometrika 96 323-337. · Zbl 1163.62054 · doi:10.1093/biomet/asp013
[10] Meinshausen, N. (2007). Relaxed lasso. Comput. Statist. Data Anal. 52 374-393. · Zbl 1452.62522
[11] Park, M. and Hastie, T. (2007). An L1-regularization path algorithm for generalized linear models. J. Roy. Statist. Soc. Ser. B 69 659-677. · doi:10.1111/j.1467-9868.2007.00607.x
[12] Radchenko, P. and James, G. M. (2008). Variable inclusion and shrinkage algorithms. J. Amer. Statist. Assoc. 103 1304-1315. · Zbl 1205.62100 · doi:10.1198/016214508000000481
[13] Radchenko, P. and James, G. M. (2010). Supplement to “Forward-LASSO adaptive shrinkage.”
[14] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267-288. · Zbl 0850.62538
[15] Wainwright, M. J. (2009). Sharp thresholds for high-dimensional and noisy sparsity recovery using \(\ell_1\)-constrained quadratic programming (lasso). IEEE Trans. Inform. Theory 55 2183-2202. · Zbl 1367.62220 · doi:10.1109/TIT.2009.2016018
[16] Zhao, P. and Yu, B. (2006). On model selection consistency of lasso. J. Mach. Learn. Res. 7 2541-2563. · Zbl 1222.62008
[17] Zou, H. (2006). The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101 1418-1429. · Zbl 1171.62326 · doi:10.1198/016214506000000735
[18] Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. J. Roy. Statist. Soc. Ser. B 67 301-320. · Zbl 1069.62054 · doi:10.1111/j.1467-9868.2005.00503.x