Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. (English) Zbl 1220.62095

Summary: A number of variable selection methods have been proposed involving nonconvex penalty functions. These methods, which include the smoothly clipped absolute deviation (SCAD) penalty and the minimax concave penalty (MCP), have been demonstrated to have attractive theoretical properties, but model fitting is not a straightforward task, and the resulting solutions may be unstable. We demonstrate the potential of coordinate descent algorithms for fitting these models, establishing theoretical convergence properties and demonstrating that they are significantly faster than competing approaches. In addition, we demonstrate the utility of convexity diagnostics to determine regions of the parameter space in which the objective function is locally convex, even though the penalty is not. Our simulation study and data examples indicate that nonconvex penalties like MCP and SCAD are worthwhile alternatives to the lasso in many applications. In particular, our numerical results suggest that MCP is the preferred approach among the three methods.
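As an illustration of the coordinate descent approach described in the summary, the following is a minimal sketch in Python for penalized least squares. It is not taken from the paper's text as reproduced here: the closed-form univariate updates are the ones commonly stated for MCP and SCAD, and they assume that each column of the design matrix is standardized so that its mean square is 1, that the response is centered, and that gamma > 1 for MCP and gamma > 2 for SCAD. The function names (`soft_threshold`, `mcp_update`, `scad_update`, `coordinate_descent`) are hypothetical and do not correspond to any software listed in this record.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator S(z, lam); this is the lasso update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def mcp_update(z, lam, gamma):
    """Univariate MCP solution (assumes gamma > 1, standardized column)."""
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z  # large coefficients are left unshrunk


def scad_update(z, lam, gamma):
    """Univariate SCAD solution (assumes gamma > 2, standardized column)."""
    if abs(z) <= 2.0 * lam:
        return soft_threshold(z, lam)
    if abs(z) <= gamma * lam:
        shrunk = soft_threshold(z, gamma * lam / (gamma - 1.0))
        return shrunk / (1.0 - 1.0 / (gamma - 1.0))
    return z  # large coefficients are left unshrunk


def coordinate_descent(X, y, lam, gamma, update, n_sweeps=100, tol=1e-8):
    """Cyclic coordinate descent for penalized least squares.

    X is a list of n rows with p entries each. Assumption: every column
    satisfies sum(x_ij ** 2) / n == 1 and y is centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X beta, kept up to date in place
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            # Unpenalized univariate solution for coordinate j.
            z = sum(X[i][j] * resid[i] for i in range(n)) / n + beta[j]
            new = update(z, lam, gamma)
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


# Toy orthogonal design: y = 2 * column_1 + 0.1 * column_2.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.1, 1.9, -1.9, -2.1]
beta_mcp = coordinate_descent(X, y, lam=0.5, gamma=3.0, update=mcp_update)
beta_lasso = coordinate_descent(
    X, y, lam=0.5, gamma=3.0, update=lambda z, lam, gamma: soft_threshold(z, lam)
)
```

On this toy design both penalties set the small coefficient to zero, but MCP leaves the large coefficient at about 2.0 while the lasso shrinks it to about 1.5. Each coordinate update costs O(n) because only the residual vector is refreshed, which is the usual reason such sweeps are cheap.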

MSC:

62J99 Linear inference, regression
62J20 Diagnostics, and linear inference and regression
65C60 Computational problems in statistics (MSC2010)

Keywords:

lasso; SCAD; MCP; optimization

Software:

glmnet; S+WAVELETS
