Regularization in finite mixture of regression models with diverging number of parameters. (English) Zbl 1273.62254

Summary: Feature (variable) selection has become a fundamentally important problem in recent statistical literature. In applications, many variables are often introduced to reduce possible modeling biases, but the number of variables a model can accommodate is limited by the amount of data available. In other words, the number of variables considered depends on the sample size, which reflects the estimability of the parametric model. We consider the problem of feature selection in finite mixture of regression models when the number of parameters in the model can increase with the sample size. We propose a penalized likelihood approach for feature selection in these models. Under certain regularity conditions, our approach leads to consistent variable selection. We carry out extensive simulation studies to evaluate the performance of the proposed approach under controlled settings. We also apply the proposed method to two real data sets. The first is on telemonitoring of Parkinson’s disease (PD), where the problem concerns whether dysphonic features extracted from patients’ speech signals recorded at home can be used as surrogates to study PD severity and progression. The second is on breast cancer prognosis, in which one is interested in assessing whether cell nuclear features offer prognostic value for the long-term survival of breast cancer patients. Our analysis in each application revealed a mixture structure in the study population and uncovered a unique relationship between the features and the response variable in each mixture component.
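To make the phrase "penalized likelihood approach" concrete, the following is a minimal sketch of a penalized log-likelihood for a Gaussian finite mixture of regressions. It is an illustration under stated assumptions, not the paper's method: the summary does not specify the penalty, so an L1 (lasso-type) penalty weighted by the mixing proportions is assumed here, every coefficient is penalized (no separate intercept), and the names `fmr_loglik`, `penalized_loglik` and `lam` are hypothetical.

```python
import math


def normal_logpdf(y, mean, sigma):
    """Log-density of N(mean, sigma^2) evaluated at y."""
    z = (y - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def fmr_loglik(y, X, pis, betas, sigmas):
    """Log-likelihood of a K-component Gaussian mixture of regressions.

    y: list of responses; X: list of covariate rows;
    pis: K mixing proportions (positive, summing to 1);
    betas: K coefficient vectors; sigmas: K component standard deviations.
    """
    total = 0.0
    for yi, xi in zip(y, X):
        # Log of each weighted component density for this observation.
        terms = [
            math.log(pk) + normal_logpdf(yi, sum(b * x for b, x in zip(bk, xi)), sk)
            for pk, bk, sk in zip(pis, betas, sigmas)
        ]
        # Log-sum-exp over components for numerical stability.
        m = max(terms)
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total


def penalized_loglik(y, X, pis, betas, sigmas, lam):
    """Log-likelihood minus an ASSUMED L1 penalty, n * lam * sum_k pi_k * ||beta_k||_1."""
    n = len(y)
    penalty = n * lam * sum(
        pk * sum(abs(b) for b in bk) for pk, bk in zip(pis, betas)
    )
    return fmr_loglik(y, X, pis, betas, sigmas) - penalty
```

Maximizing such an objective over the mixing proportions, coefficients and variances shrinks small coefficients toward zero within each component, which is how a penalty of this kind performs feature selection. In practice the maximization would typically be done with an EM-type algorithm, which this sketch does not implement.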


62P10 Applications of statistics to biology and medical sciences; meta analysis
92C50 Medical applications (general)
65C60 Computational problems in statistics (MSC2010)

