Variable selection in finite mixture of regression models. (English) Zbl 1469.62306

Summary: In the applications of finite mixture of regression (FMR) models, often many covariates are used, and their contributions to the response variable vary from one component to another of the mixture model. This creates a complex variable selection problem. Existing methods, such as the Akaike information criterion and the Bayes information criterion, are computationally expensive as the number of covariates and components in the mixture model increases. In this article we introduce a penalized likelihood approach for variable selection in FMR models. The new method introduces penalties that depend on the size of the regression coefficients and the mixture structure. The new method is shown to be consistent for variable selection. A data-adaptive method for selecting tuning parameters and an EM algorithm for efficient numerical computations are developed. Simulations show that the method performs very well and requires much less computing power than existing methods. The new method is illustrated by analyzing two real data sets.


62J07 Ridge regression; shrinkage estimators (Lasso)
62J05 Linear regression; mixed models
62F07 Statistical ranking and selection procedures
Full Text: DOI Link