A globally convergent algorithm for Lasso-penalized mixture of linear regression models. (English) Zbl 1469.62109

Summary: Variable selection is an old and pervasive problem in regression analysis. One solution is to impose a lasso penalty to shrink parameter estimates toward zero and perform continuous model selection. The lasso-penalized mixture of linear regressions model (L-MLR) is a class of regularization methods for the model selection problem in the setting of a fixed number of variables. A new algorithm is proposed for the maximum penalized-likelihood estimation of the L-MLR model. This algorithm is constructed via the minorization-maximization algorithm paradigm. Such a construction allows for coordinate-wise updates of the parameter components, and produces globally convergent sequences of estimates that generate monotonic sequences of penalized log-likelihood values. These three features are missing from the previously presented approximate expectation-maximization algorithms. The earlier difficulty in producing a globally convergent algorithm for the maximum penalized-likelihood estimation of the L-MLR model was due to the intractability of finding exact updates for the mixture model mixing proportions in the maximization step. This issue is resolved by showing that the update can be converted into a simple numerical root-finding problem that is proven to have a unique solution. The method is tested in simulation and with an application to Major League Baseball salary data from the 1990s and the present day, where the question of whether player salaries are associated with batting performance is investigated.
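
The reduction of the mixing-proportion update to a one-dimensional root-finding problem can be illustrated with a minimal sketch. It rests on an assumption that is not stated in the summary: that the update maximizes sum_k a_k*log(pi_k) - sum_k b_k*pi_k over the probability simplex, where a_k > 0 plays the role of summed posterior weights and b_k >= 0 is a penalty weight such as lambda times the l1-norm of the k-th coefficient vector. The function name, the arguments `a` and `b`, and the use of plain bisection are all hypothetical choices for illustration; the paper's exact formulation and solver may differ (Brent's method would be a natural substitute for bisection).

```python
def update_mixing_proportions(a, b, tol=1e-12, max_iter=200):
    """Maximize sum_k a_k*log(pi_k) - sum_k b_k*pi_k subject to sum_k pi_k = 1.

    a : positive weights (assumed form, e.g. summed posterior probabilities)
    b : non-negative penalty weights (assumed form, e.g. lam * ||beta_k||_1)

    Stationarity of the Lagrangian gives pi_k = a_k / (b_k + xi), where the
    multiplier xi solves g(xi) = sum_k a_k / (b_k + xi) - 1 = 0.
    """
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and of equal length")
    if any(ak <= 0 for ak in a) or any(bk < 0 for bk in b):
        raise ValueError("need a_k > 0 and b_k >= 0")

    def g(xi):
        return sum(ak / (bk + xi) for ak, bk in zip(a, b)) - 1.0

    # On (-min(b), inf) every term a_k / (b_k + xi) is strictly decreasing,
    # so g is strictly decreasing there and has exactly one root.
    lo = -min(b) + min(a)   # the term with smallest b_k is >= 1, so g(lo) >= 0
    hi = -min(b) + sum(a)   # each term is <= a_k / sum(a), so g(hi) <= 0

    # Plain bisection on the bracketing interval [lo, hi].
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, abs(mid)):
            break

    xi = 0.5 * (lo + hi)
    return [ak / (bk + xi) for ak, bk in zip(a, b)]


if __name__ == "__main__":
    # With no penalty the update reduces to normalizing the weights.
    print(update_mixing_proportions([30.0, 70.0], [0.0, 0.0]))
    # Penalizing only the first component lowers its proportion.
    print(update_mixing_proportions([30.0, 70.0], [5.0, 0.0]))
```

Under these assumptions, uniqueness of the root follows from the strict monotonicity of g on the interval where all proportions are positive, which is why a simple bracketing method suffices.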


MSC:
62-08 Computational methods for problems pertaining to statistics
62J05 Linear regression; mixed models
62J07 Ridge regression; shrinkage estimators (Lasso)
62P99 Applications of statistics


Software: R; BRENT; flexmix; MASS (R)
Full Text: DOI arXiv

