Robust mixture regression modeling based on scale mixtures of skew-normal distributions. (English) Zbl 1342.62113
Summary: Traditional estimation of mixture regression models assumes that the component errors are normal (and hence symmetric), so it is sensitive to outliers, heavy-tailed errors and asymmetric errors. In this work we propose to address these issues simultaneously in the context of mixture regression, extending the classical normal model by assuming that the random errors follow scale mixtures of skew-normal distributions. This approach allows us to model data with great flexibility, accommodating both skewness and heavy tails. The main virtue of considering mixture regression models under the class of scale mixtures of skew-normal distributions is that these distributions have a convenient hierarchical representation, which makes inference easy to implement. We develop a simple EM-type algorithm to perform maximum likelihood inference for the parameters of the proposed model. To examine the robustness of this flexible model against outlying observations, some simulation studies are also presented. Finally, a real data set is analyzed, illustrating the usefulness of the proposed method.
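The hierarchical representation mentioned in the summary can be illustrated with a minimal sketch. It uses the standard stochastic representation of a skew-normal variable, Z = δ|T0| + sqrt(1 − δ²) T1 with δ = λ / sqrt(1 + λ²) and T0, T1 independent standard normals, and divides by the square root of a positive mixing variable U to obtain a scale mixture. The function names, the parameter values, the uniform covariate and the choice of a gamma mixing variable (which gives a skew-t error) are assumptions made for illustration; this is not the authors' code, and it only simulates data rather than running their EM-type algorithm.

```python
import math
import random


def sample_smsn_error(rng, shape, nu=None):
    """Draw one error term via the hierarchical representation.

    shape: skewness parameter lambda (0 gives a symmetric error).
    nu:    if None, the mixing variable is fixed at 1 (skew-normal error);
           otherwise U ~ Gamma(nu/2, rate = nu/2), giving a skew-t error.
    """
    delta = shape / math.sqrt(1.0 + shape ** 2)
    t0 = abs(rng.gauss(0.0, 1.0))  # half-normal latent variable
    t1 = rng.gauss(0.0, 1.0)       # independent standard normal
    z = delta * t0 + math.sqrt(1.0 - delta ** 2) * t1  # skew-normal draw
    if nu is None:
        return z
    # gammavariate takes (shape, scale); scale = 1 / rate = 2 / nu.
    u = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return z / math.sqrt(u)


def simulate_mixture_regression(rng, n, weights, betas, sigmas, shapes, nu=None):
    """Simulate (x, y, component) triples from a mixture of linear regressions.

    betas holds one (intercept, slope) pair per component. The covariate
    distribution (uniform on [-1, 1]) is an arbitrary illustrative choice.
    Errors have location zero, not mean zero, in this sketch.
    """
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        j = rng.choices(range(len(weights)), weights=weights)[0]
        intercept, slope = betas[j]
        error = sigmas[j] * sample_smsn_error(rng, shapes[j], nu)
        data.append((x, intercept + slope * x + error, j))
    return data


if __name__ == "__main__":
    rng = random.Random(2024)
    sample = simulate_mixture_regression(
        rng, n=5, weights=[0.5, 0.5],
        betas=[(0.0, 1.0), (2.0, -1.0)],
        sigmas=[1.0, 1.0], shapes=[3.0, -3.0], nu=4.0,
    )
    for x, y, j in sample:
        print(f"component={j} x={x:.3f} y={y:.3f}")
```

Because the latent variables T0 and U enter linearly once they are conditioned on, an EM-type algorithm can treat them, together with the component labels, as missing data. That is the sense in which such a representation tends to simplify inference.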

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62F10 Point estimation
62F35 Robustness and adaptive procedures (parametric inference)
62J05 Linear regression; mixed models