zbMATH — the first resource for mathematics

Robust multivariate mixture regression models with incomplete data. (English) Zbl 07191941
Summary: Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables while accounting for unobserved population heterogeneity. It is common to take multivariate normal distributions as the mixing components, but the resulting model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on multivariate \(t\) distributions offer a robust alternative. Missing data are inevitable in many situations, and parameter estimates can be biased if the missing values are not handled properly. In this paper, we propose a multivariate \(t\) mixture regression model with missing information to model heterogeneity in the regression function in the presence of outliers and missing values. Along with robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even in the presence of missing values. We also propose a multivariate \(t\) mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.
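For orientation, a finite mixture of multivariate \(t\) regressions is commonly written in the generic form below. This is a sketch in standard textbook notation, not necessarily the parameterization used in the paper; the symbols \(K\), \(\pi_k\), \(B_k\), \(\Sigma_k\), \(\nu_k\) and \(p\) are assumed here and do not appear in the summary.
\[
f(y \mid x) = \sum_{k=1}^{K} \pi_k \, t_p\bigl(y;\, B_k^{\top} x,\, \Sigma_k,\, \nu_k\bigr), \qquad \pi_k > 0, \quad \sum_{k=1}^{K} \pi_k = 1,
\]
\[
t_p(y;\, \mu, \Sigma, \nu) = \frac{\Gamma\bigl(\tfrac{\nu + p}{2}\bigr)}{\Gamma\bigl(\tfrac{\nu}{2}\bigr)\, (\nu\pi)^{p/2}\, |\Sigma|^{1/2}} \left(1 + \frac{\delta(y, \mu; \Sigma)}{\nu}\right)^{-\frac{\nu + p}{2}}, \qquad \delta(y, \mu; \Sigma) = (y - \mu)^{\top} \Sigma^{-1} (y - \mu).
\]
Here \(y\) is the \(p\)-dimensional response, \(x\) the predictor vector, \(B_k\) the coefficient matrix of component \(k\), and \(\nu_k\) its degrees of freedom. In the usual EM treatment with fully observed responses, an observation assigned to component \(k\) receives the weight \((\nu_k + p)/(\nu_k + \delta)\), which decreases as the Mahalanobis distance \(\delta\) grows; this is the mechanism by which \(t\) components downweight outliers relative to normal components.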
MSC: 62 Statistics
Software: Algorithm 39