
Robust mixture modeling based on scale mixtures of skew-normal distributions. (English) Zbl 1284.62193

Summary: A flexible class of probability distributions is presented that is convenient for modeling data exhibiting skewness, discrepant observations and population heterogeneity. The elements of this family are convex linear combinations of densities that are scale mixtures of skew-normal distributions. An EM-type algorithm for maximum likelihood estimation is developed and the observed information matrix is obtained. These procedures are discussed with emphasis on finite mixtures of skew-normal, skew-\(t\), skew-slash and skew contaminated normal distributions. To examine the performance of the proposed methods, simulation studies are presented which show the advantage of this flexible class in clustering heterogeneous data and indicate that the maximum likelihood estimates obtained with the EM-type algorithm have good asymptotic properties. A real data set is analyzed to illustrate the usefulness of the proposed methodology.
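
The family described in the summary can be illustrated with a minimal sketch, which is not the authors' implementation. It evaluates a finite mixture of univariate skew-normal densities and draws from it. The sampler assumes the well-known stochastic representation of the skew-normal distribution (cf. Henze, 1986), and the skew-\(t\) draw assumes the usual scale-mixture construction with a Gamma(\(\nu/2\), \(\nu/2\)) mixing variable. All function names and the (loc, scale, shape) parameter layout are hypothetical choices made for this illustration.

```python
import math
import random


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def skew_normal_pdf(x, loc, scale, shape):
    """Density of SN(loc, scale^2, shape): (2/scale) * phi(z) * Phi(shape * z)."""
    z = (x - loc) / scale
    return 2.0 / scale * norm_pdf(z) * norm_cdf(shape * z)


def mixture_pdf(x, weights, components):
    """Convex combination of skew-normal densities.

    components is a list of (loc, scale, shape) tuples (assumed layout);
    weights are non-negative and sum to one.
    """
    return sum(w * skew_normal_pdf(x, *c) for w, c in zip(weights, components))


def sample_standard_skew_normal(rng, shape):
    """Draw Z ~ SN(0, 1, shape) as delta * |T0| + sqrt(1 - delta^2) * T1."""
    delta = shape / math.sqrt(1.0 + shape * shape)
    t0 = abs(rng.gauss(0.0, 1.0))
    t1 = rng.gauss(0.0, 1.0)
    return delta * t0 + math.sqrt(1.0 - delta * delta) * t1


def sample_component(rng, loc, scale, shape, nu=None):
    """Draw from one component.

    nu=None gives a skew-normal draw. Otherwise a skew-t draw is formed as
    loc + scale * Z / sqrt(U) with U ~ Gamma(shape=nu/2, rate=nu/2)
    (assumed parameterization).
    """
    z = sample_standard_skew_normal(rng, shape)
    if nu is None:
        return loc + scale * z
    # random.gammavariate takes (shape, scale), so rate nu/2 becomes scale 2/nu.
    u = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return loc + scale * z / math.sqrt(u)


def sample_mixture(rng, weights, components, n, nu=None):
    """Draw n values: pick a component by its weight, then sample from it."""
    draws = []
    for _ in range(n):
        comp = rng.choices(components, weights=weights, k=1)[0]
        draws.append(sample_component(rng, *comp, nu=nu))
    return draws
```

For example, `sample_mixture(random.Random(0), [0.4, 0.6], [(-2.0, 1.0, 3.0), (3.0, 0.5, -2.0)], 1000)` draws from a two-component skew-normal mixture, and passing `nu=4.0` switches every component to the heavier-tailed skew-\(t\) form, which is how such a family can accommodate discrepant observations.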

MSC:

62F35 Robustness and adaptive procedures (parametric inference)
62E15 Exact distribution theory in statistics
62-07 Data analysis (statistics) (MSC2010)

Software:

AS 136; sn

References:

[1] Akaike, H., A new look at the statistical model identification, IEEE transactions on automatic control, 19, 716-723, (1974) · Zbl 0314.62039
[2] Andrews, D.F.; Mallows, C.L., Scale mixtures of normal distributions, Journal of the royal statistical society, series B, 36, 99-102, (1974) · Zbl 0282.62017
[3] Arellano-Valle, R.B.; Branco, M.D.; Genton, M.G., A unified view on skewed distributions arising from selections, Canadian journal of statistics, 34, (2006) · Zbl 1121.60009
[4] Arnold, B.C.; Beaver, R.J.; Groeneveld, R.A.; Meeker, W.Q., The nontruncated marginal of a truncated bivariate normal distribution, Psychometrika, 58, 471-488, (1993) · Zbl 0794.62075
[5] Azzalini, A., A class of distributions which includes the normal ones, Scandinavian journal of statistics, 12, 171-178, (1985) · Zbl 0581.62014
[6] Azzalini, A., The skew-normal distribution and related multivariate families, Scandinavian journal of statistics, 32, 159-188, (2005) · Zbl 1091.62046
[7] Azzalini, A.; Capitanio, A., Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution, Journal of the royal statistical society, series B, 65, 367-389, (2003) · Zbl 1065.62094
[8] Bai, Z.D.; Krishnaiah, P.R.; Zhao, L.C., On rates of convergence of efficient detection criteria in signal processing with white noise, IEEE transactions on information theory, 35, 380-388, (1989) · Zbl 0677.94001
[9] Basford, K.E.; Greenway, D.R.; McLachlan, G.J.; Peel, D., Standard errors of fitted component means of normal mixtures, Computational statistics, 12, 1-17, (1997) · Zbl 0924.62055
[10] Biernacki, C.; Celeux, G.; Govaert, G., Assessing a mixture model for clustering with the integrated completed likelihood, IEEE transactions on pattern analysis and machine intelligence, 22, 719-725, (2000)
[11] Biernacki, C.; Celeux, G.; Govaert, G., Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models, Computational statistics & data analysis, 41, 561-575, (2003) · Zbl 1429.62235
[12] Böhning, D., Computer-assisted analysis of mixtures and applications, (2000), Chapman & Hall/CRC · Zbl 0951.62088
[13] Branco, M.D.; Dey, D.K., A general class of multivariate skew-elliptical distributions, Journal of multivariate analysis, 79, 99-113, (2001) · Zbl 0992.62047
[14] Dempster, A.P.; Laird, N.M.; Rubin, D.B., Maximum likelihood from incomplete data via the EM algorithm, Journal of the royal statistical society, series B, 39, 1-38, (1977) · Zbl 0364.62022
[15] Dias, J.G.; Wedel, M., An empirical comparison of EM, SEM and MCMC performance for problematic Gaussian mixture likelihoods, Statistics and computing, 14, 323-332, (2004)
[16] DiCiccio, T.J.; Monti, A.C., Inferential aspects of the skew exponential power distribution, Journal of the American statistical association, 99, 439-450, (2004) · Zbl 1117.62318
[17] Efron, B.; Tibshirani, R., Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy, Statistical science, 1, 54-75, (1986) · Zbl 0587.62082
[18] Frühwirth-Schnatter, S., Finite mixture and Markov switching models, (2006), Springer Verlag · Zbl 1108.62002
[19] Hartigan, J.A.; Wong, M.A., A k-means clustering algorithm, Applied statistics, 28, 100-108, (1979) · Zbl 0447.62062
[20] Henze, N., A probabilistic representation of the skew-normal distribution, Scandinavian journal of statistics, 13, 271-275, (1986) · Zbl 0648.62016
[21] Lachos, V.H.; Bolfarine, H.; Arellano-Valle, R.B.; Montenegro, L.C., Likelihood-based inference for multivariate skew-normal regression models, Communications in statistics - theory and methods, 36, 1769-1786, (2007) · Zbl 1124.62037
[22] Lachos, V.H.; Ghosh, P.; Arellano-Valle, R.B., Likelihood based inference for skew normal independent linear mixed models, Statistica sinica (in press), (2010) · Zbl 1186.62071
[23] Lin, T.I., Maximum likelihood estimation for multivariate skew normal mixture models, Journal of multivariate analysis, 100, 257-265, (2009) · Zbl 1152.62034
[24] Lin, T.I., Robust mixture modeling using multivariate skew t distributions, Statistics and computing, (2009)
[25] Lin, T.I.; Lee, J.C.; Hsieh, W.J., Robust mixture modelling using the skew t distribution, Statistics and computing, 17, 81-92, (2007)
[26] Lin, T.I.; Lee, J.C.; Yen, S.Y., Finite mixture modelling using the skew normal distribution, Statistica sinica, 17, 909-927, (2007) · Zbl 1133.62012
[27] Liu, C.; Rubin, D.B., The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence, Biometrika, 81, 633-648, (1994) · Zbl 0812.62028
[28] McLachlan, G.J.; Krishnan, T., The EM algorithm and extensions, (2008), John Wiley · Zbl 1165.62019
[29] McLachlan, G.J.; Peel, D., Robust cluster analysis via mixtures of multivariate t-distributions, (), 658-666
[30] McLachlan, G.J.; Peel, D., Finite mixture models, (2000), John Wiley and Sons
[31] Meng, X.L.; Rubin, D.B., Maximum likelihood estimation via the ECM algorithm: A general framework, Biometrika, 80, 267-278, (1993) · Zbl 0778.62022
[32] Nityasuddhi, D.; Böhning, D., Asymptotic properties of the EM algorithm estimate for normal mixture models with component specific variances, Computational statistics & data analysis, 41, 591-601, (2003) · Zbl 1429.62090
[33] Peel, D.; McLachlan, G.J., Robust mixture modelling using the t distribution, Statistics and computing, 10, 339-348, (2000)
[34] Schwarz, G., Estimating the dimension of a model, Annals of statistics, 6, 461-464, (1978) · Zbl 0379.62005
[35] Verbeke, G.; Lesaffre, E., A linear mixed-effects model with heterogeneity in the random-effects population, Journal of the American statistical association, 91, (1996) · Zbl 0870.62057
[36] Wang, J.; Genton, M.G., The multivariate skew-slash distribution, Journal of statistical planning and inference, 136, 209-220, (2006) · Zbl 1081.60013
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.