## Multivariate mixture modeling using skew-normal independent distributions. (English) Zbl 1239.62058

Summary: We consider a flexible class of models whose elements are finite mixtures of multivariate skew-normal independent distributions. A general EM-type algorithm is employed for iteratively computing parameter estimates; it is discussed with emphasis on finite mixtures of skew-normal, skew-t, skew-slash and skew-contaminated normal distributions. Further, a general information-based method for approximating the asymptotic covariance matrix of the estimates is presented. The accuracy of the associated estimates and the efficiency of some information criteria are evaluated via simulation studies. Results from the analysis of artificial and real data sets are reported, illustrating the usefulness of the proposed methodology. The proposed EM-type algorithm and methods are implemented in the R package mixsmsn.
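To make the mixture setting concrete, the following is a minimal Python sketch of one ingredient of an EM-type iteration: the posterior membership probabilities under a finite mixture of multivariate skew-normal components. It is not the paper's algorithm and not the mixsmsn API. It assumes the density form $2\,\phi_p(y;\mu,\Sigma)\,\Phi(\lambda^\top \Sigma^{-1/2}(y-\mu))$, which may differ from the parameterization used in the paper, and the function names `skew_normal_pdf` and `responsibilities` as well as all parameter values are hypothetical.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm


def skew_normal_pdf(y, mu, sigma, lam):
    """Assumed density: 2 * phi_p(y; mu, Sigma) * Phi(lam' Sigma^{-1/2} (y - mu))."""
    y = np.atleast_2d(np.asarray(y, dtype=float))
    # Symmetric inverse square root of Sigma via its eigendecomposition.
    vals, vecs = np.linalg.eigh(sigma)
    sigma_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    z = (y - mu) @ sigma_inv_sqrt  # each row is Sigma^{-1/2} (y_i - mu)
    normal_part = np.atleast_1d(multivariate_normal.pdf(y, mean=mu, cov=sigma))
    return 2.0 * normal_part * norm.cdf(z @ lam)


def responsibilities(y, weights, components):
    """Posterior membership probabilities for each row of y.

    components is a list of (mu, sigma, lam) tuples, one per mixture component.
    """
    dens = np.column_stack([
        w * skew_normal_pdf(y, mu, sigma, lam)
        for w, (mu, sigma, lam) in zip(weights, components)
    ])
    return dens / dens.sum(axis=1, keepdims=True)


# Illustrative two-component mixture in two dimensions (made-up parameters).
components = [
    (np.array([0.0, 0.0]), np.eye(2), np.array([3.0, 0.0])),
    (np.array([4.0, 4.0]), np.array([[1.0, 0.3], [0.3, 1.0]]), np.array([0.0, -2.0])),
]
weights = np.array([0.6, 0.4])
y = np.array([[0.5, 0.2], [3.8, 4.1], [2.0, 2.0]])
resp = responsibilities(y, weights, components)
```

Setting the skewness vector `lam` to zero reduces each component to an ordinary multivariate normal, which gives a quick sanity check. A full EM-type iteration for the skew-normal independent family would need further conditional expectations that this sketch omits.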

### MSC:

62H05 Characterization and structure theory for multivariate probability distributions; copulas
62H12 Estimation in multivariate analysis
65C60 Computational problems in statistics (MSC2010)

### Software:

mixsmsn; UCI-ml; R; sn
