A mixture of generalized hyperbolic distributions. (English. French summary) Zbl 1320.62144

Summary: We introduce a mixture of generalized hyperbolic distributions as an alternative to the ubiquitous mixture of Gaussian distributions as well as their near relatives within which the mixture of multivariate \(t\)-distributions and the mixture of skew-\(t\) distributions predominate. The mathematical development of our mixture of generalized hyperbolic distributions model relies on its relationship with the generalized inverse Gaussian distribution. The latter is reviewed before our mixture models are presented along with details of the aforesaid reliance. Parameter estimation is outlined within the expectation-maximization framework before the clustering performance of our mixture models is illustrated via applications on simulated and real data. In particular, the ability of our models to recover parameters for data from underlying Gaussian and skew-\(t\) distributions is demonstrated. Finally, the role of generalized hyperbolic mixtures within the wider model-based clustering, classification, and density estimation literature is discussed.
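
For orientation, the reliance on the generalized inverse Gaussian (GIG) distribution mentioned in the summary can be sketched with the standard normal variance-mean mixture representation. The symbols below (\(\lambda\), \(\chi\), \(\psi\), \(\boldsymbol{\mu}\), \(\boldsymbol{\alpha}\), \(\boldsymbol{\Sigma}\), dimension \(p\)) are assumptions that follow common textbook usage; the paper's own parametrization may differ.

\[
f_{\mathrm{GIG}}(w \mid \lambda, \chi, \psi) = \frac{(\psi/\chi)^{\lambda/2}\, w^{\lambda-1}}{2 K_{\lambda}\!\left(\sqrt{\chi\psi}\right)} \exp\!\left\{-\frac{\psi w + \chi/w}{2}\right\}, \qquad w > 0,
\]
\[
\mathbf{X} = \boldsymbol{\mu} + W\boldsymbol{\alpha} + \sqrt{W}\,\mathbf{V}, \qquad W \sim \mathrm{GIG}(\lambda, \chi, \psi), \quad \mathbf{V} \sim \mathcal{N}_p(\mathbf{0}, \boldsymbol{\Sigma}), \quad W \text{ independent of } \mathbf{V}.
\]

Here \(K_{\lambda}\) denotes the modified Bessel function of the third kind. Under this representation, \(\mathbf{X} \mid W = w\) is Gaussian with mean \(\boldsymbol{\mu} + w\boldsymbol{\alpha}\) and covariance \(w\boldsymbol{\Sigma}\), and \(W \mid \mathbf{X} = \mathbf{x}\) is again GIG, which is what makes it natural to treat \(W\) as a latent variable within an expectation-maximization scheme.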

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

mixsmsn; QRM; mclust; S-PLUS; R
