Inference for multivariate normal mixtures. (English) Zbl 1162.62052

Summary: Multivariate normal mixtures provide a flexible model for high-dimensional data. They are widely used in statistical genetics, statistical finance, and other disciplines. Due to the unboundedness of the likelihood function, classical likelihood-based methods, which may have nice practical properties, are inconsistent. We recommend a penalized likelihood method for estimating the mixing distribution. We show that the maximum penalized likelihood estimator is strongly consistent when the number of components has a known upper bound. We also explore a convenient EM-algorithm for computing the maximum penalized likelihood estimator. Extensive simulations are conducted to explore the effectiveness and the practical limitations of both the new method and the ratified maximum likelihood estimators. Guidelines are provided based on the simulation results.
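The summary does not state the form of the penalty, so the following is only a minimal sketch under a loudly stated assumption: the penalty is taken to be p_n(G) = -a_n * sum_j { tr(S_x Sigma_j^{-1}) + log|Sigma_j| }, where S_x is the sample covariance matrix and a_n = 1/n. With that assumed penalty, the EM M-step for each covariance has the closed form Sigma_j = (S_j + 2 a_n S_x) / (n_j + 2 a_n), which keeps every Sigma_j positive definite and so keeps the penalized likelihood bounded. The function names `log_gauss` and `penalized_em`, the random initialization, and the default `a_n` are all hypothetical choices made for illustration.

```python
import numpy as np

def log_gauss(X, mean, cov):
    """Log density of N(mean, cov) evaluated at each row of X."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)  # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (d * np.log(2 * np.pi) + log_det + np.sum(z**2, axis=0))

def penalized_em(X, k, a_n=None, n_iter=200, seed=0):
    """EM for a k-component normal mixture with an ASSUMED covariance penalty.

    Returns weights, means, covariances, and the penalized log-likelihood
    recorded at the start of every iteration.
    """
    n, d = X.shape
    a_n = 1.0 / n if a_n is None else a_n  # assumed penalty size
    S_x = np.cov(X, rowvar=False, bias=True).reshape(d, d)
    rng = np.random.default_rng(seed)
    means = X[rng.choice(n, size=k, replace=False)]  # random data points
    covs = np.array([S_x.copy() for _ in range(k)])
    weights = np.full(k, 1.0 / k)
    history = []
    for _ in range(n_iter):
        # E-step: posterior membership probabilities, computed in log space.
        log_p = np.column_stack([
            np.log(weights[j]) + log_gauss(X, means[j], covs[j])
            for j in range(k)
        ])
        log_mix = np.logaddexp.reduce(log_p, axis=1)
        penalty = -a_n * sum(
            np.trace(np.linalg.solve(c, S_x)) + np.linalg.slogdet(c)[1]
            for c in covs
        )
        history.append(log_mix.sum() + penalty)
        resp = np.exp(log_p - log_mix[:, None])
        # M-step: weights and means are the usual EM updates; only the
        # covariance update is changed by the assumed penalty.
        n_j = resp.sum(axis=0)
        weights = n_j / n
        means = (resp.T @ X) / n_j[:, None]
        for j in range(k):
            R = X - means[j]
            S_j = (resp[:, j, None] * R).T @ R
            covs[j] = (S_j + 2 * a_n * S_x) / (n_j[j] + 2 * a_n)
    return weights, means, covs, np.array(history)
```

A call such as `penalized_em(X, k=2)` on an (n, d) data array returns the fitted mixing weights, component means, and covariances. Because the M-step maximizes the penalized objective exactly under the assumed penalty, the recorded penalized log-likelihood should not decrease between iterations, which gives a simple sanity check. The sketch does not guard against a component whose total responsibility collapses to zero.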


62H12 Estimation in multivariate analysis
62F12 Asymptotic properties of parametric estimators
65C60 Computational problems in statistics (MSC2010)
62F10 Point estimation

