Model-based cluster and discriminant analysis with the MIXMOD software. (English) Zbl 1157.62431

Summary: The Mixture Modeling (MIXMOD) program fits mixture models to a given data set for the purposes of density estimation, clustering or discriminant analysis. A large variety of algorithms to estimate the mixture parameters are proposed (EM, Classification EM, Stochastic EM), and it is possible to combine these to yield different strategies for obtaining a sensible maximum for the likelihood (or complete-data likelihood) function. MIXMOD is currently intended to be used for multivariate Gaussian mixtures, and fourteen different Gaussian models can be distinguished according to different assumptions regarding the component variance matrix eigenvalue decomposition. Moreover, different information criteria for choosing a parsimonious model (the number of mixture components, for instance) are included, their suitability depending on the particular perspective (cluster analysis or discriminant analysis). Written in C++, MIXMOD is interfaced with SCILAB and MATLAB. The program, the statistical documentation and the user guide are available on the internet at the following address: http://www-math.univ-fcomte.fr/mixmod/index.php.
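The EM, Classification EM (CEM) and Stochastic EM (SEM) variants named in the summary differ only in how the posterior membership probabilities are used before the M-step. The following is a minimal Python sketch of that idea for a univariate Gaussian mixture, with a BIC value for comparing numbers of components. It is not MIXMOD's code or interface: the function names (normal_pdf, log_likelihood, fit_mixture, bic), the quantile-based starting values, the fixed iteration count and the variance floor min_var are all assumptions made for illustration, and the sketch has no protection against numerical underflow.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of the univariate normal distribution N(mean, var) at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Observed-data log-likelihood of the mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)))
        for x in data
    )


def fit_mixture(data, k, variant="em", n_iter=100, seed=0, min_var=1e-6):
    """Fit a k-component univariate Gaussian mixture (hypothetical helper).

    variant: "em" keeps soft posteriors, "cem" assigns each point to its most
    probable component, "sem" draws the assignment from the posteriors.
    """
    n = len(data)
    rng = random.Random(seed)
    ordered = sorted(data)
    # Assumed initialisation: evenly spaced sample quantiles as starting means.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in data) / n, min_var)
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            post = [d / total for d in dens]
            if variant == "cem":
                label = max(range(k), key=post.__getitem__)
            elif variant == "sem":
                label = rng.choices(range(k), weights=post)[0]
            else:
                label = None
            if label is not None:
                # C-step or S-step: replace the posteriors by a hard assignment.
                post = [1.0 if j == label else 0.0 for j in range(k)]
            resp.append(post)
        # M-step: weighted proportions, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            if nj > 0.0:  # an empty component keeps its previous mean and variance
                means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
                var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
                variances[j] = max(var, min_var)
    return weights, means, variances


def bic(data, weights, means, variances):
    """BIC in the 'smaller is better' form: -2 log L + (free parameters) * log n."""
    k = len(weights)
    n_params = 3 * k - 1  # (k - 1) proportions + k means + k variances
    return -2.0 * log_likelihood(data, weights, means, variances) + n_params * math.log(len(data))


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(6.0, 1.0) for _ in range(100)]
    for k in (1, 2, 3):
        params = fit_mixture(data, k)
        print(k, round(bic(data, *params), 1))
```

Calling fit_mixture with variant="cem" or variant="sem" runs the same loop with a hard or a randomly drawn assignment. Chaining such runs, for example a few SEM iterations followed by EM, is one way to read the combination of algorithms into strategies mentioned in the summary.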


62H30 Classification and discrimination; cluster analysis (statistical aspects)
62-04 Software, source code, etc. for problems pertaining to statistics
65C60 Computational problems in statistics (MSC2010)


Matlab; Mixmod; Scilab


[1] Banfield, J.D.; Raftery, A.E., Model-based Gaussian and non-Gaussian clustering, Biometrics, 49, 803-821, (1993) · Zbl 0794.62034
[2] Bensmail, H.; Celeux, G., Regularized Gaussian discriminant analysis through eigenvalue decomposition, J. amer. statist. assoc., 91, 2, 1743-1748, (1996) · Zbl 0885.62068
[3] Biernacki, C.; Govaert, G., Choosing models in model-based clustering and discriminant analysis, J. statist. comput. simulation, 64, 49-71, (1999) · Zbl 1156.62335
[4] Biernacki, C.; Celeux, G.; Govaert, G., An improvement of the NEC criterion for assessing the number of clusters in a mixture model, Pattern recognition lett., 20, 267-272, (1999) · Zbl 0933.68117
[5] Biernacki, C.; Celeux, G.; Govaert, G., Assessing a mixture model for clustering with the integrated completed likelihood, IEEE trans. pattern analysis and machine intelligence, 22, 7, 719-725, (2000)
[6] Biernacki, C.; Beninel, F.; Bretagnolle, V., A generalized discriminant rule when training population and test population differ on their descriptive parameters, Biometrics, 58, 2, 387-397, (2002) · Zbl 1210.62077
[7] Biernacki, C.; Celeux, G.; Govaert, G., Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models, Comput. statist. data anal., 41, 561-575, (2003) · Zbl 1429.62235
[8] Bozdogan, H., Choosing the number of component clusters in the mixture-model using a new informational complexity criterion of the inverse-Fisher information matrix, (), 40-54
[9] Celeux, G.; Diebolt, J., The SEM algorithm: a probabilistic teacher algorithm derived from the EM algorithm for the mixture problem, Comput. statist. quart., 2, 73-82, (1985)
[10] Celeux, G.; Govaert, G., A classification EM algorithm for clustering and two stochastic versions, Comput. statist. data anal., 14, 3, 315-332, (1992) · Zbl 0937.62605
[11] Celeux, G.; Govaert, G., Gaussian parsimonious clustering models, Pattern recognition, 28, 5, 781-793, (1995)
[12] Celeux, G.; Soromenho, G., An entropy criterion for assessing the number of clusters in a mixture model, J. classification, 13, 195-212, (1996) · Zbl 0861.62051
[13] Dempster, A.P.; Laird, N.M.; Rubin, D.B., Maximum likelihood from incomplete data via the EM algorithm (with discussion), J. roy. statist. soc. B, 39, 1-38, (1977) · Zbl 0364.62022
[14] Diday, E.; Govaert, G., Classification avec distance adaptative, C. R. acad. sci. Paris, Sér. A, 278, 993-995, (1974) · Zbl 0283.68065
[15] Fraley, C.; Raftery, A.E., How many clusters? Which clustering method? Answers via model-based cluster analysis, Comput. J., 41, 578-588, (1998) · Zbl 0920.68038
[16] Friedman, H.P.; Rubin, J., On some invariant criteria for grouping data, J. amer. statist. assoc., 62, 1159-1178, (1967)
[17] Kéribin, C., Consistent estimation of the order of mixture models, Sankhyā ser. A, 1, 49-66, (2000) · Zbl 1081.62516
[18] Maronna, R.; Jacovkis, P., Multivariate clustering procedure with variable metrics, Biometrics, 30, 499-505, (1974) · Zbl 0285.62036
[19] McLachlan, G.J., Discriminant analysis and statistical pattern recognition, (1992), Wiley New York
[20] McLachlan, G.J.; Krishnan, T., The EM algorithm and extensions, (1997), Wiley New York · Zbl 0882.62012
[21] McLachlan, G.J.; Peel, D., Finite mixture models, (2000), Wiley New York · Zbl 0963.62061
[22] Schroeder, A., Analyse d’un mélange de distributions de probabilité de même type, Rev. statist. appl., 24, 1, 39-62, (1976)
[23] Schwarz, G., Estimating the dimension of a model, Ann. statist., 6, 461-464, (1978) · Zbl 0379.62005
[24] Scott, A.J.; Symons, M.J., Clustering methods based on likelihood ratio criteria, Biometrics, 27, 387-397, (1971)
[25] Ward, J., Hierarchical grouping to optimize an objective function, J. amer. statist. assoc., 58, 236-244, (1963)