Block clustering with Bernoulli mixture models: comparison of different approaches. (English) Zbl 1452.62444

Summary: The block, or simultaneous, clustering problem on a set of objects and a set of variables is embedded in the mixture model framework. Two algorithms have been developed: block EM, as part of the maximum likelihood and fuzzy approaches, and block CEM, as part of the classification maximum likelihood approach. A unified framework for obtaining different variants of block EM is proposed. These variants are studied and their performance is evaluated in comparison with block CEM, two-way EM and two-way CEM, i.e., EM and CEM applied separately to the two sets.
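To make the classification maximum likelihood idea concrete, the following is a minimal sketch of a hard-assignment, CEM-style alternating scheme for a Bernoulli block mixture on a binary matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names `block_cem_sketch` and `_m_step`, the random initialisation, the smoothing constant `eps`, and the stopping rule are all hypothetical choices made here.

```python
import numpy as np

def _m_step(X, Z, W, eps):
    """Re-estimate proportions and block Bernoulli parameters from hard labels."""
    n, d = X.shape
    g, m = Z.shape[1], W.shape[1]
    pi = (Z.sum(axis=0) + eps) / (n + g * eps)       # row-cluster proportions
    rho = (W.sum(axis=0) + eps) / (d + m * eps)      # column-cluster proportions
    sizes = np.outer(Z.sum(axis=0), W.sum(axis=0))   # number of cells per block
    # Smoothed fraction of ones per block; eps (an assumption) keeps it in (0, 1).
    alpha = (Z.T @ X @ W + eps) / (sizes + 2 * eps)
    return pi, rho, alpha

def block_cem_sketch(X, g, m, n_iter=20, seed=0, eps=1e-6):
    """Alternate hard row and column assignments for a binary matrix X."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    z = rng.integers(g, size=n)                      # random initial row labels
    w = rng.integers(m, size=d)                      # random initial column labels
    for _ in range(n_iter):
        Z, W = np.eye(g)[z], np.eye(m)[w]            # one-hot label matrices
        pi, rho, alpha = _m_step(X, Z, W, eps)
        # Row C-step: U[i, l] counts the ones of row i inside column cluster l.
        U = X @ W
        col_sizes = W.sum(axis=0)
        row_scores = (np.log(pi) + U @ np.log(alpha).T
                      + (col_sizes - U) @ np.log1p(-alpha).T)
        z_new = row_scores.argmax(axis=1)
        Z = np.eye(g)[z_new]
        pi, rho, alpha = _m_step(X, Z, W, eps)
        # Column C-step: V[j, k] counts the ones of column j inside row cluster k.
        V = X.T @ Z
        row_sizes = Z.sum(axis=0)
        col_scores = (np.log(rho) + V @ np.log(alpha)
                      + (row_sizes - V) @ np.log1p(-alpha))
        w_new = col_scores.argmax(axis=1)
        if np.array_equal(z_new, z) and np.array_equal(w_new, w):
            break                                    # labels are stable
        z, w = z_new, w_new
    Z, W = np.eye(g)[z], np.eye(m)[w]
    pi, rho, alpha = _m_step(X, Z, W, eps)
    return z, w, pi, rho, alpha
```

Calling `block_cem_sketch(X, g=2, m=2)` on a binary array `X` returns row labels, column labels, the two sets of proportions, and the matrix of block parameters. Like any scheme of this kind, the result depends on the initial labels, so several random starts (different `seed` values) would normally be compared.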


62H30 Classification and discrimination; cluster analysis (statistical aspects)
62-08 Computational methods for problems pertaining to statistics

