Latent block model for contingency table. (English) Zbl 1187.62117

Summary: Although many clustering procedures aim to construct an optimal partition of objects or, sometimes, of variables, other methods, called block clustering methods, consider the two sets simultaneously and organize the data into homogeneous blocks. Methods of this kind are of practical importance in a wide variety of applications, such as text and market basket data analysis. Typically, the data that arise in these applications are arranged as a two-way contingency table. Using Poisson distributions, a latent block model for such data is proposed, and several algorithms are derived for it under both the maximum likelihood approach and the classification maximum likelihood approach. Their performance is evaluated and compared with that of simply applying expectation maximization (EM) or classification EM (CEM) separately to the rows and columns of the contingency table.
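
The idea of clustering rows and columns simultaneously under a Poisson model can be illustrated with a minimal sketch. It is not the paper's algorithm: the summary does not state the parameterisation, so the sketch assumes that a cell (i, j) falling in block (k, l) is Poisson with mean (row total i) × (column total j) × gamma[k][l], that mixing proportions are equal, and that assignments are hard (in the spirit of CEM). The names `block_effects`, `reassign`, and `poisson_block_cem` are hypothetical.

```python
import math
import random


def block_effects(x, z, w, g, m, eps=1e-10):
    """Estimate gamma[k][l] = block count / (row-cluster total * column-cluster total).

    Assumed parameterisation (not taken from the summary); eps avoids log(0)
    and division by zero for empty blocks or clusters.
    """
    counts = [[0.0] * m for _ in range(g)]
    row_tot = [0.0] * g
    col_tot = [0.0] * m
    for i, row in enumerate(x):
        for j, v in enumerate(row):
            counts[z[i]][w[j]] += v
            row_tot[z[i]] += v
            col_tot[w[j]] += v
    gamma = [[(counts[k][l] + eps) / (row_tot[k] * col_tot[l] + eps)
              for l in range(m)] for k in range(g)]
    return gamma, row_tot, col_tot


def reassign(table, other_labels, gamma, other_tot):
    """Give each row of `table` the label with the highest Poisson log-likelihood.

    `other_labels` are the current labels of the other dimension, `other_tot`
    the totals of its clusters, and gamma[k][l] is indexed (own label, other label).
    """
    n_other = len(other_tot)
    labels = []
    for row in table:
        u = [0.0] * n_other                 # row counts aggregated per other-cluster
        for j, v in enumerate(row):
            u[other_labels[j]] += v
        total = sum(row)
        scores = [sum(u[l] * math.log(gk[l]) - total * other_tot[l] * gk[l]
                      for l in range(n_other)) for gk in gamma]
        labels.append(scores.index(max(scores)))
    return labels


def poisson_block_cem(x, g, m, n_iter=20, seed=0):
    """Alternate hard row and column reassignments from a random start."""
    rng = random.Random(seed)
    z = [rng.randrange(g) for _ in x]       # row labels
    w = [rng.randrange(m) for _ in x[0]]    # column labels
    xt = [list(col) for col in zip(*x)]     # transposed table for the column step
    for _ in range(n_iter):
        gamma, _rt, col_tot = block_effects(x, z, w, g, m)
        z = reassign(x, w, gamma, col_tot)
        gamma, row_tot, _ct = block_effects(x, z, w, g, m)
        gamma_t = [list(c) for c in zip(*gamma)]
        w = reassign(xt, z, gamma_t, row_tot)
    return z, w


# Usage: a small table with two visible blocks; labels depend on the random start.
table = [[9, 8, 0, 1], [7, 9, 1, 0], [0, 1, 8, 9], [1, 0, 9, 7]]
row_labels, col_labels = poisson_block_cem(table, g=2, m=2)
```

Like any hard-assignment scheme started at random, this sketch can stop in a poor local optimum or leave a cluster empty, so in practice one would typically run it from several seeds and keep the best partition.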

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H17 Contingency tables
65C60 Computational problems in statistics (MSC2010)
