
Directional co-clustering. (English) Zbl 1474.62244

Summary: Co-clustering addresses the problem of simultaneously clustering both dimensions of a data matrix. When dealing with high-dimensional sparse data, co-clustering turns out to be more beneficial than one-sided clustering, even if one is interested in clustering along one dimension only. Aside from being high dimensional and sparse, some datasets, such as document-term matrices, exhibit directional characteristics, and it is useful to \(L_2\)-normalize such data so that it lies on the surface of a unit hypersphere. Popular co-clustering assumptions such as Gaussian or multinomial are inadequate for this type of data. In this paper, we extend the scope of co-clustering to directional data. We present the Diagonal Block Mixture of von Mises-Fisher distributions (dbmovMFs), a co-clustering model well suited for directional data lying on a unit hypersphere. By deriving the model parameter estimates under both the maximum likelihood (ML) and classification ML approaches, we develop a class of EM algorithms for fitting dbmovMFs to data. Extensive experiments on several real-world datasets confirm the advantage of our approach and demonstrate the effectiveness of our algorithms.
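The \(L_2\) normalization and hard-assignment (classification-ML) estimation sketched in the summary can be illustrated with a toy spherical clustering routine. This is only a one-sided, simplified special case of the dbmovMF model (equal mixing proportions, a shared concentration parameter, and no column partition); the function name and initialization scheme are illustrative choices, not the paper's algorithm:

```python
import numpy as np

def spherical_kmeans(X, k, n_iter=20, seed=0):
    """Cluster the L2-normalized rows of X by cosine similarity.

    Hard-assignment EM for a von Mises-Fisher mixture degenerates to this
    update when mixing weights are equal and concentrations are shared;
    the full dbmovMF model additionally infers a column partition.
    """
    rng = np.random.default_rng(seed)
    # Project the rows onto the unit hypersphere (directional representation).
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Farthest-point initialization keeps the initial directions well separated.
    centroids = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        sims = np.max(X @ np.array(centroids).T, axis=1)
        centroids.append(X[np.argmin(sims)])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # Assignment step: nearest mean direction under cosine similarity.
        labels = (X @ centroids.T).argmax(axis=1)
        # Update step: re-estimate each mean direction and renormalize it.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                m = members.sum(axis=0)
                centroids[j] = m / np.linalg.norm(m)
    return labels, centroids
```

On a small document-term-like matrix with two clearly separated row groups, the routine recovers the grouping; estimating per-block concentration parameters, as the paper does, requires the modified Bessel functions of the vMF density and is omitted here.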

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H11 Directional data; spatial statistics

References:

[1] Abramowitz M, Stegun IA (1964) Handbook of mathematical functions: with formulas, graphs, and mathematical tables, vol 55. Courier Corporation, North Chelmsford · Zbl 0171.38503
[2] Ailem, M.; Role, F.; Nadif, M., Graph modularity maximization as an effective method for co-clustering text data, Knowl Based Syst, 109, 160-173, (2016)
[3] Ailem, M.; Role, F.; Nadif, M., Model-based co-clustering for the effective handling of sparse data, Pattern Recognit, 72, 108-122, (2017)
[4] Ailem, M.; Role, F.; Nadif, M., Sparse poisson latent block model for document clustering, IEEE Trans Knowl Data Eng, 29, 1563-1576, (2017)
[5] Akaike, H.; Parzen, E. (ed.); Tanabe, K. (ed.); Kitagawa, G. (ed.), Information theory and an extension of the maximum likelihood principle, 199-213, (1998), New York
[6] Banerjee, A.; Dhillon, IS; Ghosh, J.; Sra, S., Clustering on the unit hypersphere using von Mises-Fisher distributions, J Mach Learn Res, 6, 1345-1382, (2005) · Zbl 1190.62116
[7] Banfield, JD; Raftery, AE, Model-based Gaussian and non-Gaussian clustering, Biometrics, 49, 803-821, (1993) · Zbl 0794.62034
[8] Biernacki, C.; Celeux, G.; Govaert, G., Assessing a mixture model for clustering with the integrated completed likelihood, IEEE TPAMI, 22, 719-725, (2000)
[9] Bock, HH; Tomassone, R. (ed.), Simultaneous clustering of objects and variables, 187-203, (1979), Le Chesnay
[10] Bock HH (1994) Information and entropy in cluster analysis. In: Bozdogan H et al (eds) Proceedings of the First US/Japan Conference on the Frontiers of Statistical Modeling: an informational approach. Springer, Dordrecht, pp 115-147
[11] Bozdogan, H., Akaike’s information criterion and recent developments in information complexity, J Math Psychol, 44, 62-91, (2000) · Zbl 1047.62501
[12] Celeux, G.; Diebolt, J., The SEM algorithm: a probabilistic teacher algorithm derived from the EM algorithm for the mixture problem, Comput Stat Q, 2, 73-82, (1985)
[13] Celeux, G.; Diebolt, J., A stochastic approximation type EM algorithm for the mixture problem, Stoch Int J Probab Stoch Process, 41, 119-134, (1992) · Zbl 0766.62050
[14] Celeux, G.; Govaert, G., A classification EM algorithm for clustering and two stochastic versions, Comput Stat Data Anal, 14, 315-332, (1992) · Zbl 0937.62605
[15] Dempster, AP; Laird, NM; Rubin, DB, Maximum likelihood from incomplete data via the EM algorithm, J R Stat Soc Ser B, 39, 1-38, (1977) · Zbl 0364.62022
[16] Deodhar, M.; Ghosh, J., Scoal: a framework for simultaneous co-clustering and learning from complex data, ACM Trans Knowl Discov Data, 4, 11, (2010)
[17] Dhillon IS (2001) Co-clustering documents and words using bipartite spectral graph partitioning. In: ACM SIGKDD, pp 269-274
[18] Dhillon, IS; Modha, DS, Concept decompositions for large sparse text data using clustering, Mach Learn, 42, 143-175, (2001) · Zbl 0970.68167
[19] Dhillon IS, Mallela S, Modha DS (2003) Information-theoretic co-clustering. In: ACM SIGKDD, pp 89-98. ACM
[20] Gopal S, Yang Y (2014) Von Mises-Fisher clustering models. In: ICML, pp 154-162
[21] Govaert, G., Simultaneous clustering of rows and columns, Control Cybern, 24, 437-458, (1995) · Zbl 0852.62055
[22] Govaert G, Nadif M (2013) Co-Clustering. Wiley, New York · Zbl 1416.62309
[23] Govaert G, Nadif M (2016) Mutual information, phi-squared and model-based co-clustering for contingency tables. Adv Data Anal Classif, pp 1-34
[24] Hanczar B, Nadif M (2010) Bagging for biclustering: application to microarray data. In: ECML/PKDD, pp 490-505
[25] Hartigan JA (1975) Clustering algorithms. Wiley, New York · Zbl 0372.62040
[26] Hubert, L.; Arabie, P., Comparing partitions, J Classif, 2, 193-218, (1985) · Zbl 0587.62128
[27] Labiod L, Nadif M (2011) Co-clustering for binary and categorical data with maximum modularity. In: ICDM, pp 1140-1145
[28] Laclau, C.; Nadif, M., Diagonal latent block model for binary data, Stat Comput, 27, 1145-1163, (2017) · Zbl 1505.62238
[29] Li T (2005) A general model for clustering binary data. In: SIGKDD, pp 188-197
[30] Madeira, SC; Oliveira, AL, Biclustering algorithms for biological data analysis: a survey, IEEE/ACM TCBB, 1, 24-45, (2004)
[31] Mardia KV, Jupp PE (2000) Directional statistics. Wiley series in probability and statistics. Wiley, New York · Zbl 0935.62065
[32] McLachlan G, Krishnan T (2007) The EM algorithm and extensions. Wiley, New York · Zbl 0882.62012
[33] McLachlan G, Peel D (2004) Finite mixture models. Wiley, New York · Zbl 0963.62061
[34] Nadif M, Govaert G (2010) Model-based co-clustering for continuous data. In: ICMLA, pp 175-180 · Zbl 1187.62117
[35] Reisinger J, Waters A, Silverthorn B, Mooney RJ (2010) Spherical topic models. In: ICML, pp 903-910
[36] Salah, A.; Nadif, M., Social regularized von Mises-Fisher mixture model for item recommendation, Data Min Knowl Discov, 31, 1-24, (2017) · Zbl 1411.68153
[37] Schwarz, G., Estimating the dimension of a model, Ann Stat, 6, 461-464, (1978) · Zbl 0379.62005
[38] Strehl, A.; Ghosh, J., Cluster ensembles—a knowledge reuse framework for combining multiple partitions, JMLR, 3, 583-617, (2003) · Zbl 1084.68759
[39] van Dijk B, van Rosmalen J, Paap R (2009) A Bayesian approach to two-mode clustering. Econometric Institute, Erasmus University Rotterdam, Report no EI 2009-06, pp 1-26
[40] Van Mechelen, I.; Bock, HH; De Boeck, P., Two-mode clustering methods: a structured overview, Stat Methods Med Res, 13, 363-394, (2004) · Zbl 1053.62078
[41] Vichi, M.; Borra, S. (ed.); Rocci, R. (ed.); Vichi, M. (ed.); Schader, M. (ed.), Double \(k\)-means clustering for simultaneous classification of objects and variables, 43-52, (2001), Berlin, Heidelberg
[42] Wyse, J.; Friel, N., Block clustering with collapsed latent block models, Stat Comput, 22, 415-428, (2012) · Zbl 1322.62046
[43] Zhong, S.; Ghosh, J., Generative model-based document clustering: a comparative study, Knowl Inf Syst, 8, 374-384, (2005)