Fuzzy clustering of spatial binary data. (English) Zbl 1274.62418

Summary: An iterative fuzzy clustering method is proposed to partition a set of multivariate binary observation vectors located at neighboring geographic sites. The method applies, in a binary setting, a recently proposed algorithm called neighborhood EM, which seeks a partition that is both well clustered in the feature space and spatially regular. This approach is derived from the EM algorithm applied to mixture models [A.P. Dempster, N.M. Laird and D.B. Rubin, J. R. Stat. Soc., Ser. B 39, 1–38 (1977; Zbl 0364.62022)], viewed as an alternating optimization method [R.J. Hathaway, Stat. Probab. Lett. 4, 53–56 (1986; Zbl 0585.62052)]. The criterion optimized by EM is penalized by a spatial smoothing term that favors classes having many neighbors. The resulting algorithm has a structure similar to EM, with an unchanged M-step and an iterative E-step. The criterion optimized by neighborhood EM is closely related to a posterior distribution with a multilevel logistic Markov random field as prior. The application of this approach to binary data relies on a mixture of multivariate Bernoulli distributions. Experiments on simulated spatial binary data yield encouraging results.
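
The structure described in the summary (an unchanged M-step followed by an iterative E-step) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the spatial term adds beta times the summed memberships of neighboring sites to each class score, and that the E-step is iterated synchronously a fixed number of times. The function name nem_bernoulli and the parameters init, beta, n_iter, n_inner and eps are hypothetical.

```python
import math


def nem_bernoulli(X, neighbors, init, beta=1.0, n_iter=20, n_inner=5, eps=1e-6):
    """Sketch of neighborhood EM for a mixture of multivariate Bernoulli laws.

    X         : list of n binary vectors (lists of 0/1), one per site
    neighbors : neighbors[i] lists the indices of the sites adjacent to site i
    init      : n x K initial fuzzy memberships, each row summing to 1
    beta      : assumed weight of the spatial term (beta = 0 gives plain EM)
    """
    n, d, K = len(X), len(X[0]), len(init[0])
    c = [row[:] for row in init]
    for _outer in range(n_iter):
        # M-step, as in ordinary EM: membership-weighted proportions and
        # Bernoulli parameters, clipped away from 0 and 1 to keep logs finite.
        nk = [max(sum(c[i][k] for i in range(n)), eps) for k in range(K)]
        pi = [nk[k] / n for k in range(K)]
        alpha = [[min(max(sum(c[i][k] * X[i][j] for i in range(n)) / nk[k], eps),
                      1.0 - eps)
                  for j in range(d)] for k in range(K)]

        # Data term log(pi_k * f_k(x_i)); it stays fixed during the E-step.
        logp = [[math.log(pi[k])
                 + sum(math.log(alpha[k][j] if X[i][j] else 1.0 - alpha[k][j])
                       for j in range(d))
                 for k in range(K)] for i in range(n)]

        # Iterative E-step: each membership depends on the neighbors'
        # memberships, so the update is repeated (assumption: synchronously).
        for _inner in range(n_inner):
            new_c = []
            for i in range(n):
                score = [logp[i][k] + beta * sum(c[j][k] for j in neighbors[i])
                         for k in range(K)]
                top = max(score)  # subtract the maximum for numerical stability
                w = [math.exp(s - top) for s in score]
                total = sum(w)
                new_c.append([v / total for v in w])
            c = new_c
    return c, pi, alpha


# Toy usage: eight sites on a line, two spatial blocks, two slightly noisy sites.
X = [[1, 1, 0], [1, 1, 0], [1, 0, 0], [1, 1, 0],
     [0, 0, 1], [0, 0, 1], [0, 1, 1], [0, 0, 1]]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 7], [6]]
init = [[0.9, 0.1]] * 4 + [[0.1, 0.9]] * 4
memberships, proportions, params = nem_bernoulli(X, neighbors, init)
labels = [max(range(2), key=lambda k: row[k]) for row in memberships]
```

Setting beta to 0 removes the neighborhood term, so the loop reduces to ordinary EM for a Bernoulli mixture. Larger values of beta pull the memberships of adjacent sites toward the same class.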


62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H86 Multivariate analysis and fuzziness
62M40 Random fields; image analysis
65C60 Computational problems in statistics (MSC2010)


mixture models


[1] Ambroise C.: Approche probabiliste en classification automatique et contraintes de voisinage. PhD Thesis, Université de Technologie de Compiègne 1996
[2] Ambroise C., Dang M. V., Govaert G.: Clustering of spatial data by the EM algorithm. In: A. Soares, J. Gómez-Hernandez and R. Froidevaux (eds.), geoENV I - Geostatistics for Environmental Applications, Kluwer Academic Publishers 1997, pp. 493-504
[3] Ambroise C., Govaert G.: An iterative algorithm for spatial clustering, submitted. · Zbl 1029.62056
[4] Berry B. J. L.: Essay on Commodity Flows and the Spatial Structure of the Indian Economy. Research paper 111, Department of Geography, University of Chicago 1966
[5] Besag J. E.: Spatial analysis of dirty pictures. J. Roy. Statist. Soc. Ser. B 48 (1986), 259-302 · Zbl 0609.62150
[6] Bezdek J. C., Castelaz P. F.: Prototype classification and feature selection with fuzzy sets. IEEE Trans. Systems Man Cybernet. SMC-7 (1977), 2, 87-92 · Zbl 0359.68120
[7] Celeux G., Govaert G.: Clustering criteria for discrete data and latent class models. J. Classification 8 (1991), 157-176 · Zbl 0775.62150
[8] Chalmond B.: An iterative Gibbsian technique for reconstruction of m-ary images. Pattern Recognition 22 (1989), 6, 747-761
[9] Dempster A. P., Laird N. M., Rubin D. B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39 (1977), 1-38 · Zbl 0364.62022
[10] Geman S., Geman D.: Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Analysis Machine Intelligence PAMI-6 (1984), 721-741 · Zbl 0573.62030
[11] Govaert G.: Classification binaire et modèles. Rev. Statist. Appl. 38 (1990), 1, 67-81
[12] Hathaway R. J.: Another interpretation of the EM algorithm for mixture distributions. Statist. Probab. Lett. 4 (1986), 53-56 · Zbl 0585.62052
[13] Legendre P.: Constrained clustering. Develop. Numerical Ecology. NATO ASI Series G 14 (1987), 289-307