Active semi-supervised fuzzy clustering. (English) Zbl 1140.68461

Summary: Clustering algorithms are increasingly employed for the categorization of image databases, in order to provide users with database overviews and make their access more effective. By including information provided by the user, the categorization process can produce results that come closer to the user’s expectations. To make such a semi-supervised categorization approach acceptable to the user, this information must be of a very simple nature, and the amount of information the user is required to provide must be minimized. We propose here an effective semi-supervised clustering algorithm, Active Fuzzy Constrained Clustering (AFCC), that minimizes a competitive agglomeration cost function with fuzzy terms corresponding to pairwise constraints provided by the user. In order to minimize the number of constraints required, we define an active mechanism for the selection of candidate constraints. Comparisons performed on a simple benchmark and on a ground truth image database show that, with AFCC, clustering results can be significantly improved with few constraints, making this semi-supervised approach an attractive alternative for the categorization of image databases.
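The summary does not state the cost function, so the following is only an illustrative sketch of how "fuzzy terms corresponding to pairwise constraints" might be scored. It assumes a common formulation in which a must-link pair is penalized by the membership mass that places its two points in different clusters, and a cannot-link pair by the mass that places them in the same cluster. The function name `constraint_penalty` and its arguments are hypothetical and are not taken from the paper.

```python
def constraint_penalty(memberships, must_link, cannot_link):
    """Illustrative penalty for violated pairwise constraints (assumed form).

    memberships: list of rows, one per point; each row holds the fuzzy
                 membership of that point in every cluster.
    must_link / cannot_link: lists of (i, j) index pairs given by the user.
    """
    penalty = 0.0
    for i, j in must_link:
        ui, uj = memberships[i], memberships[j]
        # Mass assigning i and j to the same cluster.
        same = sum(a * b for a, b in zip(ui, uj))
        # Mass assigning i and j to different clusters: all pairs minus "same".
        penalty += sum(ui) * sum(uj) - same
    for i, j in cannot_link:
        ui, uj = memberships[i], memberships[j]
        # A cannot-link pair is violated when both points share a cluster.
        penalty += sum(a * b for a, b in zip(ui, uj))
    return penalty


# Three points, two clusters; points 0 and 1 sit in cluster 0, point 2 in cluster 1.
U = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
satisfied = constraint_penalty(U, must_link=[(0, 1)], cannot_link=[(0, 2)])
violated = constraint_penalty(U, must_link=[(0, 2)], cannot_link=[(0, 1)])
```

In this sketch the penalty is zero when every constraint is satisfied by crisp memberships and grows by one for each fully violated pair; a complete algorithm would add such a term to a clustering objective rather than use it alone.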


68T10 Pattern recognition, speech recognition
68P15 Database theory
68U10 Computing methodologies for image processing

