×

A tree-structured framework for purifying “complex” clusters with structural roles of individual data. (English) Zbl 1213.68523

Summary: How can we find a natural clustering of a “complex” dataset, which may contain an unknown number of overlapping clusters of arbitrary shape and be contaminated by noise? A tree-structured framework is proposed in this paper to purify such clusters by exploring the structural role of each data. In practice, each individual object within the internal organization of the data has its own specific role-“centroid”, hub or outlier-due to distinctive associations with their respective neighbors. Adjacent centroids always interact on each other and serve as mediate nodes of one tree being members of some cluster. Hubs closed to some centroid become leaf nodes responsible for the termination of the growth of trees. Outliers that weakly touch with any centroid are often discarded from any trees as global noise. All the data can thus be labeled by a specified criterion of “centroids”-connected structural consistency (CCSC). Free of domain-specific information, our framework with CCSC could widely adapt to many clustering-related applications. Theoretical and experimental contributions both confirm that our framework is easy to interpret and implement, efficient and effective in “complex” clustering.

MSC:

68T10 Pattern recognition, speech recognition
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Filippone, M.; Camastra, F.; Masulli, F.; Rovetta, S., A survey of kernel and spectral methods for clustering, Pattern Recognition, 41, 1, 176-190 (2008) · Zbl 1122.68530
[2] Beisner, H. M., Celebrating 40 years of pattern recognition: reflections, Pattern Recognition, 41, 7, 2139-2144 (2008)
[3] Warren Liao, T., Clustering of time series data survey, Pattern Recognition, 38, 1857-1874 (2005) · Zbl 1077.68803
[4] Ning, J.; Zhang, L.; Zhang, D.; Wub, C., Interactive image segmentation by maximal similarity based region merging, Pattern Recognition, 43, 445-456 (2010) · Zbl 1187.68485
[5] H.M. Beisner, SEP/COP: An efficient method to find the best partition in hierarchical clustering based on a new cluster validity index, Pattern Recognition, 2010, in press, doi:10.1016/j.patcog.2010.04.021; H.M. Beisner, SEP/COP: An efficient method to find the best partition in hierarchical clustering based on a new cluster validity index, Pattern Recognition, 2010, in press, doi:10.1016/j.patcog.2010.04.021
[6] Fischer, B.; Zöller, T.; Buhmann, J. M., Path based pairwise data clustering with application to texture segmentation, Energy Minimization Methods in Computer Vision and Pattern Recognition, 2134, 235-250 (2001) · Zbl 1001.68765
[7] Chang, H.; Yeung, D.-Y., Robust path-based spectral clustering with application to image segmentation, (10th IEEE International Conference on Computer Vision (2005), IEEE Computer Society Press: IEEE Computer Society Press Beijing, China), 278-285
[8] Sharon, E.; Galun, M.; Sharon, D.; Basri, R.; Brandt, A., Hierarchy and adaptivity in segmenting visual scenes, Nature, 442, 7104, 810-813 (2006)
[9] X.-W. Xu, N. Yuruk, Z.-D. Feng, T.A.J. Schweiger, SCAN: a structural clustering algorithm for networks, in: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, California, USA, 2007, pp. 824-833.; X.-W. Xu, N. Yuruk, Z.-D. Feng, T.A.J. Schweiger, SCAN: a structural clustering algorithm for networks, in: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, California, USA, 2007, pp. 824-833.
[10] S.-G. Zhou, Y. Zhao, J.-H. Guan, J.-S. Huang, A neighborhood-based clustering algorithm, in: Proceedings of Ninth Pacific-Asia Conference on Knowledge Discovery and Data Mining, Hanoi, Vietnam, 2005, pp. 361-371.; S.-G. Zhou, Y. Zhao, J.-H. Guan, J.-S. Huang, A neighborhood-based clustering algorithm, in: Proceedings of Ninth Pacific-Asia Conference on Knowledge Discovery and Data Mining, Hanoi, Vietnam, 2005, pp. 361-371.
[11] J.-D. Ding, R.-N. Ma, S.-C. Chen, J.-Y. Yang, Clustering using normalized path-based metric, in: Proceedings of Fifth International Symposium on Neural Network (ISNN2008), Part II, 2008, pp. 57-66.; J.-D. Ding, R.-N. Ma, S.-C. Chen, J.-Y. Yang, Clustering using normalized path-based metric, in: Proceedings of Fifth International Symposium on Neural Network (ISNN2008), Part II, 2008, pp. 57-66.
[12] J.-D. Ding, S.-C. Chen, R.-N. Ma, B. Wang, A fast directed tree based neighborhood clustering for image segmentation, in: Proceedings of the 13th International Conference on Neural Information Processing, Part II, Lecture Notes in Computer Science, vol. 4233, Berlin, Heidelberg, 2006, pp. 369-378.; J.-D. Ding, S.-C. Chen, R.-N. Ma, B. Wang, A fast directed tree based neighborhood clustering for image segmentation, in: Proceedings of the 13th International Conference on Neural Information Processing, Part II, Lecture Notes in Computer Science, vol. 4233, Berlin, Heidelberg, 2006, pp. 369-378.
[13] Theodoridis, S.; Koutroumbas, K., Pattern Recognition (1999), Academic Press: Academic Press NewYork
[14] Comaniciu, D.; Meer, P., Mean shift: a robust approach toward feature space analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 5, 603-619 (2002)
[15] Cai, W.-L.; Chen, S.-C.; Zhang, D.-Q., Fast and robust fuzzy c-means clustering algorithms incorporationg local information for image segmentation, Pattern Recognition, 40, 825-838 (2007) · Zbl 1118.68133
[16] Chen, S.-C.; Zhang, D.-Q., Robust image segmentation using FCM with spatial constraints based on new kernel-induced distance measure, IEEE Transactions on Systems, Man, and Cybernetics—B: Cybernetics, 34, 4, 1907-1916 (2004)
[17] Frey, B. J.; Dueck, D., Clustering by passing messages between data points, Science, 315, 5814, 972-976 (2007) · Zbl 1226.94027
[18] Wallace, D. A.R., Groups, Rings and Fields: 31-31, Th. 8 (1998), Springer-Verlag
[19] Fischer, B.; Buhmann, J. M., Path-based clustering for grouping of smooth curves and texture segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25, 4, 513-518 (2003)
[20] C. Ding, X.-F. He, K-nearest-neighbor consistency in data clustering: incorporating local information into global optimization, in: Proceedings of 2004 ACM Symposium on Applied Computing, 2004, pp. 584-589.; C. Ding, X.-F. He, K-nearest-neighbor consistency in data clustering: incorporating local information into global optimization, in: Proceedings of 2004 ACM Symposium on Applied Computing, 2004, pp. 584-589.
[21] Zhou, D.-Y.; Bousquet, O.; Lal, T. N.; Weston, J.; Schölkopf, B., Learning with local and global consistency, (Advances in Neural Information Processing Systems, vol. 16 (2004), MIT Press: MIT Press MA, USA), 321-328
[22] Breitenbach, M.; Grudic, G. Z., Clustering through ranking on manifolds, (International Conference on Machine Learning, vol. 119 (2005), ACM: ACM USA, New York), 73-80
[23] Ester, M.; Kriegel, H. P.; Sander, J.; Xu, X., A density-based algorithm for discovery clusters in large spatial databases with noise, (International Conference on Knowledge Discovery and Data Mining (1996), AAAI Press: AAAI Press Beijing, China), 221-226
[24] Koontz, W.; Narendra, P.; Fukunaga, K., A graph-theoretic approach to nonparametric cluster analysis, IEEE Transactions on Computer, C-25, 9, 936-944 (1976) · Zbl 0332.68080
[25] Hofmann, T.; Buhmann, J. M., Pairwise data clustering by deterministic annealing, IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 1, 1-14 (1997)
[26] Shi, J.-B.; Malik, J., Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 8, 888-905 (2000)
[27] Muller, K. R.; Mika, S.; Ratsch, G.; Schölkipf, B., An introduction to kernel-based learning algorithms, IEEE Transactions on Neural Networks, 12, 2, 181-201 (2001)
[28] Felzenszwalb, P.; Huttenlocher, D., Efficient graph-based image segmentation, Interantional Journal of Computer Vision, 59, 167-181 (2004) · Zbl 1477.68505
[29] R.O. Duda, P.E. Hart, Pattern Classification and Scene Analysis, John Wiley & Sons, 1973, ISBN 0-471-22361-1, See p. 218.; R.O. Duda, P.E. Hart, Pattern Classification and Scene Analysis, John Wiley & Sons, 1973, ISBN 0-471-22361-1, See p. 218. · Zbl 0277.68056
[30] D. Martin, C. Fowlkes, D. Tal, J. Malik, A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, in: IEEE International Conference on Computer Vision, Vancouver, Canada, 2001, pp. 416-425.; D. Martin, C. Fowlkes, D. Tal, J. Malik, A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, in: IEEE International Conference on Computer Vision, Vancouver, Canada, 2001, pp. 416-425.
[31] Ding, J.-D.; Ma, R.-N.; Chen, S.-C., A scale-based coherence connected tree algorithm for image segmentation, IEEE Transactions on Image Processing, 17, 2, 204-216 (2008)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.