Separation index and partial membership for clustering. (English) Zbl 1431.62270

Summary: We propose a new separation index that measures the magnitude of gaps between any two clusters in a partition, by projecting the data in a pair of clusters into a one-dimensional space in which they have the maximum separation. The resulting projections can also be used to determine partial membership for points near the boundaries between two or more clusters. The matrix of separation indexes is helpful in deciding whether too many or too few clusters are specified in the clustering method.
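As a rough illustration of the idea in the summary, the sketch below projects each pair of clusters onto a single direction and measures the gap between the projected groups. The summary does not state the paper's formulas, so the specifics here are assumptions chosen for illustration: the direction is a Fisher-style discriminant direction, the gap is measured between empirical quantiles with a hypothetical trimming level `alpha`, and the function names `projection_direction`, `separation_index`, and `separation_matrix` are made up for this example.

```python
import numpy as np


def projection_direction(a, b):
    """Fisher-style direction separating clusters a and b (an assumed choice)."""
    mean_diff = b.mean(axis=0) - a.mean(axis=0)
    scatter = np.atleast_2d(np.cov(a, rowvar=False) + np.cov(b, rowvar=False))
    # A small ridge keeps the linear solve stable if the scatter is near-singular.
    scatter = scatter + 1e-8 * np.eye(scatter.shape[0])
    w = np.linalg.solve(scatter, mean_diff)
    return w / np.linalg.norm(w)


def separation_index(a, b, alpha=0.05):
    """Quantile-based gap between the 1-D projections of two clusters.

    Positive values suggest a gap between the clusters along the chosen
    direction; negative values suggest overlap. This is an illustrative
    definition, not necessarily the one used in the paper.
    """
    w = projection_direction(a, b)
    pa, pb = a @ w, b @ w  # b projects to the larger mean by construction
    lo_a, hi_a = np.quantile(pa, [alpha / 2, 1 - alpha / 2])
    lo_b, hi_b = np.quantile(pb, [alpha / 2, 1 - alpha / 2])
    return (lo_b - hi_a) / (hi_b - lo_a)


def separation_matrix(X, labels, alpha=0.05):
    """Pairwise separation indexes for a partition; the diagonal is left as NaN."""
    ids = np.unique(labels)
    M = np.full((len(ids), len(ids)), np.nan)
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            value = separation_index(X[labels == ids[i]], X[labels == ids[j]], alpha)
            M[i, j] = M[j, i] = value
    return M


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data: cluster 1 is far from cluster 0, cluster 2 overlaps cluster 0.
    X = np.vstack([
        rng.normal(0.0, 1.0, (200, 2)),
        rng.normal(10.0, 1.0, (200, 2)),
        rng.normal(0.5, 1.0, (200, 2)),
    ])
    labels = np.repeat([0, 1, 2], 200)
    print(np.round(separation_matrix(X, labels), 2))
```

Read in the spirit of the summary, a clearly negative entry in such a matrix hints that two clusters might be candidates for merging (too many clusters specified), while uniformly large positive entries give no such hint. Points whose projections fall between the two trimmed intervals would be the natural candidates for partial membership.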

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

clusfind; UCI-ml; mclust

References:

[1] Bezdek, J.C., Numerical taxonomy with fuzzy sets, J. math. biol., 1, 57-71, (1974) · Zbl 0403.62039
[2] Bezdek, J.C., Cluster validity with fuzzy sets, J. cybernet., 3, 58-72, (1974) · Zbl 0294.68035
[3] Bezdek, J.C., Pattern recognition with fuzzy objective function algorithms, (1981), Plenum Press New York · Zbl 0503.68069
[4] Bezdek, J.C.; Pal, N.R., Some new indexes of cluster validity, IEEE trans. systems man cybernet., 28, 301-315, (1998)
[5] Blake, C.L., Merz, C.J., 1998. UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html.
[6] Cormack, R.M., A review of classification, J. roy. statist. soc. ser. A, 134, 321-367, (1971)
[7] Dhillon, I.S.; Modha, D.S.; Spangler, W.S., Class visualization of high-dimensional data with applications, Comput. statist. data anal., 41, 1, 59-90, (2002) · Zbl 1101.62350
[8] Dunn, J.C., A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters, J. cybernet., 3, 32-57, (1973) · Zbl 0291.68033
[9] Everitt, B., Cluster analysis, (1974), Heinemann London
[10] Fraley, C., Raftery, A.E., 1998. Mclust: software for model-based cluster and discriminant analysis. Technical Report 342, Department of Statistics, University of Washington, Seattle, WA 98195-4322, USA.
[11] Frigui, H.; Krishnapuram, R., A robust algorithm for automatic extraction of an unknown number of clusters from noisy data, Pattern recognition, 17, 1223-1232, (1996) · Zbl 0872.68163
[12] Halkidi, M.; Batistakis, Y.; Vazirgiannis, M., On clustering validation techniques, J. intell. inform. system, 17, 107-145, (2001) · Zbl 0998.68154
[13] Hastie, T.; Tibshirani, R.; Friedman, J., The elements of statistical learning, (2001), Springer Berlin
[14] Höppner, K.; Klawonn, F.; Runkler, T., Fuzzy cluster analysis: methods for classification, data analysis and image recognition, (1999), Wiley New York
[15] Kaufman, L.; Rousseeuw, P.J., Finding groups in data: an introduction to cluster analysis, (1990), Wiley New York
[16] Kim, D-W.; Lee, K.H.; Lee, D., Fuzzy cluster validation index based on inter-cluster proximity, Pattern recognition lett., 24, 15, 2561-2574, (2003)
[17] Lin, C.R., Chen, M.S., 2002. A robust and efficient clustering algorithm based on cohesion self-merging. In: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, Alberta, Canada, pp. 582-587.
[18] Milligan, G.W., A Monte Carlo study of thirty internal criterion measures for cluster analysis, Psychometrika, 46, 2, 187-199, (1981) · Zbl 0472.62070
[19] Rezaee, M.R.; Lelieveldt, B.P.F.; Reiber, J.H.C., A new cluster validity index for the fuzzy c-Mean, Pattern recognition lett., 19, 237-246, (1998) · Zbl 0905.68127
[20] Rosenblatt, F., The perceptron: a probabilistic model for information storage and organization in the brain, Psychol. rev., 65, 386-408, (1958)
[21] Vapnik, V., The nature of statistical learning theory, (1996), Springer New York · Zbl 0934.62009
[22] Xie, X.L.; Beni, G., A validity measure for fuzzy clustering, IEEE trans. pattern anal. machine intell., 13, 8, 841-847, (1991)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.