Robust maximum entropy clustering algorithm with its labeling for outliers. (English) Zbl 1096.68728

Summary: This paper presents RMEC, a novel robust maximum entropy clustering algorithm that improves on the maximum entropy clustering algorithm MEC by addressing two of its drawbacks: high sensitivity to outliers and difficulty in labeling them. RMEC incorporates Vapnik’s \(\epsilon\)-insensitive loss function and a new concept of weight factors into its objective function; its update rules are then derived using Lagrangian optimization theory. Compared with MEC, the main contributions of RMEC lie in its much better robustness to outliers and in its ability to label outliers in the dataset effectively using the obtained weight factors. The experimental results demonstrate its superior performance in both robustness and outlier labeling.
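For orientation, the two ingredients named in the summary can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the RMEC update rules, which the summary does not give: `mec` is the textbook baseline form of maximum entropy clustering (Gibbs-style soft memberships at a fixed temperature, followed by a membership-weighted mean for each center), and `eps_insensitive` is Vapnik's loss \(\max(0, |t| - \epsilon)\). The function names, the `temperature` parameter, and the fixed iteration count are illustrative choices.

```python
import math


def eps_insensitive(t, eps):
    """Vapnik's epsilon-insensitive loss: zero whenever |t| <= eps."""
    return max(0.0, abs(t) - eps)


def mec(points, centers, temperature, iters=50):
    """Baseline maximum entropy clustering (assumed textbook form, not RMEC).

    points and centers are sequences of equal-length coordinate tuples.
    Returns the final centers and the memberships computed in the last pass.
    """
    centers = [tuple(c) for c in centers]
    dim = len(points[0])
    memberships = []
    for _ in range(iters):
        # Membership step: softmax of negative squared distances.
        memberships = []
        for x in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
            shift = min(d2)  # subtract the minimum for numerical stability
            w = [math.exp(-(d - shift) / temperature) for d in d2]
            total_w = sum(w)
            memberships.append([v / total_w for v in w])
        # Center step: membership-weighted mean of all points.
        new_centers = []
        for j, old in enumerate(centers):
            mass = sum(u[j] for u in memberships)
            if mass == 0.0:
                new_centers.append(old)  # keep a center that attracts no mass
                continue
            new_centers.append(tuple(
                sum(u[j] * x[k] for u, x in zip(memberships, points)) / mass
                for k in range(dim)
            ))
        centers = new_centers
    return centers, memberships
```

In this baseline every point contributes to every center through a squared distance, so a single distant point can pull a center toward itself; that is the sensitivity to outliers the summary attributes to MEC.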

MSC:

68T10 Pattern recognition, speech recognition
