Fuzzy clustering based on nonconvex optimisation approaches using difference of convex (DC) functions algorithms. (English) Zbl 1301.90072

Summary: We present a fast and robust nonconvex optimization approach for Fuzzy C-Means (FCM) clustering model. Our approach is based on DC (Difference of Convex functions) programming and DCA (DC Algorithms) that have been successfully applied in various fields of applied sciences, including Machine Learning. The FCM model is reformulated in the form of three equivalent DC programs for which different DCA schemes are investigated. For accelerating the DCA, an alternative FCM-DCA procedure is developed. Experimental results on several real world problems that include microarray data illustrate the effectiveness of the proposed algorithms and their superiority over the standard FCM algorithm, with respect to both running-time and accuracy of solutions.


90C26 Nonconvex programming, global optimization
62H30 Classification and discrimination; cluster analysis (statistical aspects)
90C70 Fuzzy and other nonstochastic uncertainty mathematical programming
Full Text: DOI


[1] Alon N, Spencer JH (1991) The probabilistic method. Wiley, New York
[2] Arora S, Kannan R (2001) Learning mixtures of arbitrary Gaussians. In: Proceedings of 33rd annual ACM symposium on theory of computing, pp 247–257 · Zbl 1323.68440
[3] Bradley BS, Mangasarian OL (1998). Feature selection via concave minimization and support vector machines. In: Shavlik J (eds) Machine learning Proceedings of the 15th international conferences (ICML 1998). Morgan Kaufman, San Francisco, pp. 82–90
[4] Bezdek JC (1981) Pattern recognition with fuzzy objective function algorithm. Plenum Press, New York · Zbl 0503.68069
[5] Dhilon IS, Korgan J, Nicholas C (2003). Feature selection and document clustering. In: Berry MW (ed) A comprehensive survey of text mining. Springer, Berlin, pp. 73–100
[6] Dembélé D, Kastner P (2003) Fuzzy c-means clustering method for clustering microarray data. Bioinformatics 19(8):573–580
[7] Duda RO, Hart PE (1972) Pattern classification and scene analysis. Wiley, New York
[8] Feder T, Greene D (1988) Optimal algorithms for approximate clustering. Proc STOC Chicago, Illinois, pp 434–444, ISBN: 0-89791-264-0
[9] Fisher D (1987) Knowledge acquisition via incremental conceptual clustering. Mach Learn 2:139–172
[10] Fukunaga K (1990) Statistical pattern recognition. Academic, New York · Zbl 0711.62052
[11] Krause N, Singer Y (2004) Leveraging the margin more carefully. In: Proceedings of International conference on machine learning ICML, V Banff, Alberta, Canada pp. 63–70, ISBN: 1-58113-828-5
[12] Klawonn F, Höppner F (2003) What is fuzzy about fuzzy clustering? Understanding and improving the concept of the fuzzifier. In: Berthold MR, Lenz H-J, Bradley E, Kruse R, Borgelt C (eds) Advances in intelligent data analysis. Springer, Berlin, pp 254–264
[13] Le Thi HA (1997) Contribution à l’optimisation non convexe et l’optimisation globale: théorie, algorithmes et applications. Habilitation à Diriger des Recherches, Université de Rouen, France
[14] Le Thi HA, Pham Dinh T (1997) Solving a class of linearly constrained indefinite quadratic problems by DC algorithms. J Global Optim 11(3):253–285 · Zbl 0905.90131
[15] Le Thi HA, Pham Dinh T (2005) The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems. Ann Oper Res 133:23–46 · Zbl 1116.90122
[16] Le Thi HA, Belghiti T, Pham Dinh T (2007) A new efficient algorithm based on DC programming and DCA for Clustering. J Global Optim 37:593–608 · Zbl 1198.90327
[17] Le Thi HA, Le Hoai M, Pham Dinh T (2007) Optimization based DC programming and DCA for Hierarchical Clustering. Eur J Oper Res 183:1067–1085 · Zbl 1149.90117
[18] Liu Y, Shen X, Doss H (2005) Multicategory {\(\psi\)}-learning and support vector machine: computational tools. J Comput Graph Stat 14:219–236
[19] Liu Y, Shen X (2006) Multicategory{\(\psi\)}-learning. J Am Stat Assoc 101:500–509 · Zbl 1119.62341
[20] Mangasarian OL (1997) Mathematical programming in data mining. Data Min Knowl Discov 1:183–201
[21] MacQueen JB (1967) Some methods for classification and analysis of multivariate observations. In: Proceedings of the 5th Berkeley symposium on mathematical statistics and probability. University of California Press, Berkeley 1:281–297
[22] Neumann J, Schnörr C, Steidl G (2004) SVM-based feature selection by direct objective minimisation. Pattern Recognition. In: Proceedings of the 26th DAGM symposium, pp 212–219
[23] Pham Dinh T, Le Thi HA (1998) DC optimization algorithms for solving the trust region subproblem. SIAM J Optim 8:476–505 · Zbl 0913.65054
[24] Polyak B (1987) Introduction to optimization. Optimization Software, Inc., Publication Division, New York · Zbl 0625.62093
[25] Rajapakse JC, Giedd JN, Rapoport JL (2004) Statistical approach to segmentation of single-channel cerebral MR images. IEEE Trans Medical Imaging 16: 176–186
[26] Rockfellar RT (1970) Convex Analysis. Princeton: Princeton University · Zbl 0193.18401
[27] Ronan C, Fabian S, Jason W, Léon B (2006) Trading convexity for scalability. In: International conference on machine learning (ICML), Pittsburgh, Pennsylvania, pp. 201–208, ISBN: 1-59593-383-2
[28] Shen X, Tseng GC, Zhang X, Wong WH (2003) {\(\psi\)}-learning. J Am Stat Assoc 98:724–734 · Zbl 1052.62095
[29] Weber S, Schüle T, Schnörr C (2005) Prior learning and convex–concave regularization of binary tomography. Electron Notes Discr Math 20:313–327 · Zbl 1179.68192
[30] Yuille AL, Rangarajan A (2003) The convex conCave procedure (CCCP). Neural Comput 15:915-936 · Zbl 1022.68112
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.