×

A DC programming approach for sparse linear discriminant analysis. (English) Zbl 1327.90241

Do, Tien Van (ed.) et al., Advanced computational methods for knowledge engineering. Proceedings of the 2nd international conference on computer science, applied mathematics and applications, ICCSAMA 2014, Budapest, Hungary, May 8–9, 2014. Cham: Springer (ISBN 978-3-319-06568-7/pbk; 978-3-319-06569-4/ebook). Advances in Intelligent Systems and Computing 282, 65-74 (2014).
Summary: We consider the supervised pattern classification in the high-dimensional setting, in which the number of features is much larger than the number of observations. We present a novel approach to the sparse linear discriminant analysis (LDA) using the zero-norm. The resulting optimization problem is non-convex, discontinuous and very hard to solve. We overcome the discontinuity by using an appropriate continuous approximation to zero-norm such that the resulting problem can be formulated as a DC (Difference of Convex functions) program to which DC programming and DC Algorithms (DCA) can be investigated. The computational results show the efficiency and the superiority of our approach versus the \(l_1\) regularization model on both feature selection and classification.
For the entire collection see [Zbl 1294.68017].

MSC:

90C26 Nonconvex programming, global optimization
90C30 Nonlinear programming
PDF BibTeX XML Cite
Full Text: DOI

References:

[1] Bickel, P., Levina, E.: Some theory for Fisher’s linear discriminant function, naive Bayes, and some alternatives when there are many more variables than observations. Bernoulli 6, 989-1010 (2004) · Zbl 1064.62073
[2] Bradley, P.S., Magasarian, O.L., Street, W.N.: Feature Selection via mathematical Programming. INFORMS Journal on Computing, 209-217 (1998) · Zbl 1034.90529
[3] Clemmensen, L., Hastie, T., Witten, D., Ersboll, B.: Sparse discriminant analysis. Technometrics 53(4), 406-413 (2011)
[4] Duan, K.B., Rajapakse, J.C., Wang, H., Azuaje, F.: Multiple SVM-RFE for Genne Selection in Cancer Classification With Expression Data. IEEE Transactions on Nanobioscience 4, 228-234 (2005)
[5] Dudoit, S., Fridlyand, J., Speed, T.: Comparison of discrimination methods for the classification of tumors using gene expression data. J. Amer. Statist. Assoc. 96, 1151-1160 (2001) · Zbl 1073.62576
[6] Fisher, R.A.: The use of multiple measurements in taxonomic problems. Annals of Eugenics 7, 179-188 (1936)
[7] Friedman, J.: Regularized discriminant analysis. Journal of the American Statistical Association 84, 165-175 (1989)
[8] Guo, Y., Hastie, T., Tibshirani, R.: Regularized linear discriminant analysis and its application in microarrays. Biostatistics 8, 86-100 (2007) · Zbl 1170.62382
[9] Krzanowski, W., Jonathan, P., McCarthy, W., Thomas, M.: Discriminant analysis with singular covariance matrices: methods and applications to spectroscopic data. Journal of the Royal Statistical Society, Series C 44, 101-115 (1995) · Zbl 0821.62032
[10] Le Thi, H.A., Pham Dinh, T.: The DC (difference of convex func-tions) programming and DCA revisited with DC models of real world nonconvex optimization problems. Annals of Operations Research 133, 23-46 (2005) · Zbl 1116.90122
[11] Le Thi, H.A., Pham Dinh, T.: Solving a class of linearly constrained indefinite quadratic problems by DC algorithms. Journal of Global Optimization 11(3), 253-285 (1997) · Zbl 0905.90131
[12] Le Thi, H.A., Le Hoai, M., Pham Dinh, T.: Optimization based DC programming and DCA for Hierarchical Clustering. European Journal of Operational Research 183, 1067-1085 (2007) · Zbl 1149.90117
[13] Le Thi, H.A., Le Hoai, M., Nguyen, N.V., Pham Dinh, T.: A DC Programming approach for Feature Selection in Support Vector Machines learning. Journal of Advances in Data Analysis and Classification 2(3), 259-278 (2008) · Zbl 1284.90057
[14] Le Thi, H.A.: DC Programming and DCA, http://lita.sciences.univ-metz.fr/ lethi/DCA.html · Zbl 1322.90072
[15] Le Thi, H.A., Pham Dinh, T.: DC optimization algorithm for solving the trust region subproblem. SIAM Journal of Optimization 8(1), 476-505 (1998) · Zbl 0913.65054
[16] Le Thi, H.A., Huynh, V., Pham Dinh, T.: Exact penalty and error bounds in DC programming. Journal of Global Optimization 52(3), 509-535 (2011) · Zbl 1242.49037
[17] Liu, Y., Shen, X., Doss, H.: Multicategory ψ-Learning and Support Vector Machine: Computational Tools. Journal of Computational and Graphical Statistics 14, 219-236 (2005)
[18] Liu, Y., Shen, X.: Multicategory ψ-Learning. Journal of the American Statistical Association 101, 500-509 (2006) · Zbl 1119.62341
[19] Pham Dinh, T., Le Thi, H.A.: Convex analysis approach to D.C. programming: Theory, algorithms and applications. Acta Mathematica Vietnamica 22(1), 289-355 (1997) · Zbl 0895.90152
[20] Pham Dinh, T., Le Thi, H.A.: DC optimization algorithms for solving the trust region subproblem. SIAM J. Opt. 8, 476-505 (1998) · Zbl 0913.65054
[21] Thiao, M., Pham Dinh, T., Le Thi, H.A.: DC programming approach for a class of nonconvex programs involving l0 norm. In: Le Thi, H.A., Bouvry, P., Pham Dinh, T. (eds.) MCO 2014. CCIS, vol. 14, pp. 358-367. Springer, Heidelberg (2008) · Zbl 1160.90626
[22] Tibshirani, R., Hastie, T., Narasimhan, B., Chu, G.: Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc. Natl. Acad. Sci. 99, 6567-6572 (2002)
[23] Witten, Tibshirani: Penalized classification using Fisher’s linear discriminant. Journal Royal Statistical Society, 753-772 (2011) · Zbl 1228.62079
[24] Xu, P. · Zbl 1453.62255
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.