The cluster correlation-network support vector machine for high-dimensional binary classification. (English) Zbl 07193768

Summary: Identifying homogeneous subsets of predictors in classification can be challenging when the data are high-dimensional and the variables are highly correlated. We propose a new method, the cluster correlation-network support vector machine (CCNSVM), that simultaneously estimates the clusters of predictors relevant for classification and the coefficients of a penalized SVM. The new CCN penalty is a function of the well-known Topological Overlap Matrix, whose entries measure the strength of connectivity between predictors. CCNSVM implements an efficient algorithm that alternates between searching for clusters of predictors and optimizing a penalized SVM loss function, using majorization-minimization techniques and a coordinate descent algorithm. Combining clustering and sparsity in a single procedure provides additional insight into the value of exploiting dimension-reduction structure in high-dimensional binary classification. Simulation studies compare the performance of our procedure with that of its competitors. A practical application of CCNSVM to DNA methylation data illustrates its good behaviour.
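As a reading aid, the Topological Overlap Matrix mentioned in the summary can be sketched as follows. This is a minimal illustration of the standard unsigned topological overlap measure from the weighted co-expression network literature, not the paper's CCN penalty or its implementation. The function name `topological_overlap`, the soft-threshold power `beta=6.0`, and the use of absolute Pearson correlation as adjacency are assumptions made for this example; the sketch also assumes no predictor column is constant.

```python
import numpy as np

def topological_overlap(X, beta=6.0):
    """Unsigned topological overlap between the columns (predictors) of X.

    Hypothetical helper for illustration; beta is an assumed soft-threshold power.
    """
    # Soft-thresholded adjacency a_ij = |cor(x_i, x_j)|^beta with a zero diagonal.
    A = np.abs(np.corrcoef(X, rowvar=False)) ** beta
    np.fill_diagonal(A, 0.0)

    # Connectivity k_i = sum_u a_iu and shared-neighbour term l_ij = sum_u a_iu a_uj.
    k = A.sum(axis=1)
    L = A @ A

    # TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij); the denominator is >= 1.
    tom = (L + A) / (np.minimum.outer(k, k) + 1.0 - A)
    np.fill_diagonal(tom, 1.0)
    return tom
```

Entries close to 1 indicate pairs of predictors that are strongly connected both directly and through shared neighbours, which is the notion of connectivity the summary refers to.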

MSC:

62J07 Ridge regression; shrinkage estimators (Lasso)
