
Cost-sensitive support vector machine using randomized dual coordinate descent method for big class-imbalanced data classification. (English) Zbl 1470.62097

Summary: The cost-sensitive support vector machine is one of the most popular tools for class-imbalanced problems such as fault diagnosis. However, such data often come with a huge number of examples as well as features. To address class-imbalanced problems on big data, a cost-sensitive support vector machine using a randomized dual coordinate descent method (CSVM-RDCD) is proposed in this paper. The solution of the subproblem at each iteration is derived in closed form, and the computational cost is reduced through an acceleration strategy and cheap computation. The four constraint conditions of CSVM-RDCD are derived. Experimental results show that the proposed method increases the recognition rate of the positive class and reduces average misclassification costs on real big class-imbalanced data.
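
The summary does not state the dual problem or the update rule, so the following is only a minimal sketch of the general idea, not the paper's CSVM-RDCD. It assumes the standard L1-loss linear SVM dual, as in the dual coordinate descent solver of LIBLINEAR, with class-dependent upper bounds on the dual variables. The names `train_csvm_dcd`, `predict`, `c_pos` and `c_neg` are hypothetical, and the acceleration strategy mentioned in the summary is not reproduced. Each step picks a random coordinate and applies the closed-form clipped Newton update for that coordinate.

```python
import random


def train_csvm_dcd(X, y, c_pos, c_neg, epochs=50, seed=0):
    """Randomized dual coordinate descent for a linear L1-loss SVM.

    Assumed dual (not taken from the paper):
        min_alpha 0.5 * alpha^T Q alpha - sum(alpha),
        0 <= alpha_i <= c_pos if y_i == +1 else c_neg,
    with Q_ij = y_i * y_j * <x_i, x_j>. No bias term is fitted; append a
    constant feature to each x_i if one is needed.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    alpha = [0.0] * n
    w = [0.0] * d  # invariant: w = sum_i alpha_i * y_i * x_i
    # Diagonal of Q: Q_ii = <x_i, x_i>, because y_i^2 = 1.
    q_diag = [sum(v * v for v in x) for x in X]

    for _ in range(epochs * n):
        i = rng.randrange(n)  # random coordinate
        if q_diag[i] == 0.0:
            continue  # an all-zero example cannot change w
        upper = c_pos if y[i] > 0 else c_neg  # class-dependent cost
        # Partial derivative of the dual objective with respect to alpha_i.
        grad = y[i] * sum(wj * xj for wj, xj in zip(w, X[i])) - 1.0
        # Closed-form minimizer of the one-variable subproblem, clipped to the box.
        new_alpha = min(max(alpha[i] - grad / q_diag[i], 0.0), upper)
        delta = new_alpha - alpha[i]
        if delta != 0.0:
            alpha[i] = new_alpha
            # Cheap O(d) update that keeps w consistent with alpha.
            for j in range(d):
                w[j] += delta * y[i] * X[i][j]
    return w, alpha


def predict(w, x):
    """Return +1 or -1 according to the sign of <w, x>."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1
```

Choosing `c_pos` larger than `c_neg` penalizes errors on the positive (minority) class more heavily, which is the usual way a cost-sensitive SVM trades overall accuracy for a higher positive-class recognition rate.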

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
68T05 Learning and adaptive systems in artificial intelligence

Software:

LIBLINEAR
