On \(L_1\)-norm multiclass support vector machines: methodology and theory. (English) Zbl 1172.62317

Summary: Binary support vector machines (SVMs) have been proven to deliver high performance. In multiclass classification, however, issues remain with respect to variable selection. One challenging issue is classification and variable selection in the presence of variables in the magnitude of thousands, greatly exceeding the size of training sample. This often occurs in genomics classification. To meet the challenge, this article proposes a novel multiclass support vector machine, which performs classification and variable selection simultaneously through an \(L_{1}\)-norm penalized sparse representation. The proposed methodology, together with the developed regularization solution path, permits variable selection in such a situation. For the proposed methodology, a statistical learning theory is developed to quantify the generalization error in an attempt to gain insight into the basic structure of sparse learning, permitting the number of variables to greatly exceed the sample size. The operating characteristics of the methodology are examined through both simulated and benchmark data and are compared against some competitors in terms of accuracy of prediction. The numerical results suggest that the proposed methodology is highly competitive.


62H30 Classification and discrimination; cluster analysis (statistical aspects)
68T05 Learning and adaptive systems in artificial intelligence
Full Text: DOI