zbMATH — the first resource for mathematics

A convergent decomposition algorithm for support vector machines. (English) Zbl 1172.90443
Summary: We consider nonlinear minimization problems with a single linear equality constraint and box constraints. In particular, we are interested in solving problems in which the number of variables is so large that traditional optimization methods cannot be applied directly. Many interesting real-world problems lead to large-scale constrained problems with this structure. For example, the special subclass of problems with a convex quadratic objective function plays a fundamental role in the training of Support Vector Machines, a widely used technique in machine learning. For this subclass of convex quadratic problems, several convergent decomposition methods, based on the solution of a sequence of smaller subproblems, have been proposed. In this paper we define a new globally convergent decomposition algorithm that differs from previous methods in the rule for choosing the subproblem variables and in the presence of a proximal point modification in the objective function of the subproblems. In particular, the new rule for sequentially selecting the subproblems appears to be well suited to large-scale problems, while the introduction of the proximal point term allows us to ensure global convergence of the algorithm in the general case of a nonconvex objective function. Furthermore, we report preliminary numerical results on support vector classification problems with up to 100,000 variables.
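The scheme the summary describes — repeatedly solving a small subproblem over a selected working set, with a proximal term added so the subproblem stays well conditioned — can be illustrated on the SVM dual. The sketch below is not the paper's algorithm: it uses the classical maximal-violating-pair rule as a stand-in for the paper's selection rule, and `tau`, the demo data, and all function names are illustrative assumptions.

```python
# Minimal sketch of a proximal two-variable decomposition step for the SVM dual
#     min 0.5 x'Qx - sum(x)   s.t.   y'x = 0,   0 <= x_t <= C.
# The proximal term (tau/2)*||x - x^k||^2 from the summary adds 2*tau to the
# subproblem curvature, keeping it positive even if Q is not positive definite.

def objective(Q, x):
    n = len(x)
    quad = sum(x[r] * Q[r][c] * x[c] for r in range(n) for c in range(n))
    return 0.5 * quad - sum(x)

def proximal_smo_step(Q, y, x, C, tau, eps=1e-10):
    """One two-variable subproblem solve in place; False when no progress is possible."""
    n = len(x)
    g = [sum(Q[r][c] * x[c] for c in range(n)) - 1.0 for r in range(n)]  # grad of dual
    up = [t for t in range(n) if (y[t] > 0 and x[t] < C) or (y[t] < 0 and x[t] > 0)]
    dn = [t for t in range(n) if (y[t] > 0 and x[t] > 0) or (y[t] < 0 and x[t] < C)]
    if not up or not dn:
        return False
    # Maximal-violating-pair working set (a stand-in for the paper's rule).
    i = max(up, key=lambda t: -y[t] * g[t])
    j = min(dn, key=lambda t: -y[t] * g[t])
    if (-y[i] * g[i]) - (-y[j] * g[j]) < eps:   # KKT conditions nearly satisfied
        return False
    # Move along x_i += y_i*d, x_j -= y_j*d, which preserves the constraint y'x.
    curv = Q[i][i] + Q[j][j] - 2.0 * y[i] * y[j] * Q[i][j] + 2.0 * tau
    d = -(y[i] * g[i] - y[j] * g[j]) / curv
    def box(v, s):                               # v + s*d must stay in [0, C]
        return (-v, C - v) if s > 0 else (v - C, v)
    lo1, hi1 = box(x[i], y[i])
    lo2, hi2 = box(x[j], -y[j])
    d = max(max(lo1, lo2), min(d, min(hi1, hi2)))
    if d == 0.0:
        return False   # blocked by the box; a full method would try another pair
    x[i] += y[i] * d
    x[j] -= y[j] * d
    return True

# Demo: four 1-D points z, linear kernel, so Q[r][c] = y_r*y_c*z_r*z_c.
z = [1.0, 2.0, -1.0, -2.0]
y = [1, 1, -1, -1]
Q = [[y[r] * y[c] * z[r] * z[c] for c in range(4)] for r in range(4)]
C, tau = 1.0, 0.1
x = [0.0] * 4
while proximal_smo_step(Q, y, x, C, tau):
    pass
```

Each iteration keeps the iterate feasible (the paired update preserves `y'x`, and the step is clipped to the box) and decreases the objective, which is the basic mechanism a convergence proof for such decomposition methods builds on.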

MSC:
 90C06 Large-scale problems in mathematical programming
 68T05 Learning and adaptive systems in artificial intelligence
Software:
LIBSVM
References:
 [1] Auslender, A.: Asymptotic properties of the Fenchel dual functional and applications to decomposition problems. J. Optim. Theory Appl. 73, 427–449 (1992) · Zbl 0794.49026 · doi:10.1007/BF00940050
 [2] Barr, R.O., Gilbert, E.G.: Some efficient algorithms for a class of abstract optimization problems arising in optimal control. IEEE Trans. Autom. Control 14, 640–652 (1969) · doi:10.1109/TAC.1969.1099299
 [3] Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999)
 [4] Bertsekas, D., Tseng, P.: Partial proximal minimization algorithm for convex programming. SIAM J. Optim. 4, 551–572 (1994) · Zbl 0819.90069 · doi:10.1137/0804031
 [5] Bertsekas, D.P., Tsitsiklis, J.N.: Parallel and Distributed Computation. Prentice-Hall, Englewood Cliffs (1989) · Zbl 0743.65107
 [6] Bomze, I.M.: Evolution towards the maximum clique. J. Glob. Optim. 10, 143–164 (1997) · Zbl 0880.90110 · doi:10.1023/A:1008230200610
 [7] Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm (2001)
 [8] Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge (2000) · Zbl 0994.68074
 [9] Einbu, J.M.: Optimal allocation of continuous resources to several activities with a concave return function – some theoretical results. Math. Oper. Res. 3, 82–88 (1978) · Zbl 0397.90080 · doi:10.1287/moor.3.1.82
 [10] Ferris, M.C., Mangasarian, O.L.: Parallel variable distribution. SIAM J. Optim. 4, 1–21 (1994) · Zbl 0820.90098 · doi:10.1137/0804047
 [11] Ferris, M.C., Munson, T.S.: Interior-point methods for massive support vector machines. SIAM J. Optim. 13, 783–804 (2003) · Zbl 1039.90092 · doi:10.1137/S1052623400374379
 [12] Grippo, L., Sciandrone, M.: Globally convergent block-coordinate techniques for unconstrained optimization. Optim. Methods Softw. 10(4), 587–637 (1999) · Zbl 0940.65070 · doi:10.1080/10556789908805730
 [13] Grippo, L., Sciandrone, M.: On the convergence of the block nonlinear Gauss–Seidel method under convex constraints. Oper. Res. Lett. 26(3), 127–136 (2000) · Zbl 0955.90128 · doi:10.1016/S0167-6377(99)00074-7
 [14] Hearn, D.W., Lawphongpanich, S., Ventura, J.A.: Restricted simplicial decomposition: computation and extensions. Math. Program. Study 31, 99–118 (1987) · Zbl 0636.90027
 [15] Joachims, T.: Making large scale SVM learning practical. In: Schölkopf, B., Burges, C.J.C., Smola, A. (eds.) Advances in Kernel Methods – Support Vector Learning. MIT, Cambridge (1998)
 [16] Kao, C., Lee, L.-F., Pitt, M.M.: Simulated maximum likelihood estimation of the linear expenditure system with binding non-negativity constraints. Ann. Econ. Finance 2, 203–223 (2001)
 [17] Kiwiel, K.C.: A dual method for certain positive semidefinite quadratic problems. SIAM J. Sci. Stat. Comput. 10, 175–186 (1989) · Zbl 0663.65063 · doi:10.1137/0910013
 [18] Lin, C.-J.: On the convergence of the decomposition method for support vector machines. IEEE Trans. Neural Netw. 12, 1288–1298 (2001) · Zbl 1012.94501 · doi:10.1109/72.963765
 [19] Lin, C.-J.: Asymptotic convergence of an SMO algorithm without any assumptions. IEEE Trans. Neural Netw. 13, 248–250 (2002) · doi:10.1109/72.977319
 [20] Lin, C.-J.: A formal analysis of stopping criteria of decomposition methods for support vector machines. IEEE Trans. Neural Netw. 13, 1045–1052 (2002) · doi:10.1109/TNN.2002.1031937
 [21] Lucidi, S., Sciandrone, M., Tseng, P.: Objective-derivative-free methods for constrained optimization. Math. Program. 92(1), 37–59 (2002) · Zbl 1024.90062 · doi:10.1007/s101070100266
 [22] Mangasarian, O.L.: Generalized support vector machines. In: Smola, A., Bartlett, P., Schölkopf, B., Schuurmans, D. (eds.) Advances in Large Margin Classifiers, pp. 135–146. MIT, Cambridge (2000)
 [23] Mangasarian, O.L., Musicant, D.R.: Successive overrelaxation for support vector machines. IEEE Trans. Neural Netw. 10, 1032–1037 (1999) · doi:10.1109/72.788643
 [24] Melman, A., Rabinowitz, G.: An efficient method for a class of continuous knapsack problems. SIAM Rev. 42, 440–448 (2000) · Zbl 0958.65058 · doi:10.1137/S0036144598330177
 [25] Motzkin, T.S., Straus, E.G.: Maxima for graphs and a new proof of a theorem of Turán. Can. J. Math. 17, 533–540 (1965) · Zbl 0129.39902 · doi:10.4153/CJM-1965-053-6
 [26] Nielsen, S.S., Zenios, S.A.: Massively parallel algorithms for singly constrained convex programming. ORSA J. Comput. 4, 166–181 (1992) · Zbl 0771.90079
 [27] Pang, J.S.: A new and efficient algorithm for a class of portfolio selection problems. Oper. Res. 28, 754–767 (1980) · Zbl 0451.90011 · doi:10.1287/opre.28.3.754
 [28] Patriksson, M.: Decomposition methods for differentiable optimization problems over Cartesian product sets. Comput. Optim. Appl. 9, 5–42 (1998) · Zbl 0905.90162 · doi:10.1023/A:1018358602892
 [29] Platt, J.: Sequential minimal optimization: a fast algorithm for training support vector machines. In: Schölkopf, B., Burges, C.J.C., Smola, A. (eds.) Advances in Kernel Methods – Support Vector Learning, pp. 185–208. MIT, Cambridge (1998)
 [30] Powell, M.J.D.: On search directions for minimization algorithms. Math. Program. 4, 193–201 (1973) · Zbl 0258.90043 · doi:10.1007/BF01584660
 [31] Tseng, P.: Decomposition algorithms for convex differentiable minimization. J. Optim. Theory Appl. 70, 109–135 (1991) · Zbl 0739.90052 · doi:10.1007/BF00940507
 [32] Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer, New York (1995) · Zbl 0833.62008
 [33] Ziemba, W.T., Parkan, C., Brooks-Hill, R.: Calculation of investment portfolios with risk free borrowing and lending. Manag. Sci. 21, 209–222 (1974) · Zbl 0294.90004 · doi:10.1287/mnsc.21.2.209
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.