# zbMATH — the first resource for mathematics

SVM via saddle point optimization: new bounds and distributed algorithms. (English) Zbl 07238980
Eppstein, David (ed.), 16th Scandinavian symposium and workshops on algorithm theory. SWAT 2018, June 18–20, 2018, Malmö University, Malmö, Sweden. Wadern: Schloss Dagstuhl – Leibniz Zentrum für Informatik (ISBN 978-3-95977-068-2). LIPIcs – Leibniz International Proceedings in Informatics 101, Article 25, 13 p. (2018).
Summary: We study two important SVM variants: hard-margin SVM (for linearly separable cases) and $$\nu$$-SVM (for linearly non-separable cases). We propose new algorithms from the perspective of saddle point optimization. Our algorithms achieve $$(1-\epsilon)$$-approximations with running time $$\tilde{O}(nd+n\sqrt{d/\epsilon})$$ for both variants, where $$n$$ is the number of points and $$d$$ is the dimensionality. To the best of our knowledge, the current best algorithm for $$\nu$$-SVM is based on a quadratic programming approach, which requires $$\Omega(n^2 d)$$ time in the worst case [T. Joachims, Making large-scale SVM learning practical. Techn. Rep., Technische Universität Dortmund. (1998; doi:10.17877/DE290R-5098); J.C. Platt, “Fast training of support vector machines using sequential minimal optimization”, in: Advances in kernel methods: support vector learning. Cambridge: MIT Press. 185–208 (1999)]. In this paper, we provide the first nearly-linear-time algorithm for $$\nu$$-SVM. The current best algorithm for hard-margin SVM, the Gilbert algorithm [B. Gärtner and M. Jaggi, in: Proceedings of the 25th annual symposium on computational geometry, SCG 2009, Aarhus, Denmark, June 8–10, 2009. New York, NY: Association for Computing Machinery (ACM). 33–42 (2009; Zbl 1380.68396)], requires $$O(nd/\epsilon)$$ time; our algorithm improves the running time by a factor of $$\sqrt{d}/\sqrt{\epsilon}$$. Moreover, our algorithms can be implemented naturally in distributed settings. We prove that our algorithms require $$\tilde{O}(k(d +\sqrt{d/\epsilon}))$$ communication cost, where $$k$$ is the number of clients, which almost matches the theoretical lower bound. Numerical experiments support our theory and show that, compared to previous methods, our algorithms converge faster on high-dimensional, large, and dense data sets.
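To illustrate the saddle-point viewpoint the summary refers to (not the paper's own algorithm), the hard-margin SVM can be written as the minimax problem $$\max_{\|w\|\le 1}\min_{p\in\Delta_n}\sum_i p_i\, y_i\langle x_i, w\rangle$$, where $$\Delta_n$$ is the probability simplex. A minimal sketch below solves this toy formulation by alternating projected gradient ascent on $$w$$ with multiplicative-weights descent on $$p$$; the step sizes, iteration count, and synthetic data are all illustrative assumptions.

```python
import numpy as np

def saddle_point_svm(X, y, iters=2000, eta_w=0.1, eta_p=0.1):
    """Toy primal-dual solver for max_{||w||<=1} min_{p in simplex} sum_i p_i y_i <x_i, w>."""
    n, d = X.shape
    w = np.zeros(d)
    p = np.full(n, 1.0 / n)      # distribution over the n points
    margins = y[:, None] * X     # row i is y_i * x_i
    w_avg = np.zeros(d)
    for _ in range(iters):
        # Ascent step in w: gradient of the objective is sum_i p_i y_i x_i.
        w = w + eta_w * (margins.T @ p)
        nrm = np.linalg.norm(w)
        if nrm > 1.0:
            w = w / nrm          # project back onto the unit ball
        # Descent step in p: multiplicative weights, shrinking mass on
        # points with large margin and concentrating on the worst points.
        m = margins @ w
        p = p * np.exp(-eta_p * m)
        p = p / p.sum()
        w_avg += w
    return w_avg / iters         # averaged iterate

# Toy linearly separable data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 0.5, (20, 2)),
               rng.normal(-2.0, 0.5, (20, 2))])
y = np.concatenate([np.ones(20), -np.ones(20)])
w = saddle_point_svm(X, y)
print(np.all(y * (X @ w) > 0))  # the returned direction separates the data
```

The value of this game is the optimal margin, and the adversarial distribution $$p$$ concentrates on the support vectors; the paper's contribution is solving such saddle problems in $$\tilde{O}(nd+n\sqrt{d/\epsilon})$$ time rather than by this slow alternating scheme.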
For the entire collection see [Zbl 1390.68019].
##### MSC:
 68Wxx Algorithms in computer science
##### Software:
Pegasos; Scikit; SVMlight