Clusterwise support vector linear regression. (English) Zbl 1443.90281
Summary: In clusterwise linear regression (CLR), the aim is to simultaneously partition data into a given number of clusters and to find regression coefficients for each cluster. In this paper, we propose a novel approach to model and solve the CLR problem. The main idea is to model the CLR problem with the support vector machine (SVM) approach, using the SVM for regression to approximate each cluster. This new formulation of the CLR problem is an unconstrained nonsmooth optimization problem in which we minimize a difference of two convex (DC) functions. To solve this problem, we design a method that combines an incremental algorithm with the double bundle method for DC optimization. Numerical experiments are performed to validate the reliability of the new formulation for CLR and the efficiency of the proposed method. The results show that the SVM approach is suitable for solving CLR problems, especially when the data contain outliers.
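The summary does not give the objective function itself, so the following is only a minimal sketch of what a clusterwise SVM-regression objective and its DC split could look like. It assumes an epsilon-insensitive loss, a squared-norm regularizer, and the standard identity min_j l_j = sum_j l_j - max_k sum_{j != k} l_j; the function names and the parameters `eps` and `C` are hypothetical and not taken from the paper.

```python
import numpy as np

def eps_insensitive(residual, eps):
    """Epsilon-insensitive loss max(0, |r| - eps), as used in SVM regression."""
    return np.maximum(0.0, np.abs(residual) - eps)

def cluster_losses(W, b, X, y, eps):
    """Loss of every point under every cluster's linear model, shape (n, k)."""
    residuals = y[:, None] - (X @ W.T + b)
    return eps_insensitive(residuals, eps)

def clr_svr_objective(W, b, X, y, eps=0.1, C=1.0):
    """Illustrative CLR objective: each point is charged its best cluster's loss.

    W: (k, d) coefficients, b: (k,) intercepts, X: (n, d) inputs, y: (n,) targets.
    """
    losses = cluster_losses(W, b, X, y, eps)
    return 0.5 * np.sum(W ** 2) + C * np.sum(losses.min(axis=1))

def dc_components(W, b, X, y, eps=0.1, C=1.0):
    """Split the objective as f1 - f2 with both parts convex in (W, b)."""
    losses = cluster_losses(W, b, X, y, eps)
    total = losses.sum(axis=1)  # sum over all clusters, per point
    f1 = 0.5 * np.sum(W ** 2) + C * np.sum(total)
    # max over k of the sum of losses of all clusters except k, per point
    f2 = C * np.sum((total[:, None] - losses).max(axis=1))
    return f1, f2
```

Evaluating `dc_components` and subtracting the two parts should reproduce `clr_svr_objective` up to rounding, which is a quick way to check the decomposition on random data. The minimum over clusters is what makes the problem nonsmooth and nonconvex, while each part of the split is a sum or maximum of convex functions.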
90C26 Nonconvex programming, global optimization
49J52 Nonsmooth analysis
62J05 Linear regression; mixed models
62H30 Classification and discrimination; cluster analysis (statistical aspects)
65K05 Numerical mathematical programming methods
Algorithm 39; CRIO; flexmix; LDGB; SVMTorch; UCI-ml
