zbMATH — the first resource for mathematics

Nonsmooth optimization algorithm for solving clusterwise linear regression problems. (English) Zbl 1311.65067
Summary: Clusterwise linear regression consists of finding a number of linear regression functions each approximating a subset of the data. In this paper, the clusterwise linear regression problem is formulated as a nonsmooth nonconvex optimization problem and an algorithm based on an incremental approach and on the discrete gradient method of nonsmooth optimization is designed to solve it. This algorithm incrementally divides the whole dataset into groups which can be easily approximated by one linear regression function. A special procedure is introduced to generate good starting points for solving global optimization problems at each iteration of the incremental algorithm. The algorithm is compared with the multi-start Späth and the incremental algorithms on several publicly available datasets for regression analysis.
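
To make the formulation concrete, the sketch below evaluates a clusterwise linear regression objective of the kind described in the summary: each data point contributes its squared error under the best of k regression functions. The minimum over clusters makes the objective nonsmooth, and the resulting sum is nonconvex. The squared-error loss, the function names, and the intercept-in-last-column layout are assumptions made for illustration. The fitting routine is a simple k-means-style alternating heuristic in the spirit of Späth's exchange algorithm; it is not the incremental discrete-gradient algorithm of the paper.

```python
import numpy as np


def cluster_errors(X, y, coefs):
    """Squared error of every point under every regression function.

    X: (m, n) inputs, y: (m,) outputs,
    coefs: (k, n + 1) with the intercept in the last column (assumed layout).
    Returns an (m, k) array.
    """
    Xa = np.hstack([X, np.ones((X.shape[0], 1))])
    return (Xa @ coefs.T - y[:, None]) ** 2


def clr_objective(X, y, coefs):
    """Sum over points of the minimum error over clusters.

    The pointwise minimum is what makes the problem nonsmooth and nonconvex.
    """
    return cluster_errors(X, y, coefs).min(axis=1).sum()


def alternating_fit(X, y, k, n_iter=50, seed=0):
    """Hypothetical baseline: alternate least-squares fits and reassignment."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    Xa = np.hstack([X, np.ones((m, 1))])
    labels = rng.integers(0, k, size=m)  # random starting partition
    coefs = np.zeros((k, Xa.shape[1]))
    for _ in range(n_iter):
        for j in range(k):
            mask = labels == j
            # Refit only clusters with enough points; others keep old coefficients.
            if mask.sum() >= Xa.shape[1]:
                coefs[j], *_ = np.linalg.lstsq(Xa[mask], y[mask], rcond=None)
        # Reassign each point to the regression function that fits it best.
        new_labels = cluster_errors(X, y, coefs).argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return coefs, labels
```

A heuristic of this kind only reaches a local solution that depends on the starting partition, which is why multi-start and incremental schemes such as those compared in the paper are of interest.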

65K05 Numerical mathematical programming methods
62J05 Linear regression; mixed models
90C25 Convex programming
Software: Algorithm 39; DGM; UCI-ml
[1] Preda, C; Saporta, G, Clusterwise PLS regression on a stochastic process, Comput. Stat. Data Anal., 49, 99-108, (2005) · Zbl 1429.62299
[2] Wedel, M; Kistemaker, C, Consumer benefit segmentation using clusterwise linear regression, Int. J. Res. Mark., 6, 45-59, (1989)
[3] Späth, H, Algorithm 39: clusterwise linear regression, Computing, 22, 367-373, (1979) · Zbl 0387.65028
[4] Späth, H, Algorithm 48: a fast algorithm for clusterwise linear regression, Computing, 29, 175-181, (1982) · Zbl 0485.65030
[5] Gaffney, S., Smyth, P.: Trajectory clustering using mixtures of regression models. In: Chaudhuri, S., Madigan, D. (eds.) Proceedings of the ACM Conference on Knowledge Discovery and Data Mining, New York, pp. 63-72 (1999)
[6] Zhang, B.: Regression clustering. In: Proceedings of the Third IEEE International Conference on Data Mining (ICDM03), pp. 451-458. IEEE Computer Society, Washington, DC (2003)
[7] DeSarbo, WS; Cron, WL, A maximum likelihood methodology for clusterwise linear regression, J. Classif., 5, 249-282, (1988) · Zbl 0692.62052
[8] García-Escudero, LA; Gordaliza, A; Mayo-Iscar, A; San Martin, R, Robust clusterwise linear regression through trimming, Comput. Stat. Data Anal., 54, 3057-3069, (2010) · Zbl 1284.62198
[9] DeSarbo, WS; Oliver, RL; Rangaswamy, A, A simulated annealing methodology for clusterwise linear regression, Psychometrika, 54, 707-736, (1989)
[10] Carbonneau, RA; Caporossi, G; Hansen, P, Globally optimal clusterwise regression by mixed logical-quadratic programming, Eur. J. Oper. Res., 212, 213-222, (2011)
[11] Caporossi, G., Hansen, P.: Variable neighborhood search for least squares clusterwise regression. Technical Report, G-2005-61, Les Cahiers du GERAD, Montreal (2005)
[12] Bagirov, AM; Ugon, J; Mirzayeva, H, Nonsmooth nonconvex optimization approach to clusterwise linear regression problems, Eur. J. Oper. Res., 229, 132-142, (2013) · Zbl 1317.90242
[13] Bagirov, AM, Continuous subdifferential approximations and their applications, J. Math. Sci., 115, 2567-2609, (2003) · Zbl 1039.49020
[14] Bagirov, AM; Karasozen, B; Sezer, M, Discrete gradient method: derivative-free method for nonsmooth optimization, J. Optim. Theory Appl., 137, 317-334, (2008) · Zbl 1165.90021
[15] Clarke, F.H.: Optimization and Nonsmooth Analysis. John Wiley, New York (1983) · Zbl 0582.49001
[16] Bagirov, AM; Ugon, J, Piecewise partially separable functions and a derivative-free algorithm for large scale nonsmooth optimization, J. Glob. Optim., 35, 163-195, (2006) · Zbl 1136.90515
[17] Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, New York (2006) · Zbl 1104.65059
[18] Bache, K., Lichman, M.: UCI Machine Learning Repository. School of Information and Computer Science, University of California, Irvine, CA (2013). http://archive.ics.uci.edu/ml
[19] Yeh, I-Cheng, Modeling slump flow of concrete using second-order regressions and artificial neural networks, Cem. Concr. Compos., 29, 474-480, (2007)
[20] Cortez, P., Morais, A.: A data mining approach to predict forest fires using meteorological data. In: Neves, J., Santos, M.F., Machado, J. (eds.) New Trends in Artificial Intelligence, Proceedings of the 13th EPIA 2007-Portuguese Conference on Artificial Intelligence, Guimaraes, pp. 512-523 (2007)
[21] Yeh, I-Cheng, Modeling of strength of high performance concrete using artificial neural networks, Cem. Concr. Res., 28, 1797-1808, (1998)
[22] Cortez, P; Cerdeira, A; Almeida, F; Matos, T; Reis, J, Modeling wine preferences by data mining from physicochemical properties, Decis. Support Syst., 47, 547-553, (2009)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.