×

An efficient optimization approach for best subset selection in linear regression, with application to model selection and fitting in autoregressive time-series. (English) Zbl 1435.90088

Summary: In this paper, we consider two relevant optimization problems: the problem of selecting the best sparse linear regression model and the problem of optimally identifying the parameters of auto-regressive models based on time series data. Usually these problems, which although different are indeed related, are solved through a sequence of separate steps, alternating between choosing a subset of features and then finding a best fit regression. In this paper we propose to model both problems as mixed integer non linear optimization ones and propose numerical procedures based on state of the art optimization tools in order to solve both of them. The proposed approach has the advantage of considering both model selection as well as parameter estimation as a single optimization problem. Numerical experiments performed on widely available datasets as well as on synthetic ones confirm the high quality of our approach, both in terms of the quality of the resulting models and in terms of CPU time.

MSC:

90C11 Mixed integer programming
62J05 Linear regression; mixed models
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Akaike, H., A new look at the statistical model identification, IEEE Trans. Autom. Control, 19, 716-723 (1974) · Zbl 0314.62039 · doi:10.1109/TAC.1974.1100705
[2] Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Selected Papers of Hirotugu Akaike, pp. 199-213. Springer (1998)
[3] Anderson, D.A., Burnham, K.P., Anderson, D., Burnham, K.P.: Model Selection and Inference: A Practical Information-Theoretic Approach. Springer, New York (1998) · Zbl 0920.62006
[4] Bagirov, A.; Clausen, C.; Kohler, M., An algorithm for the estimation of a regression function by continuous piecewise linear functions, Comput. Optim. Appl., 45, 159-179 (2010) · Zbl 1187.90267 · doi:10.1007/s10589-008-9174-9
[5] Bertsekas, D.P.: Nonlinear Programming. Athena Scientific, Nashua (2016) · Zbl 1360.90236
[6] Bertsimas, D.; King, A., Or forum-an algorithmic approach to linear regression, Oper. Res., 64, 2-16 (2015) · Zbl 1338.90272 · doi:10.1287/opre.2015.1436
[7] Bertsimas, D.; King, A.; Mazumder, R.; etal., Best subset selection via a modern optimization lens, Ann. Stat., 44, 813-852 (2016) · Zbl 1335.62115 · doi:10.1214/15-AOS1388
[8] Bertsimas, D.; Shioda, R., Algorithm for cardinality-constrained quadratic optimization, Comput. Optim. Appl., 43, 1-22 (2009) · Zbl 1178.90262 · doi:10.1007/s10589-007-9126-9
[9] Bertsimas, D., Van Parys, B.: Sparse high-dimensional regression: exact scalable algorithms and phase transitions. arXiv preprint arXiv:1709.10029, (2017) · Zbl 1444.62094
[10] Box, G.E., Jenkins, G.M., Reinsel, G.C., Ljung, G.M.: Time Series Analysis: Forecasting and Control. Wiley, Hoboken (2015) · Zbl 1317.62001
[11] Bozdogan, H., Akaike’s information criterion and recent developments in information complexity, J. Math. Psychol., 44, 62-91 (2000) · Zbl 1047.62501 · doi:10.1006/jmps.1999.1277
[12] Byrd, RH; Lu, P.; Nocedal, J.; Zhu, C., A limited memory algorithm for bound constrained optimization, SIAM J. Sci. Stat. Comput., 16, 1190-1208 (1995) · Zbl 0836.65080 · doi:10.1137/0916069
[13] Cozad, A.; Sahinidis, NV; Miller, DC, Learning surrogate models for simulation-based optimization, AIChE J., 60, 2211-2227 (2014) · doi:10.1002/aic.14418
[14] Livera, AM; Hyndman, RJ; Snyder, RD, Forecasting time series with complex seasonal patterns using exponential smoothing, J. Am. Stat. Assoc., 106, 1513-1527 (2011) · Zbl 1234.62123 · doi:10.1198/jasa.2011.tm09771
[15] Dolan, ED; Moré, JJ, Benchmarking optimization software with performance profiles, Math. Program., 91, 201-213 (2002) · Zbl 1049.90004 · doi:10.1007/s101070100263
[16] Dua, D., Graff, C.: UCI machine learning repository. 2017. University of California, Irvine, School of Information and Computer Sciences. http://archive.ics.uci.edu/ml
[17] Efroymson, M.: Multiple regression analysis. Mathematical methods for digital computers, pp. 191-203 (1960)
[18] Garside, M., Some computational procedures for the best subset problem, J. Roy. Stat. Soc.: Ser. C (Appl. Stat.), 20, 8-15 (1971)
[19] Gómez, A., Prokopyev, O.: A mixed-integer fractional optimization approach to best subset selection. Technical Report Optimization On Line, 6795, Swanson School of Engineering, University of Pittsburgh, (2018) · Zbl 07362333
[20] Gurobi Optimization LLC. Gurobi optimizer reference manual, (2018). http://www.gurobi.com
[21] Hamilton, J.D.: Time Series Analysis, vol. 2. Princeton University Press, Princeton, NJ (1994) · Zbl 0831.62061
[22] Hannan, EJ; Quinn, BG, The determination of the order of an autoregression, J. Roy. Stat. Soc.: Ser. B (Methodol.), 41, 190-195 (1979) · Zbl 0408.62076
[23] Hyndman, R.J., Khandakar, Y. et al.: Automatic time series for forecasting: the forecast package for R. Number 6/07. Monash University, Department of Econometrics and Business Statistics, (2007)
[24] Kimura, K.; Waki, H., Minimization of Akaike’s information criterion in linear regression analysis via mixed integer nonlinear program, Optim. Methods Softw., 33, 633-649 (2018) · Zbl 1398.90111 · doi:10.1080/10556788.2017.1333611
[25] Konishi, S., Kitagawa, G.: Information Criteria and Statistical Modeling. Springer, New York (2008) · Zbl 1172.62003 · doi:10.1007/978-0-387-71887-3
[26] Kwiatkowski, D.; Phillips, PC; Schmidt, P.; Shin, Y., Testing the null hypothesis of stationarity against the alternative of a unit root: How sure are we that economic time series have a unit root?, J. Econ., 54, 159-178 (1992) · Zbl 0871.62100 · doi:10.1016/0304-4076(92)90104-Y
[27] Medeiros, MC; Resende, MG; Veiga, A., Piecewise linear time series estimation with GRASP, Comput. Optim. Appl., 19, 127-144 (2001) · Zbl 1168.90590 · doi:10.1023/A:1011238718363
[28] Miller, A.: Subset Selection in Regression. Chapman and Hall/CRC, London (2002) · Zbl 1051.62060 · doi:10.1201/9781420035933
[29] Miyashiro, R.; Takano, Y., Mixed integer second-order cone programming formulations for variable selection in linear regression, Eur. J. Oper. Res., 247, 721-731 (2015) · Zbl 1346.90616 · doi:10.1016/j.ejor.2015.06.081
[30] Miyashiro, R.; Takano, Y., Subset selection by Mallows’ Cp: a mixed integer programming approach, Expert Syst. Appl., 42, 325-331 (2015) · doi:10.1016/j.eswa.2014.07.056
[31] Natarajan, BK, Sparse approximate solutions to linear systems, SIAM J. Comput., 24, 227-234 (1995) · Zbl 0827.68054 · doi:10.1137/S0097539792240406
[32] Osborn, DR; Chui, AP; Smith, JP; Birchenhall, CR, Seasonality and the order of integration for consumption, Oxf. Bull. Econ. Stat., 50, 361-377 (1988) · doi:10.1111/j.1468-0084.1988.mp50004002.x
[33] Sato, T.; Takano, Y.; Miyashiro, R.; Yoshise, A., Feature subset selection for logistic regression via mixed integer optimization, Comput. Optim. Appl., 64, 865-880 (2016) · Zbl 1352.90068 · doi:10.1007/s10589-016-9832-2
[34] Schwarz, G., Estimating the dimension of a model, Ann. Stat., 6, 461-464 (1978) · Zbl 0379.62005 · doi:10.1214/aos/1176344136
[35] Shen, X.; Pan, W.; Zhu, Y.; Zhou, H., On constrained and regularized high-dimensional regression, Ann. Inst. Stat. Math., 65, 807-832 (2013) · Zbl 1329.62307 · doi:10.1007/s10463-012-0396-3
[36] Shibata, R., Selection of the order of an autoregressive model by Akaike’s information criterion, Biometrika, 63, 117-126 (1976) · Zbl 0358.62048 · doi:10.1093/biomet/63.1.117
[37] Team, R.C. et al.: R: A language and environment for statistical computing (2013)
[38] Tibshirani, R., Regression shrinkage and selection via the Lasso, J. R. Stat. Soc. Ser. B Methodol. 2, 58, 267-288 (1996) · Zbl 0850.62538
[39] Wilson, ZT; Sahinidis, NV, The ALAMO approach to machine learning, Comput. Chem. Eng., 106, 785-795 (2017) · doi:10.1016/j.compchemeng.2017.02.010
[40] Zheng, Z.; Fan, Y.; Lv, J., High dimensional thresholded regression and shrinkage effect, J. R. Stat. Soc. Ser. B Stat. Methodol., 76, 627-649 (2014) · Zbl 1411.62049 · doi:10.1111/rssb.12037
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.