Machine learning-based surrogate modeling for data-driven optimization: a comparison of subset selection for regression techniques.

*(English)* Zbl 1444.90119

Summary: Optimization of simulation-based or data-driven systems is a challenging task, which has attracted significant attention in the recent literature. A very efficient approach for optimizing systems without analytical expressions is through fitting surrogate models. Due to their increased flexibility, nonlinear interpolating functions, such as radial basis functions and Kriging, have been predominantly used as surrogates for data-driven optimization; however, these methods lead to complex nonconvex formulations. Alternatively, commonly used regression-based surrogates lead to simpler formulations, but they are less flexible and inaccurate if the form is not known a priori. In this work, we investigate the efficiency of subset selection regression techniques for developing surrogate functions that balance both accuracy and complexity. Subset selection creates sparse regression models by selecting only a subset of original features, which are linearly combined to generate a diverse set of surrogate models. Five different subset selection techniques are compared with commonly used nonlinear interpolating surrogate functions with respect to optimization solution accuracy, computation time, sampling requirements, and model sparsity. Our results indicate that subset selection-based regression functions exhibit promising performance when the dimensionality is low, while interpolation performs better for higher dimensional problems.
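As a rough illustration of what "selecting only a subset of original features, which are linearly combined" can look like, the sketch below fits a sparse surrogate with the lasso, solved by cyclic coordinate descent. The lasso is used here only as a well-known stand-in: the summary does not name the five techniques compared. The sampled response `3*x^2`, the polynomial candidate basis, the penalty `lam=0.01`, and all function names are assumptions made for this example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(columns, y, lam, sweeps=200):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    columns is a list of feature columns (each a list of n floats).
    """
    n = len(y)
    coefs = [0.0] * len(columns)
    residual = list(y)  # residual = y - X b, with b = 0 initially
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            scale = sum(v * v for v in col) / n
            if scale == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(v * (r + v * coefs[j]) for v, r in zip(col, residual)) / n
            new_coef = soft_threshold(rho, lam) / scale
            delta = new_coef - coefs[j]
            if delta != 0.0:
                residual = [r - delta * v for r, v in zip(residual, col)]
                coefs[j] = new_coef
    return coefs


# Hypothetical sampled data: 21 points of an assumed black-box response 3*x^2.
xs = [-1.0 + 0.1 * i for i in range(21)]
ys = [3.0 * x * x for x in xs]

# Candidate basis functions; the surrogate is a sparse linear combination of these.
names = ["x", "x^2", "x^3", "x^4"]
columns = [[x ** p for x in xs] for p in (1, 2, 3, 4)]

coefs = lasso_coordinate_descent(columns, ys, lam=0.01)
selected = {name: c for name, c in zip(names, coefs) if c != 0.0}
print(selected)
```

With these assumed data only the `x^2` term should survive, with a coefficient slightly below 3 because the L1 penalty shrinks it. The resulting surrogate is a single convex quadratic term, which is the kind of simple, sparse form that is easier to optimize than an interpolating model.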

##### MSC:

90C59 Approximation methods and heuristics in mathematical programming

##### Keywords:

machine learning; surrogate modeling; black-box optimization; data-driven optimization; subset selection for regression

S. H. Kim and F. Boukouvala, Optim. Lett. 14, No. 4, 989-1010 (2020; Zbl 1444.90119)

