×

On the choice of the low-dimensional domain for global optimization via random embeddings. (English) Zbl 1440.90051

Summary: The challenge of taking many variables into account in optimization problems may be overcome under the hypothesis of low effective dimensionality. Then, the search of solutions can be reduced to the random embedding of a low dimensional space into the original one, resulting in a more manageable optimization problem. Specifically, in the case of time consuming black-box functions and when the budget of evaluations is severely limited, global optimization with random embeddings appears as a sound alternative to random search. Yet, in the case of box constraints on the native variables, defining suitable bounds on a low dimensional domain appears to be complex. Indeed, a small search domain does not guarantee to find a solution even under restrictive hypotheses about the function, while a larger one may slow down convergence dramatically. Here we tackle the issue of low-dimensional domain selection based on a detailed study of the properties of the random embedding, giving insight on the aforementioned difficulties. In particular, we describe a minimal low-dimensional set in correspondence with the embedded search space. We additionally show that an alternative equivalent embedding procedure yields simultaneously a simpler definition of the low-dimensional minimal set and better properties in practice. Finally, the performance and robustness gains of the proposed enhancements for Bayesian optimization are illustrated on numerical examples.

MSC:

90C26 Nonconvex programming, global optimization
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Binois, M.: Uncertainty quantification on Pareto fronts and high-dimensional strategies in Bayesian optimization, with applications in multi-objective automotive design. Ph.D. thesis, Ecole Nationale Supérieure des Mines de Saint-Etienne (2015)
[2] Binois, Mickaël; Ginsbourger, David; Roustant, Olivier, A Warped Kernel Improving Robustness in Bayesian Optimization Via Random Embeddings, Lecture Notes in Computer Science, 281-286 (2015), Cham: Springer International Publishing, Cham
[3] Carpentier, A., Munos, R.: Bandit theory meets compressed sensing for high dimensional stochastic linear bandit. In: International Conference on Artificial Intelligence and Statistics (2012)
[4] Černỳ, M., Goffin’s algorithm for zonotopes, Kybernetika, 48, 5, 890-906 (2012) · Zbl 1269.90141
[5] Chen, B., Castro, R., Krause, A.: Joint optimization and variable selection of high-dimensional Gaussian processes. In: Proceedings of International Conference on Machine Learning (ICML) (2012)
[6] Chen, Y., Hoffman, M.W., Colmenarejo, S.G., Denil, M., Lillicrap, T.P., de Freitas, N.: Learning to learn for global optimization of black box functions. arXiv preprint arXiv:1611.03824 (2016)
[7] Constantine, Pg; Dow, E.; Wang, Q., Active subspace methods in theory and practice: applications to kriging surfaces, SIAM J. Sci. Comput., 36, 4, A1500-A1524 (2014) · Zbl 1311.65008 · doi:10.1137/130916138
[8] Courrier, N.; Boucard, Pa; Soulier, B., Variable-fidelity modeling of structural analysis of assemblies, J. Glob. Optim., 64, 3, 577-613 (2016) · Zbl 1401.90272 · doi:10.1007/s10898-015-0345-9
[9] Dixon, L.; Szegö, G., The global optimization problem: an introduction, Towards Glob. Optim., 2, 1-15 (1978)
[10] Djolonga, J., Krause, A., Cevher, V.: High-dimensional Gaussian process bandits. In: Advances in Neural Information Processing Systems, pp. 1025-1033 (2013)
[11] Donoho, D.L.: High-dimensional data analysis: the curses and blessings of dimensionality. In: AMS Math Challenges Lecture pp. 1-32 (2000)
[12] Durrande, N.: Étude de classes de noyaux adaptées à la simplification et à linterprétation des modèles dapproximation. une approche fonctionnelle et probabiliste. Ph.D. thesis, Saint-Etienne, EMSE (2011)
[13] Durrande, N.; Ginsbourger, D.; Roustant, O., Additive kernels for Gaussian process modeling, Annales de la Facultée de Sciences de Toulouse, 21, 3, 481-499 (2012) · Zbl 1266.60068 · doi:10.5802/afst.1342
[14] Duvenaud, D.K.: Automatic model construction with Gaussian processes. Ph.D. thesis, University of Cambridge (2014)
[15] Feliot, P.; Bect, J.; Vazquez, E., A Bayesian approach to constrained single-and multi-objective optimization, J. Glob. Optim., 67, 1-37 (2015)
[16] Filliman, P., Extremum problems for zonotopes, Geometriae Dedicata, 27, 3, 251-262 (1988) · Zbl 0651.52011 · doi:10.1007/BF00181491
[17] Franey, M.; Ranjan, P.; Chipman, H., Branch and bound algorithms for maximizing expected improvement functions, J. Stat. Plan. Inference, 141, 1, 42-55 (2011) · Zbl 1197.62117 · doi:10.1016/j.jspi.2010.05.011
[18] Gardner, J., Guo, C., Weinberger, K., Garnett, R., Grosse, R.: Discovering and exploiting additive structure for Bayesian optimization. In: Artificial Intelligence and Statistics, pp. 1311-1319 (2017)
[19] Garnett, R., Osborne, M., Hennig, P.: Active learning of linear embeddings for Gaussian processes. In: 30th Conference on Uncertainty in Artificial Intelligence (UAI 2014), pp. 230-239. AUAI Press (2014)
[20] Gutmann, Hm, A radial basis function method for global optimization, J. Glob. Optim., 19, 3, 201-227 (2001) · Zbl 0972.90055 · doi:10.1023/A:1011255519438
[21] Hastie, T.; Tibshirani, R.; Friedman, J.; Franklin, J., The elements of statistical learning: data mining, inference and prediction, Math. Intell., 27, 2, 83-85 (2005)
[22] Hennig, P.; Schuler, Cj, Entropy search for information-efficient global optimization, J. Mach. Learn. Res., 98888, 1809-1837 (2012) · Zbl 1432.65073
[23] Huang, D.; Allen, Tt; Notz, Wi; Zeng, N., Global optimization of stochastic black-box systems via sequential kriging meta-models, J. Glob. Optim., 34, 3, 441-466 (2006) · Zbl 1098.90097 · doi:10.1007/s10898-005-2454-3
[24] Hutter, Frank; Hoos, Holger H.; Leyton-Brown, Kevin, Sequential Model-Based Optimization for General Algorithm Configuration, Lecture Notes in Computer Science, 507-523 (2011), Berlin, Heidelberg: Springer Berlin Heidelberg, Berlin, Heidelberg
[25] Iooss, B.; Lemaître, P.; Meloni, C.; Dellino, G., A review on global sensitivity analysis methods, Uncertainty Management in Simulation-Optimization of Complex Systems: Algorithms and Applications (2015), Berlin: Springer, Berlin · Zbl 1332.90003
[26] Ivanov, M.; Kuhnt, S., A parallel optimization algorithm based on FANOVA decomposition, Qual. Reliabil. Eng. Int., 30, 7, 961-974 (2014) · doi:10.1002/qre.1710
[27] Jones, D.; Schonlau, M.; Welch, W., Efficient global optimization of expensive black-box functions, J. Glob. Optim., 13, 4, 455-492 (1998) · Zbl 0917.90270 · doi:10.1023/A:1008306431147
[28] Kandasamy, K., Schneider, J., Póczos, B.: High dimensional Bayesian optimisation and bandits via additive models. In: Proceedings of The 32nd International Conference on Machine Learning, pp. 295-304 (2015)
[29] Krein, M.; Milman, D., On extreme points of regular convex sets, Studia Mathematica, 9, 1, 133-138 (1940) · Zbl 0063.03360 · doi:10.4064/sm-9-1-133-138
[30] Krityakierne, T.; Akhtar, T.; Shoemaker, Ca, SOP: parallel surrogate global optimization with Pareto center selection for computationally expensive single objective problems, J. Glob. Optim., 66, 1-21 (2016) · Zbl 1356.90136 · doi:10.1007/s10898-016-0407-7
[31] Laguna, M.; Martí, R., Experimental testing of advanced scatter search designs for global optimization of multimodal functions, J. Glob. Optim., 33, 2, 235-255 (2005) · Zbl 1093.90092 · doi:10.1007/s10898-004-1936-z
[32] Le, V.T.H., Stoica, C., Alamo, T., Camacho, E.F., Dumur, D.: Uncertainty representation based on set theory. Zonotopes, pp. 1-26 (2013)
[33] Li, C.L., Kandasamy, K., Póczos, B., Schneider, J.: High dimensional Bayesian optimization via restricted projection pursuit models. In: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, pp. 884-892 (2016)
[34] Liu, B.; Zhang, Q.; Gielen, Gg, A Gaussian process surrogate model assisted evolutionary algorithm for medium scale expensive optimization problems, IEEE Trans. Evolut. Comput., 18, 2, 180-192 (2014) · doi:10.1109/TEVC.2013.2248012
[35] Mardia, Kv; Kent, Jt; Bibby, Jm, Multivariate Analysis (Probability and Mathematical Statistics) (1980), Cambridge: Academic Press, Cambridge
[36] Mathar, R.; Zilinskas, A., A class of test functions for global optimization, J. Glob. Optim., 5, 2, 195-199 (1994) · Zbl 0809.90119 · doi:10.1007/BF01100693
[37] Mcmullen, P., On zonotopes, Trans. Am. Math. Soc., 159, 91-109 (1971) · Zbl 0223.52007 · doi:10.1090/S0002-9947-1971-0279689-2
[38] Meyer, Cd, Matrix Analysis and Applied Linear Algebra (2000), Philadelphia: Siam, Philadelphia · Zbl 0962.15001
[39] Mishra, S., Global Optimization by Differential Evolution and Particle Swarm Methods: Evaluation on Some Benchmark Functions (2006), Tech. rep.: University Library of Munich, Germany, Tech. rep.
[40] Mockus, J.; Tiesis, V.; Zilinskas, A., The application of Bayesian methods for seeking the extremum, Towards Glob. Optim., 2, 117-129, 2 (1978) · Zbl 0394.90090
[41] Morris, Md; Mitchell, Tj; Ylvisaker, D., Bayesian design and analysis of computer experiments: use of derivatives in surface prediction, Technometrics, 35, 3, 243-255 (1993) · Zbl 0785.62025 · doi:10.1080/00401706.1993.10485320
[42] Neal, Radford M., Bayesian Learning for Neural Networks (1996), New York, NY: Springer New York, New York, NY · Zbl 0888.62021
[43] Nguyen, Hh; Vu, V., Random matrices: law of the determinant, Ann. Probab., 42, 1, 146-167 (2014) · Zbl 1299.60005 · doi:10.1214/12-AOP791
[44] Oh, C., Gavves, E., Welling, M.: BOCK: Bayesian optimization with cylindrical kernels. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. In: Proceedings of Machine Learning Research, vol. 80, pp. 3868-3877. PMLR, Stockholmsmssan, Stockholm Sweden (2018). http://proceedings.mlr.press/v80/oh18a.html
[45] Qian, H., Hu, Y.Q., Yu, Y.: Derivative-free optimization of high-dimensional non-convex functions by sequential random embeddings. In: IJCAI 2016 (2016)
[46] Rasmussen, Ce; Williams, C., Gaussian Processes for Machine Learning (2006), Cambridge: MIT Press, Cambridge · Zbl 1177.68165
[47] Rios, Lm; Sahinidis, Nv, Derivative-free optimization: a review of algorithms and comparison of software implementations, J. Glob. Optim., 56, 3, 1247-1293 (2013) · Zbl 1272.90116 · doi:10.1007/s10898-012-9951-y
[48] Rolland, P., Scarlett, J., Bogunovic, I., Cevher, V.: High-dimensional Bayesian optimization via additive models with overlapping groups. In: International Conference on Artificial Intelligence and Statistics, pp. 298-307 (2018)
[49] Roustant, O.; Ginsbourger, D.; Deville, Y., DiceKriging, DiceOptim: two R packages for the analysis of computer experiments by kriging-based metamodeling and optimization, J. Stat. Softw., 51, 1, 1-55 (2012) · doi:10.18637/jss.v051.i01
[50] Salem, M.B., Bachoc, F., Roustant, O., Gamboa, F., Tomaso, L.: Sequential dimension reduction for learning features of expensive black-box functions (2018)
[51] Shahriari, B.; Swersky, K.; Wang, Z.; Adams, Rp; De Freitas, N., Taking the human out of the loop: a review of Bayesian optimization, Proc. IEEE, 104, 1, 148-175 (2016) · doi:10.1109/JPROC.2015.2494218
[52] Song, W.; Keane, Aj, Surrogate-based aerodynamic shape optimization of a civil aircraft engine nacelle, AIAA J., 45, 10, 2565-2574 (2007) · doi:10.2514/1.30015
[53] Turlach, B.A., Weingessel, A.: quadprog: Functions to solve Quadratic Programming Problems. (2013). https://CRAN.R-project.org/package=quadprog. R package version 1.5-5
[54] Vershynin, R.: Introduction to the non-asymptotic analysis of random matrices. arXiv preprint arXiv:1011.3027 (2010)
[55] Villemonteix, J.; Vazquez, E.; Sidorkiewicz, M.; Walter, E., Global optimization of expensive-to-evaluate functions: an empirical comparison of two sampling criteria, J. Glob. Optim., 43, 2, 373-389 (2009) · Zbl 1169.90460 · doi:10.1007/s10898-008-9313-y
[56] Viswanath, Asha; Forrester, A. I. J.; Keane, A. J., Dimension Reduction for Aerodynamic Design Optimization, AIAA Journal, 49, 6, 1256-1266 (2011) · doi:10.2514/1.J050717
[57] Wang, Z., Gehring, C., Kohli, P., Jegelka, S.: Batched large-scale Bayesian optimization in high-dimensional spaces. In: International Conference on Artificial Intelligence and Statistics (2018)
[58] Wang, Z.; Hutter, F.; Zoghi, M.; Matheson, D.; De Feitas, N., Bayesian optimization in a billion dimensions via random embeddings, J. Artif. Intell. Res. (JAIR), 55, 361-387 (2016) · Zbl 1358.90089 · doi:10.1613/jair.4806
[59] Wang, Z., Zoghi, M., Hutter, F., Matheson, D., de Freitas, N.: Bayesian optimization in high dimensions via random embeddings. In: IJCAI (2013) · Zbl 1358.90089
[60] Ziegler, Gm, Lectures on Polytopes (1995), Berlin: Springer, Berlin · Zbl 0823.52002
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.