
zbMATH — the first resource for mathematics

A tutorial on Gaussian process regression: modelling, exploring, and exploiting functions. (English) Zbl 1416.62648
Summary: This tutorial introduces the reader to Gaussian process regression as an expressive tool to model, actively explore, and exploit unknown functions. Gaussian process regression is a powerful, non-parametric Bayesian approach to regression problems that can be utilized in exploration and exploitation scenarios. This tutorial aims to provide an accessible introduction to these techniques. We introduce Gaussian processes, which generate distributions over functions used for Bayesian non-parametric regression, and demonstrate their use in applications and didactic examples, including simple regression problems, a demonstration of kernel-encoded prior assumptions and compositions, a pure-exploration scenario within an optimal design framework, and a bandit-like exploration-exploitation scenario in which the goal is to recommend movies. Beyond that, we describe a situation modelling risk-averse exploration in which an additional constraint (not to sample below a certain threshold) needs to be accounted for. Lastly, we summarize recent psychological experiments utilizing Gaussian processes. Software and literature pointers are also provided.
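The regression setting described in the summary rests on the standard GP posterior equations, which can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the tutorial itself: the RBF kernel, all hyperparameter values, and the sin target function are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential (RBF) covariance between two sets of 1-D inputs."""
    sq = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq / lengthscale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2, lengthscale=1.0):
    """Exact GP posterior mean and variance under Gaussian observation noise."""
    K = rbf_kernel(x_train, x_train, lengthscale) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test, lengthscale)
    K_ss = rbf_kernel(x_test, x_test, lengthscale)
    # Cholesky factorization for numerically stable solves with K
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha                      # posterior mean at test inputs
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)  # posterior variance at test inputs
    return mean, var

# Observe a smooth (hypothetical) function at a few points, predict elsewhere
x_train = np.array([-2.0, -1.0, 0.0, 1.5])
y_train = np.sin(x_train)
x_test = np.array([0.5, 3.0])
mean, var = gp_posterior(x_train, y_train, x_test)
# Near the data the posterior tracks the observations; far away (x = 3.0)
# the mean reverts toward the prior and the variance grows.
```

The two returned arrays give exactly the "distribution over functions" view: a point prediction plus a calibrated uncertainty that the exploration schemes below build on.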

MSC:
62P15 Applications of statistics to psychology
60G15 Gaussian processes
62G08 Nonparametric regression and quantile regression
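The bandit-like exploration-exploitation scenario from the summary can be illustrated with a GP-UCB acquisition rule in the style of Srinivas et al. [46]: sample the arm maximising posterior mean plus an upper confidence bonus. The quadratic reward function, grid of arms, and all parameter values below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(a, b, ell=1.0):
    """RBF covariance with unit prior variance between 1-D input sets."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

def ucb_step(x_obs, y_obs, candidates, beta=2.0, noise=1e-2):
    """Pick the candidate maximising posterior mean + beta * posterior sd."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf(x_obs, candidates)
    K_inv_Ks = np.linalg.solve(K, K_s)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * K_inv_Ks, axis=0)  # prior variance is 1
    return candidates[np.argmax(mean + beta * np.sqrt(np.maximum(var, 0.0)))]

# Hypothetical reward function, unknown to the agent, peaked at x = 0.7
f = lambda x: -(x - 0.7) ** 2
candidates = np.linspace(-2, 2, 81)

# Start with one random arm, then let GP-UCB choose the rest
x_obs = np.array([rng.uniform(-2, 2)])
y_obs = f(x_obs) + 1e-2 * rng.standard_normal(1)
for _ in range(15):
    x_next = ucb_step(x_obs, y_obs, candidates)
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, f(x_next) + 1e-2 * rng.standard_normal())
best = x_obs[np.argmax(y_obs)]
```

Early rounds favour arms with high posterior uncertainty (exploration); once the uncertainty bonus shrinks, the rule concentrates samples near the estimated optimum (exploitation). The risk-averse variant in the summary would additionally restrict candidates to arms whose lower confidence bound stays above the safety threshold.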
References:
[1] Akaike, H., A new look at the statistical model identification, IEEE Transactions on Automatic Control, 19, 6, 716-723, (1974) · Zbl 0314.62039
[2] Berkenkamp, F., Krause, A., & Schoellig, A. P. (2016). Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics, arXiv preprint arXiv:1602.04450.
[3] Borji, A.; Itti, L., Bayesian optimization explains human active search, (Advances in neural information processing systems, (2013)), 55-63
[4] Cavagnaro, D. R., Aranovich, G. J., McClure, S. M., Pitt, M. A., & Myung, J. I. (2014). On the functional form of temporal discounting: An optimized adaptive test.
[5] Chapelle, O.; Li, L., An empirical evaluation of Thompson sampling, (Advances in neural information processing systems, (2011)), 2249-2257
[6] Cox, G.; Kachergis, G.; Shiffrin, R., Gaussian process regression for trajectory analysis, (Proceedings of the cognitive science society, vol. 34, (2012))
[7] de Freitas, N., Smola, A., & Zoghi, M. (2012). Regret Bounds for Deterministic Gaussian Process Bandits, arXiv preprint arXiv:1203.2177.
[8] Desautels, T. A.; Choe, J.; Gad, P.; Nandra, M. S.; Roy, R. R.; Zhong, H., An active learning algorithm for control of epidural electrostimulation, IEEE Transactions on Biomedical Engineering, 62, 10, 2443-2455, (2015)
[9] Durrleman, S.; Simon, R., Flexible regression models with cubic splines, Statistics in Medicine, 8, 5, 551-561, (1989)
[10] Duvenaud, D., Lloyd, J. R., Grosse, R., Tenenbaum, J. B., & Ghahramani, Z. (2013). Structure discovery in nonparametric regression through compositional kernel search, arXiv preprint arXiv:1302.4922.
[11] Engbert, R.; Kliegl, R., Microsaccades keep the eyes’ balance during fixation, Psychological Science, 15, 6, 431, (2004)
[12] Flaxman, S., Gelman, A., Neill, D., Smola, A., Vehtari, A., & Wilson, A. G. (2015). Fast hierarchical Gaussian processes.
[13] Freeman, J. B.; Ambady, N., Mousetracker: software for studying real-time mental processing using a computer mouse-tracking method, Behavior Research Methods, 42, 1, 226-241, (2010)
[14] Gershman, S. J.; Blei, D. M., A tutorial on Bayesian nonparametric models, Journal of Mathematical Psychology, 56, 1, 1-12, (2012) · Zbl 1237.62062
[15] Gershman, S. J.; Malmaud, J.; Tenenbaum, J. B., Structured representations of utility in combinatorial domains, Decision, (2016)
[16] Gramacy, R. B. et al. (2007). tgp: an R package for Bayesian nonstationary, semiparametric nonlinear regression and design by treed Gaussian process models.
[17] Gramacy, R. B.; Apley, D. W., Local Gaussian process approximation for large computer experiments, Journal of Computational and Graphical Statistics, 24, 2, 561-578, (2014)
[18] Gramacy, R. B.; Lee, H. K., Bayesian treed Gaussian process models with an application to computer modeling, Journal of the American Statistical Association, 103, 483, (2008) · Zbl 1205.62218
[19] Hennig, P.; Osborne, M. A.; Girolami, M., Probabilistic numerics and uncertainty in computations, Proceedings of the Royal Society of London, Series A (Mathematical and Physical Sciences), 471, 2179, 20150142, (2015) · Zbl 1372.65010
[20] Hennig, P.; Schuler, C. J., Entropy search for information-efficient global optimization, Journal of Machine Learning Research (JMLR), 13, Jun, 1809-1837, (2012) · Zbl 1432.65073
[21] Jäkel, F.; Schölkopf, B.; Wichmann, F. A., A tutorial on kernel methods for categorization, Journal of Mathematical Psychology, 51, 6, 343-358, (2007) · Zbl 1207.68242
[22] Kac, M.; Siegert, A., An explicit representation of a stationary Gaussian process, The Annals of Mathematical Statistics, 438-442, (1947) · Zbl 0033.38501
[23] Katehakis, M. N.; Veinott, A. F., The multi-armed bandit problem: decomposition and computation, Mathematics of Operations Research, 12, 2, 262-268, (1987) · Zbl 0618.90097
[24] Kieslich, P. J.; Henninger, F., Mousetrap: an integrated, open-source mouse-tracking package, Behavior Research Methods, (2017)
[25] Krause, A., Sfo: A toolbox for submodular function optimization, Journal of Machine Learning Research (JMLR), 11, Mar, 1141-1144, (2010) · Zbl 1242.68233
[26] Krause, A.; Golovin, D., Submodular function maximization, Tractability: Practical Approaches to Hard Problems, 3, 19, (2012)
[27] Krause, A.; Singh, A.; Guestrin, C., Near-optimal sensor placements in Gaussian processes: theory, efficient algorithms and empirical studies, Journal of Machine Learning Research (JMLR), 9, 235-284, (2008) · Zbl 1225.68192
[28] Lawrence, N., Seeger, M., & Herbrich, R. (2003). Fast sparse Gaussian process methods: The informative vector machine. In Proceedings of the 16th annual conference on neural information processing systems (pp. 609-616).
[29] Lee, C. H., A phase space spline smoother for fitting trajectories, IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 34, 1, 346-356, (2004)
[30] Lloyd, J. R., Duvenaud, D., Grosse, R., Tenenbaum, J. B., & Ghahramani, Z. (2014). Automatic construction and natural-language description of nonparametric regression models, arXiv preprint arXiv:1402.4304.
[31] Lucas, C. G.; Griffiths, T. L.; Williams, J. J.; Kalish, M. L., A rational model of function learning, Psychonomic Bulletin & Review, 1-23, (2015)
[32] Matthews, A. G. de G.; van der Wilk, M.; Nickson, T.; Fujii, K.; Boukouvalas, A.; León-Villagrá, P., GPflow: A Gaussian process library using TensorFlow, Journal of Machine Learning Research (JMLR), 18, 40, 1-6, (2017) · Zbl 1437.62127
[33] May, B. C.; Korda, N.; Lee, A.; Leslie, D. S., Optimistic Bayesian sampling in contextual-bandit problems, Journal of Machine Learning Research (JMLR), 13, Jun, 2069-2106, (2012) · Zbl 1435.62034
[34] Meder, B.; Nelson, J. D., Information search with situation-specific reward functions, Judgment and Decision Making, 7, 2, 119-148, (2012)
[35] Močkus, J., On Bayesian methods for seeking the extremum, (Optimization techniques IFIP technical conference, (1975), Springer), 400-404 · Zbl 0311.90042
[36] Myung, J. I.; Cavagnaro, D. R.; Pitt, M. A., A tutorial on adaptive design optimization, Journal of Mathematical Psychology, 57, 3, 53-67, (2013) · Zbl 1284.62478
[37] Myung, J. I.; Pitt, M. A., Optimal experimental design for model discrimination, Psychological Review, 116, 3, 499, (2009)
[38] Rahimi, A.; Recht, B., Random features for large-scale kernel machines, (Advances in neural information processing systems, (2007)), 1177-1184
[39] Rasmussen, C. E.; Nickisch, H., Gaussian processes for machine learning (GPML) toolbox, Journal of Machine Learning Research (JMLR), 11, Nov, 3011-3015, (2010) · Zbl 1242.68242
[40] Schulz, E., Huys, Q. J., Bach, D. R., Speekenbrink, M., & Krause, A. (2016). Better safe than sorry: Risky function exploitation through safe optimization, arXiv preprint arXiv:1602.01052.
[41] Schulz, E.; Konstantinidis, E.; Speekenbrink, M., Putting bandits into context: how function learning supports decision making, Journal of Experimental Psychology. Learning, Memory, and Cognition, (2017)
[42] Schulz, E.; Speekenbrink, M.; Hernández-Lobato, J. M.; Ghahramani, Z.; Gershman, S. J., Quantifying mismatch in Bayesian optimization, (Nips workshop on Bayesian optimization: Black-box optimization and beyond, (2016))
[43] Schulz, E.; Tenenbaum, J. B.; Duvenaud, D.; Speekenbrink, M.; Gershman, S. J., Probing the compositionality of intuitive functions, Advances in Neural Information Processing Systems, 29, (2016)
[44] Schulz, E.; Tenenbaum, J. B.; Reshef, D. N.; Speekenbrink, M.; Gershman, S. J., Assessing the perceived predictability of functions, (Proceedings of the thirty-seventh annual conference of the cognitive science society, (2015))
[45] Sheffield ML group (2012). GPy: A Gaussian process framework in Python. http://github.com/SheffieldML/GPy.
[46] Srinivas, N., Krause, A., Kakade, S. M., & Seeger, M. (2009). Gaussian process optimization in the bandit setting: No regret and experimental design, arXiv preprint arXiv:0912.3995. · Zbl 1365.94131
[47] Sui, Y., Gotovos, A., Burdick, J., & Krause, A. (2015). Safe exploration for optimization with Gaussian processes. In Proceedings of the 32nd international conference on machine learning (pp. 997-1005).
[48] Thompson, W. R., On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika, 25, 3/4, 285-294, (1933) · JFM 59.1159.03
[49] Van Zandt, T.; Townsend, J. T., Designs for and analyses of response time experiments, The Oxford Handbook of Quantitative Methods: Foundations, 1, 260, (2014)
[50] Vanhatalo, J.; Riihimäki, J.; Hartikainen, J.; Jylänki, P.; Tolvanen, V.; Vehtari, A., GPstuff: Bayesian modeling with Gaussian processes, Journal of Machine Learning Research (JMLR), 14, Apr, 1175-1179, (2013) · Zbl 1320.62010
[51] Wagenmakers, E.-J.; Farrell, S.; Ratcliff, R., Estimation and interpretation of 1/f\(\alpha\) noise in human cognition, Psychonomic Bulletin & Review, 11, 4, 579-615, (2004)
[52] Wetzels, R.; Vandekerckhove, J.; Tuerlinckx, F.; Wagenmakers, E.-J., Bayesian parameter estimation in the expectancy valence model of the Iowa gambling task, Journal of Mathematical Psychology, 54, 1, 14-27, (2010) · Zbl 1203.91255
[53] Williams, C. K., Prediction with Gaussian processes: from linear regression to linear prediction and beyond, (Learning in graphical models, (1998), Springer), 599-621 · Zbl 0921.62121
[54] Rasmussen, C. E.; Williams, C. K. I., Gaussian processes for machine learning, (2006), The MIT Press
[55] Wilson, A. G.; Adams, R. P., Gaussian process kernels for pattern discovery and extrapolation, (Proceedings of the 30th international conference on machine learning (ICML), (2013)), 1067-1075
[56] Wilson, A. G.; Dann, C.; Lucas, C.; Xing, E. P., The human kernel, (Advances in neural information processing systems, (2015)), 2854-2862
[57] Wu, C. M., Schulz, E., Speekenbrink, M., Nelson, J. D., & Meder, B. (2017). Exploration and generalization in vast spaces, bioRxiv preprint 171371.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.