
Practical Hilbert space approximate Bayesian Gaussian processes for probabilistic programming. (English) Zbl 1502.62024

Summary: Gaussian processes are powerful non-parametric probabilistic models for stochastic functions. However, a direct implementation entails a computational complexity that becomes intractable when the number of observations is large, especially when estimated with fully Bayesian methods such as Markov chain Monte Carlo. In this paper, we focus on a low-rank approximate Bayesian Gaussian process, based on a basis function approximation via Laplace eigenfunctions for stationary covariance functions. The main contributions of this paper are a detailed analysis of the approximation's performance and practical recommendations for selecting the number of basis functions and the boundary factor. Intuitive visualizations and recommendations make it easier for users to improve approximation accuracy and computational performance. We also propose diagnostics for checking that the number of basis functions and the boundary factor are adequate given the data. The approach is simple and exhibits an attractive computational complexity due to its linear structure, and it is easy to implement in probabilistic programming frameworks. Several illustrative examples of the performance and applicability of the method in the probabilistic programming language Stan are presented together with the underlying Stan model code.
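
The summary describes the approximation only verbally; as a rough illustration (not taken from the paper, which provides Stan model code rather than Python), the following minimal NumPy sketch draws one sample path from the Hilbert space approximation of a one-dimensional GP with a squared exponential covariance function, following Solin and Särkkä [42]. The function name hsgp_sample and all parameter values are hypothetical choices for illustration; m is the number of basis functions and c the boundary factor discussed in the summary.

```python
import numpy as np

def hsgp_sample(x, alpha, lengthscale, m=20, c=1.5, seed=0):
    """Sample a path from a low-rank Hilbert space GP approximation (1D,
    squared exponential kernel). m: number of basis functions; c: boundary
    factor, giving the approximation domain [-L, L] with L = c * max|x|."""
    rng = np.random.default_rng(seed)
    L = c * np.max(np.abs(x))                     # boundary of the approximation domain
    j = np.arange(1, m + 1)
    sqrt_lam = j * np.pi / (2 * L)                # square roots of the Laplace eigenvalues
    # Laplace eigenfunctions phi_j(x) = sqrt(1/L) * sin(sqrt(lambda_j) * (x + L))
    phi = np.sqrt(1 / L) * np.sin(sqrt_lam[None, :] * (x[:, None] + L))
    # Spectral density of the squared exponential kernel at the eigenfrequencies
    spd = alpha**2 * np.sqrt(2 * np.pi) * lengthscale * \
          np.exp(-0.5 * (lengthscale * sqrt_lam) ** 2)
    beta = rng.standard_normal(m)                 # standard normal basis weights
    # f(x) ~= sum_j sqrt(S(sqrt(lambda_j))) * phi_j(x) * beta_j
    return phi @ (np.sqrt(spd) * beta)

x = np.linspace(-2, 2, 200)
f = hsgp_sample(x, alpha=1.0, lengthscale=0.5)
```

In the Bayesian setting the weights beta_j become model parameters with standard normal priors, so the GP prior enters the probabilistic program as an ordinary linear-in-parameters term; this linear structure is what yields the attractive computational cost mentioned in the summary.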

MSC:

62-08 Computational methods for problems pertaining to statistics
60G15 Gaussian processes
62F15 Bayesian inference

References:

[1] Abramowitz, M.; Stegun, I., Handbook of Mathematical Functions (1970), New York: Dover Publications, New York · Zbl 0171.38503
[2] Adler, RJ, The Geometry of Random Fields (1981), Philadelphia: SIAM, Philadelphia · Zbl 0478.60059
[3] Akhiezer, NI; Glazman, IM, Theory of Linear Operators in Hilbert Space (1993), New York: Dover, New York · Zbl 0874.47001
[4] Andersen, MR; Vehtari, A.; Winther, O.; Hansen, LK, Bayesian inference for spatio-temporal spike-and-slab priors, J. Mach. Learn. Res., 18, 139, 1-58 (2017) · Zbl 1442.62049
[5] Betancourt, M.: A conceptual introduction to Hamiltonian Monte Carlo. arXiv preprint arXiv:1701.02434 (2017)
[6] Betancourt, M., Girolami, M.: Hamiltonian Monte Carlo for hierarchical models. In: Current Trends in Bayesian Methodology with Applications. Chapman and Hall/CRC, pp. 79-101 (2019)
[7] Briol, F.X., Oates, C., Girolami, M., Osborne, M.A., Sejdinovic, D.: Probabilistic integration: a role in statistical computation? arXiv preprint arXiv:1512.00933 (2015) · Zbl 1420.62135
[8] Brooks, S.; Gelman, A.; Jones, G.; Meng, XL, Handbook of Markov Chain Monte Carlo (2011), London: CRC Press, London · Zbl 1218.65001
[9] Bürkner, PC, brms: an R package for Bayesian multilevel models using Stan, J. Stat. Softw., 80, 1, 1-28 (2017)
[10] Burt, D., Rasmussen, C.E., Van Der Wilk, M.: Rates of convergence for sparse variational Gaussian process regression. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning, PMLR, Proceedings of Machine Learning Research, vol. 97, pp. 862-871 (2019)
[11] Carlin, BP; Gelfand, AE; Banerjee, S., Hierarchical Modeling and Analysis for Spatial Data (2014), London: Chapman and Hall/CRC, London · Zbl 1358.62009
[12] Carpenter, B.; Gelman, A.; Hoffman, MD; Lee, D.; Goodrich, B.; Betancourt, M.; Brubaker, M.; Guo, J.; Li, P.; Riddell, A., Stan: a probabilistic programming language, J. Stat. Softw., 76, 1, 1-32 (2017)
[13] Cramér, H.; Leadbetter, MR, Stationary and Related Stochastic Processes: Sample Function Properties and Their Applications (2013), North Chelmsford: Courier Corporation, North Chelmsford · Zbl 0162.21102
[14] Csató, L., Fokoué, E., Opper, M., Schottky, B., Winther, O.: Efficient approaches to Gaussian process classification. In: Advances in Neural Information Processing Systems, pp. 251-257 (2000)
[15] Deisenroth, MP; Fox, D.; Rasmussen, CE, Gaussian processes for data-efficient learning in robotics and control, IEEE Trans. Pattern Anal. Mach. Intell., 37, 2, 408-423 (2015)
[16] Diggle, PJ, Statistical Analysis of Spatial and Spatio-temporal Point Patterns (2013), London: Chapman and Hall/CRC, London · Zbl 1435.62004
[17] Furrer, EM; Nychka, DW, A framework to understand the asymptotic properties of kriging and splines, J. Korean Stat. Soc., 36, 1, 57-76 (2007) · Zbl 1115.62321
[18] Gal, Y., Turner, R.: Improving the Gaussian process sparse spectrum approximation by representing uncertainty in frequency inputs. In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference on Machine Learning, PMLR, Proceedings of Machine Learning Research, vol. 37, pp. 655-664 (2015)
[19] Gelman, A.; Carlin, JB; Stern, HS; Dunson, DB; Vehtari, A.; Rubin, DB, Bayesian Data Analysis (2013), London: Chapman and Hall/CRC, London · Zbl 1279.62004
[20] Gelman, A.; Hill, J.; Vehtari, A., Regression and Other Stories (2020), Cambridge: Cambridge University Press, Cambridge · Zbl 1476.62007
[21] Gibbs, MN; MacKay, DJ, Variational Gaussian process classifiers, IEEE Trans. Neural Netw., 11, 6, 1458-1464 (2000)
[22] GPy: GPy: a Gaussian process framework in Python. http://github.com/SheffieldML/GPy (2012)
[23] Grenander, U., Abstract Inference (1981), Hoboken, NJ: Wiley, Hoboken, NJ · Zbl 0505.62069
[24] Hennig, P.; Osborne, MA; Girolami, M., Probabilistic numerics and uncertainty in computations, Proc. R. Soc. A: Math. Phys. Eng. Sci., 471, 2179, 20150142 (2015) · Zbl 1372.65010
[25] Hensman, J.; Durrande, N.; Solin, A., Variational Fourier features for Gaussian processes, J. Mach. Learn. Res., 18, 1, 5537-5588 (2017) · Zbl 1467.62152
[26] Jo, S.; Choi, T.; Park, B.; Lenk, P., bsamGP: an R package for Bayesian spectral analysis models using Gaussian process priors, J. Stat. Softw., 90, 10, 1-41 (2019)
[27] Lázaro-Gredilla, M.: Sparse Gaussian processes for large-scale machine learning. Ph.D. thesis, Universidad Carlos III de Madrid (2010)
[28] Lindgren, F.; Bolin, D.; Rue, H., The SPDE approach for Gaussian and non-Gaussian fields: 10 years and still running, Spatial Stat., 50, 100599 (2022)
[29] Loève, M., Probability Theory (1977), New York: Springer-Verlag, New York · Zbl 0359.60001
[30] Matthews, AGG; van der Wilk, M.; Nickson, T.; Fujii, K.; Boukouvalas, A.; León-Villagrá, P.; Ghahramani, Z.; Hensman, J., GPflow: a Gaussian process library using TensorFlow, J. Mach. Learn. Res., 18, 40, 1-6 (2017) · Zbl 1437.62127
[31] Minka, T.P.: Expectation propagation for approximate Bayesian inference. In: Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., pp. 362-369 (2001)
[32] Neal, R.M.: Monte Carlo implementation of Gaussian process models for Bayesian regression and classification. arXiv preprint physics/9701026 (1997)
[33] Quiñonero-Candela, J.; Rasmussen, CE, A unifying view of sparse approximate Gaussian process regression, J. Mach. Learn. Res., 6, Dec, 1939-1959 (2005) · Zbl 1222.68282
[34] Quiñonero-Candela, J.; Rasmussen, CE; Figueiras-Vidal, AR, Sparse spectrum Gaussian process regression, J. Mach. Learn. Res., 11, Jun, 1865-1881 (2010) · Zbl 1242.62098
[35] R Core Team: R: a language and environment for statistical computing. http://www.R-project.org/ (2019)
[36] Rahimi, A.; Recht, B.; Platt, JC; Koller, D.; Singer, Y.; Roweis, ST, Random features for large-scale kernel machines, Advances in Neural Information Processing Systems, 1177-1184 (2008), Red Hook: Curran Associates Inc., Red Hook
[37] Rahimi, A.; Recht, B.; Koller, D.; Schuurmans, D.; Bengio, Y.; Bottou, L., Weighted sums of random kitchen sinks: replacing minimization with randomization in learning, Advances in Neural Information Processing Systems, 1313-1320 (2009), Red Hook: Curran Associates Inc, Red Hook
[38] Rasmussen, CE; Nickisch, H., Gaussian processes for machine learning (GPML) toolbox, J. Mach. Learn. Res., 11, 3011-3015 (2010) · Zbl 1242.68242
[39] Rasmussen, CE; Williams, CK, Gaussian Processes for Machine Learning (2006), Cambridge: MIT Press, Cambridge · Zbl 1177.68165
[40] Roberts, S.J.: Bayesian Gaussian processes for sequential prediction, optimisation and quadrature. Ph.D. thesis, University of Oxford (2010)
[41] Solin, A., Särkkä, S.: Explicit link between periodic covariance functions and state space models. In: Kaski, S., Corander, J. (eds.) Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR, Proceedings of Machine Learning Research, vol. 33, pp. 904-912 (2014)
[42] Solin, A., Särkkä, S.: Hilbert space methods for reduced-rank Gaussian process regression. Stat. Comput. 30(2), 419-446 (2020). Much of the work in this paper is based on the preprint version predating the published paper; preprint available at https://arxiv.org/abs/1401.5508 · Zbl 1436.62316
[43] Särkkä, S.; Solin, A.; Hartikainen, J., Spatiotemporal learning via infinite-dimensional Bayesian filtering and smoothing: a look at Gaussian process regression through Kalman filtering, IEEE Signal Process. Mag., 30, 4, 51-61 (2013)
[44] Stan Development Team: Stan modeling language users guide and reference manual, 2.28. https://mc-stan.org (2021)
[45] Van Trees, H.L.: Detection, Estimation, and Modulation Theory, Part I: Detection, Estimation, and Linear Modulation Theory. John Wiley & Sons, New York, NY (1968) · Zbl 0202.18002
[46] Vanhatalo, J., Riihimäki, J., Hartikainen, J., Jylänki, P., Tolvanen, V., Vehtari, A.: GPstuff: Bayesian modeling with Gaussian processes. J. Mach. Learn. Res. 14(1), 1175-1179 (2013) · Zbl 1320.62010
[47] Vehtari, A.; Ojanen, J., A survey of Bayesian predictive methods for model assessment, selection and comparison, Stat. Surv., 6, 142-228 (2012) · Zbl 1302.62011
[48] Vehtari, A., Gelman, A., Gabry, J.: Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat. Comput. 27(5), 1413-1432 (2017) · Zbl 1505.62408
[49] Wahba, G., Spline Models for Observational Data (1990), Philadelphia: SIAM, Philadelphia · Zbl 0813.62001
[50] Williams, CK; Barber, D., Bayesian classification with Gaussian processes, IEEE Trans. Pattern Anal. Mach. Intell., 20, 12, 1342-1351 (1998) · Zbl 04549972
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases, those data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or perfect matching.