Efficient emulators of computer experiments using compactly supported correlation functions, with an application to cosmology. (English) Zbl 1234.62166

Summary: Statistical emulators of computer simulators have proven to be useful in a variety of applications. The widely adopted model for emulator building, using a Gaussian process model with strictly positive correlation function, is computationally intractable when the number of simulator evaluations is large. We propose a new model that uses a combination of low-order regression terms and compactly supported correlation functions to recreate the desired predictive behavior of the emulator at a fraction of the computational cost. Following the usual approach of taking the correlation to be a product of correlations in each input dimension, we show how to impose restrictions on the ranges of the correlations, giving sparsity, while also allowing the ranges to trade off against one another, thereby giving good predictive performance. We illustrate the method using data from a computer simulator of photometric redshift with 20,000 simulator evaluations and 80,000 predictions.
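The sparsity mechanism described in the summary can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the Bohman function as the one-dimensional compactly supported correlation and picks hypothetical ranges `taus` with a small sum; the summary itself specifies neither the function nor the exact form of the range restriction. The names `bohman`, `product_corr` and `sparse_corr_matrix` are invented for this example.

```python
import math
import random


def bohman(t, tau):
    """Bohman correlation (assumed choice): 1 at t = 0, exactly 0 for |t| >= tau."""
    t = abs(t)
    if tau <= 0.0 or t >= tau:
        return 1.0 if t == 0.0 else 0.0
    r = t / tau
    return (1.0 - r) * math.cos(math.pi * r) + math.sin(math.pi * r) / math.pi


def product_corr(x, y, taus):
    """Product of one-dimensional correlations, one range per input dimension."""
    corr = 1.0
    for xk, yk, tau in zip(x, y, taus):
        corr *= bohman(xk - yk, tau)
        if corr == 0.0:
            # One dimension out of range makes the whole product zero.
            break
    return corr


def sparse_corr_matrix(design, taus):
    """Store only the nonzero upper-triangle entries as {(i, j): value}, i <= j."""
    entries = {}
    n = len(design)
    for i in range(n):
        for j in range(i, n):
            c = product_corr(design[i], design[j], taus)
            if c != 0.0:
                entries[(i, j)] = c
    return entries


random.seed(0)
# Hypothetical design: 200 points in the unit cube with 3 inputs.
design = [[random.random() for _ in range(3)] for _ in range(200)]
# Hypothetical ranges: a longer range in one input is paid for by shorter
# ranges in the others, so the overall matrix stays sparse.
taus = [0.3, 0.2, 0.1]
entries = sparse_corr_matrix(design, taus)
n = len(design)
# Fraction of the full n x n matrix that is nonzero (off-diagonals count twice).
density = (2 * len(entries) - n) / (n * n)
print(f"nonzero fraction: {density:.3f}")
```

The double loop is quadratic in the number of design points and is only meant to make the idea visible. At the scale mentioned in the summary one would presumably find neighboring pairs with a spatial search and pass the result to a sparse matrix factorization.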


62P99 Applications of statistics
68U20 Simulation (MSC2010)
85A40 Astrophysical cosmology
62H20 Measures of association (correlation, canonical correlation, etc.)
62M99 Inference from stochastic processes
62F15 Bayesian inference
65C60 Computational problems in statistics (MSC2010)



