## Approximation of Bayesian predictive $$p$$-values with regression ABC. (English) Zbl 06873718

Summary: In the Bayesian framework a standard approach to model criticism is to compare some function of the observed data to a reference predictive distribution. The result of the comparison can be summarized in the form of a $$p$$-value, and computation of some kinds of Bayesian predictive $$p$$-values can be challenging. The use of regression adjustment approximate Bayesian computation (ABC) methods is explored for this task. Two problems are considered. The first is approximation of distributions of prior predictive $$p$$-values for the purpose of choosing weakly informative priors in the case where the model checking statistic is expensive to compute. Here the computation is difficult because of the need to repeatedly sample from a prior predictive distribution for different values of a prior hyperparameter. The second problem considered is the calibration of posterior predictive $$p$$-values so that they are uniformly distributed under some reference distribution for the data. Computation is difficult because the calibration process requires repeated approximation of the posterior for different data sets under the reference distribution. In both these problems we argue that high accuracy in the computations is not required, which makes fast approximations such as regression adjustment ABC very useful. We illustrate our methods with several examples.
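To make the idea concrete, the following is a minimal sketch of local-linear regression adjustment ABC used to approximate a posterior predictive $$p$$-value. It is an illustration under stated assumptions, not the authors' implementation: the toy model (normal data with unit variance and a normal prior on the mean), the checking statistic (the sample maximum), and the function names `regression_abc` and `posterior_predictive_pvalue` are all hypothetical choices made for this example.

```python
import numpy as np


def regression_abc(s_obs, theta, s, accept_frac=0.1):
    """Local-linear regression adjustment on the simulations nearest to s_obs.

    Returns the adjusted parameter values and their kernel weights.
    """
    dist = np.abs(s - s_obs)
    h = np.quantile(dist, accept_frac)       # bandwidth = acceptance tolerance
    keep = dist <= h
    w = 1.0 - (dist[keep] / h) ** 2          # Epanechnikov kernel weights
    # Weighted least squares of theta on (s - s_obs) among accepted draws.
    X = np.column_stack([np.ones(keep.sum()), s[keep] - s_obs])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], theta[keep] * sw, rcond=None)
    # Shift each accepted draw to where it would sit if s had equalled s_obs.
    adjusted = theta[keep] - coef[1] * (s[keep] - s_obs)
    return adjusted, w


def posterior_predictive_pvalue(y_obs, n_sim=20000, prior_sd=10.0, seed=0):
    """Approximate P(T(y_rep) >= T(y_obs) | y_obs) with T = max (toy model)."""
    rng = np.random.default_rng(seed)
    n = y_obs.size
    theta = rng.normal(0.0, prior_sd, n_sim)        # draws from the prior
    s = rng.normal(theta, 1.0 / np.sqrt(n))         # simulated sample means
    adjusted, w = regression_abc(y_obs.mean(), theta, s)
    # Resample according to the kernel weights to get approximate posterior draws.
    draws = rng.choice(adjusted, size=2000, p=w / w.sum())
    # One replicate data set per posterior draw, then compare check statistics.
    y_rep = rng.normal(draws[:, None], 1.0, size=(draws.size, n))
    p_value = np.mean(y_rep.max(axis=1) >= y_obs.max())
    return p_value, draws
```

Calling `posterior_predictive_pvalue(y)` on a one-dimensional NumPy array `y` returns the approximate $$p$$-value together with the adjusted posterior draws. Because the same prior simulations can be reused for any observed summary, repeating the calculation over many reference data sets (as calibration requires) costs only one regression fit per data set, which is the kind of speed-for-accuracy trade the summary argues is acceptable here.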

### MSC:

62F15 Bayesian inference

62M20 Inference from stochastic processes and prediction

abc; sm