Post-processing posterior predictive $p$ values. (English) Zbl 1120.62307
Summary: This article addresses model criticism and model comparison in Bayesian contexts, focusing on the so-called posterior predictive $p$ values (ppp). These involve a general discrepancy or conflict measure and depend on the prior, the model, and the data. They are used in statistical practice to quantify the degree of surprise or conflict in data and to compare different combinations of prior and model. The distribution of such ppp values is far from uniform, however, as we demonstrate for different models, which makes them difficult to interpret and compare. We propose a natural calibration of the ppp values, where the resulting cppp values are uniform on the unit interval under model conditions. The cppp values, which in general rely on a double-simulation scheme for their computation, may then be used to assess and compare different priors and models. Our methods also make it possible to compare parametric and nonparametric model specifications, in that genuine "measures of surprise" are put on the same canonical uniform scale. We illustrate our techniques with applications to real data. We also present supplementary theoretical results on various properties of the ppp and cppp.
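As an illustration of the double-simulation idea mentioned in the summary, the following is a minimal sketch and not the authors' implementation. It assumes a toy setting that does not come from the article: data $y_i \sim N(\mu, 1)$ with conjugate prior $\mu \sim N(0, \tau^2)$, and the sample variance as discrepancy measure. It also assumes that "under model conditions" means data drawn from the prior predictive distribution, so that the calibrated value is the proportion of simulated ppp values not exceeding the observed one. The function names `ppp` and `cppp` and all tuning parameters are hypothetical.

```python
import numpy as np

def ppp(y, rng, tau=2.0, n_rep=200):
    """Monte Carlo posterior predictive p value for the toy normal model.

    Assumed model: y_i ~ N(mu, 1), mu ~ N(0, tau^2).
    Assumed discrepancy: the sample variance of the data.
    """
    n = y.size
    # Conjugate posterior for mu: N(post_mean, post_var).
    post_var = 1.0 / (n + 1.0 / tau**2)
    post_mean = post_var * y.sum()
    # Inner simulation: draw mu from the posterior, then replicate data.
    mu = rng.normal(post_mean, np.sqrt(post_var), size=n_rep)
    y_rep = rng.normal(mu[:, None], 1.0, size=(n_rep, n))
    t_rep = y_rep.var(axis=1, ddof=1)
    # Share of replicated discrepancies at least as large as the observed one.
    return float(np.mean(t_rep >= y.var(ddof=1)))

def cppp(y_obs, rng, tau=2.0, n_outer=200, n_rep=200):
    """Calibrated ppp via a double simulation (sketch under stated assumptions)."""
    p_obs = ppp(y_obs, rng, tau, n_rep)
    n = y_obs.size
    # Outer simulation: draw data sets from the prior predictive distribution
    # and compute a ppp value for each of them.
    mu = rng.normal(0.0, tau, size=n_outer)
    p_sim = np.array(
        [ppp(rng.normal(m, 1.0, size=n), rng, tau, n_rep) for m in mu]
    )
    # Calibrated value: where the observed ppp falls among the simulated ones.
    return p_obs, float(np.mean(p_sim <= p_obs))
```

Calling `cppp(y, np.random.default_rng(0))` on an array `y` returns the raw ppp together with its calibrated counterpart. The cost is roughly `n_outer * n_rep` replicated data sets, which is why the calibration is more expensive than the ppp alone.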