Quantifying surprise in the data and model verification. (With discussion).

*(English)* Zbl 0974.62021
Bernardo, J. M. (ed.) et al., Bayesian statistics 6. Proceedings of the 6th Valencia international meeting, Alcoceber near Valencia, Spain, June 6-10, 1998. Oxford: Clarendon Press. 53-82 (1999).

Summary: $P$-values are often perceived as measurements of the degree of surprise in the data, relative to a hypothesized model. They are also commonly used in model (or hypothesis) verification, i.e., to provide a basis for rejection of a model or hypothesis. We first make a distinction between these two goals: quantifying surprise can be important in deciding whether or not to search for alternative models, but is questionable as the basis for rejection of a model. For measuring surprise, we propose a simple calibration of the $p$-value which roughly converts a tail area into a Bayes factor or `odds' measure. Many Bayesians have suggested certain modifications of $p$-values for use in measuring surprise, including the predictive $p$-value and the posterior predictive $p$-value. We propose two alternatives, the conditional predictive $p$-value and the partial posterior predictive $p$-value, which we argue to be more acceptable from Bayesian (or conditional) reasoning. For the entire collection see [Zbl 0942.00036].
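The summary mentions a simple calibration converting a tail area into a Bayes factor or odds measure. As a rough illustration, the sketch below assumes the calibration takes the form $B(p) = -e\,p\log p$ for $p < 1/e$, the lower bound on the Bayes factor in favor of the null that appears in this line of work by Bayarri, Berger, and Sellke; the function names are illustrative, not from the paper.

```python
import math

def bayes_factor_bound(p):
    """Calibrate a p-value into a lower bound on the Bayes factor
    in favor of the null hypothesis: B(p) = -e * p * log(p).

    Assumed form of the calibration discussed in the Bayarri/Berger
    line of work; valid only for 0 < p < 1/e.
    """
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("calibration applies only for 0 < p < 1/e")
    return -math.e * p * math.log(p)

def posterior_null_prob(p, prior_odds=1.0):
    """Lower bound on P(H0 | data) implied by the calibrated Bayes
    factor, given prior odds in favor of H0 (default: even odds)."""
    b = bayes_factor_bound(p) * prior_odds
    return b / (1.0 + b)

# A nominally "significant" p = 0.05 calibrates to a Bayes factor of
# roughly 0.41, i.e., a posterior probability of the null near 0.29
# under even prior odds -- far less surprising than 0.05 suggests.
print(bayes_factor_bound(0.05))
print(posterior_null_prob(0.05))
```

Under these assumptions, the example makes the review's point concrete: a tail area of 0.05 does not translate into 20-to-1 evidence against the model.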