Quantifying surprise in the data and model verification. (With discussion). (English) Zbl 0974.62021
Bernardo, J. M. (ed.) et al., Bayesian statistics 6. Proceedings of the 6th Valencia international meeting, Alcoceber near Valencia, Spain, June 6-10, 1998. Oxford: Clarendon Press. 53-82 (1999).
Summary: $p$-values are often perceived as measurements of the degree of surprise in the data, relative to a hypothesized model. They are also commonly used in model (or hypothesis) verification, i.e., to provide a basis for rejection of a model or hypothesis. We first make a distinction between these two goals: quantifying surprise can be important in deciding whether or not to search for alternative models, but is questionable as the basis for rejection of a model. For measuring surprise, we propose a simple calibration of the $p$-value which roughly converts a tail area into a Bayes factor or 'odds' measure. Many Bayesians have suggested certain modifications of $p$-values for use in measuring surprise, including the predictive $p$-value and the posterior predictive $p$-value. We propose two alternatives, the conditional predictive $p$-value and the partial posterior predictive $p$-value, which we argue to be more acceptable from the standpoint of Bayesian (or conditional) reasoning. For the entire collection see [Zbl 0942.00036].
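
Illustrative note on the calibration mentioned in the summary: the summary itself does not give the formula. A form commonly associated with this line of work is $B(p) = -e\,p\,\log p$ for $p < 1/e$, read as an approximate lower bound on the Bayes factor in favour of the hypothesized model. The sketch below assumes that form; the function names and the equal prior odds used for the posterior probability are hypothetical choices made for illustration, not taken from the record.

```python
import math


def calibrate_p_value(p):
    """Map a p-value to an approximate lower bound on the Bayes factor.

    Assumes the calibration B(p) = -e * p * log(p), valid for p < 1/e.
    For p >= 1/e the bound is taken to be 1 (no evidence against the model).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p >= 1.0 / math.e:
        return 1.0
    return -math.e * p * math.log(p)


def posterior_probability_bound(p):
    """Lower bound on the posterior probability of the hypothesized model.

    Assumes equal prior odds (an illustrative choice), so that the
    probability is B / (1 + B) with B the calibrated Bayes factor.
    """
    b = calibrate_p_value(p)
    return b / (1.0 + b)


if __name__ == "__main__":
    for p in (0.1, 0.05, 0.01, 0.001):
        print(p, round(calibrate_p_value(p), 4),
              round(posterior_probability_bound(p), 4))
```

Under this assumed calibration, $p = 0.05$ corresponds to a Bayes factor bound of roughly 0.41, i.e. odds of only about 1 to 2.5 against the hypothesized model, which is much weaker evidence than the tail area alone may suggest.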