Significance testing with no alternative hypothesis: A measure of surprise. (English) Zbl 1171.62029

Summary: A pure significance test would check the agreement of a statistical model with the observed data even when no alternative model was available. The paper proposes the use of a modified \(p\)-value to make such a test. The model will be rejected if something surprising is observed (relative to what else might have been observed). It is shown that the relation between this measure of surprise (the \(s\)-value) and the surprise indices of W. Weaver [Sci. Monthly 67, 390–392 (1948)] and I. J. Good [Ann. Math. Stat. 27, 1130–1135 (1956; Zbl 0073.13501); Corrections ibid. 28, 1055 (1957)] is similar to the relationship between a \(p\)-value, a corresponding odds-ratio, and a logit or log-odds statistic. The \(s\)-value is always larger than the corresponding \(p\)-value, and is not uniformly distributed. Difficulties with the whole approach are discussed.
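The quantities named in the summary can be illustrated on a small discrete model. The sketch below is not taken from the paper: the summary does not define the \(s\)-value, so it is not computed here. The code assumes Weaver's surprise index in its usual form (expected probability of an outcome divided by the probability of the outcome actually observed) and, as a further assumption, takes the \(p\)-value to be the total probability of outcomes no more probable than the observed one. The function names and the four-outcome distribution are hypothetical.

```python
import math


def weaver_surprise_index(probs, observed):
    """Weaver's surprise index: E[p] / p_observed, where E[p] = sum of p_j**2."""
    expected_prob = sum(p * p for p in probs)
    return expected_prob / probs[observed]


def tail_p_value(probs, observed):
    """Assumed p-value: total probability of outcomes no more probable than the observed one."""
    p_obs = probs[observed]
    return sum(p for p in probs if p <= p_obs)


def odds(p):
    """Odds corresponding to a probability p in (0, 1)."""
    return p / (1.0 - p)


def logit(p):
    """Log-odds corresponding to a probability p in (0, 1)."""
    return math.log(odds(p))


# Hypothetical model with four outcomes; the least probable one is observed.
probs = [0.5, 0.3, 0.15, 0.05]
observed = 3

si = weaver_surprise_index(probs, observed)
p = tail_p_value(probs, observed)
print(si, p, odds(p), logit(p))
```

For this distribution the surprise index is about 7.3 and the assumed \(p\)-value is 0.05. The odds and logit functions show the kind of transformation that the summary uses as an analogy for the relation between the \(s\)-value and the two surprise indices.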

MSC:

62G10 Nonparametric hypothesis testing

Citations:

Zbl 0073.13501

References:

[1] Bayarri, M. J., & Berger, J. O. (1999). Quantifying surprise in the data and model verification (with discussion). In J. M. Bernardo, J. O. Berger, A. P. Dawid, & A. F. M. Smith (Eds.), Bayesian statistics (Vol. 6, pp. 53–82). London: Oxford University Press. · Zbl 0974.62021
[2] Bayarri, M. J., & Berger, J. O. (2000). P values for composite null models (with Robins et al. (2000) and discussion). Journal of the American Statistical Association, 95, 1127–1172. · Zbl 1004.62022 · doi:10.2307/2669749
[3] Church, A. (1940). On the concept of a random sequence. Bulletin of the American Mathematical Society, 46, 130–135. · Zbl 0022.36904 · doi:10.1090/S0002-9904-1940-07154-X
[4] Dawid, A. P., & Vovk, V. (1999). Prequential probability: Principles and properties. Bernoulli, 5, 125–162. · Zbl 0929.60001 · doi:10.2307/3318616
[5] Evans, M. (1997). Bayesian inference procedures derived via the concept of relative surprise. Communications in Statistics, 26, 1125–1143. · Zbl 0934.62026 · doi:10.1080/03610929708831972
[6] Evans, M., Guttman, I., & Swartz, T. (2006). Optimality computations for relative surprise inferences. Canadian Journal of Statistics, 34, 113–129. · Zbl 1095.62004 · doi:10.1002/cjs.5550340109
[7] Good, I. J. (1954). The appropriate mathematical tools for describing and measuring uncertainty. In C. F. Carter, G. P. Meredith, & G. L. S. Shackle (Eds.), Uncertainty and business decisions. Liverpool: University Press.
[8] Good, I. J. (1956). The surprise index for the multivariate normal distribution. Annals of Mathematical Statistics, 27, 1130–1135. · Zbl 0073.13501 · doi:10.1214/aoms/1177728079
[9] Good, I. J. (1988). Surprise index. In S. Kotz, N. L. Johnson, & C. B. Read (Eds.), Encyclopedia of statistical sciences (Vol. 7). New York: Wiley.
[10] Howard, J. V. (1975). Computable explanations. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 21, 215–224. · Zbl 0326.60036 · doi:10.1002/malq.19750210129
[11] Jahn, R. G., Dunne, B. J., & Nelson, R. D. (1987). Engineering anomalies research. Journal of Scientific Exploration, 1, 21–50.
[12] Jeffreys, H. (1939, 1961). Theory of probability. Oxford: Oxford University Press. · Zbl 0023.14501
[13] Jefferys, W. H. (1990). Bayesian analysis of random event generator data. Journal of Scientific Exploration, 4, 153–169.
[14] Lindley, D. V. (1957). A statistical paradox (with discussion). Biometrika, 44, 187–192. · Zbl 0080.12801
[15] Lindley, D. V. (1977). A problem in forensic science. Biometrika, 64, 207–213. · doi:10.1093/biomet/64.2.207
[16] Martin-Löf, P. (1966). The definition of random sequences. Information and Control, 9, 602–619. · Zbl 0244.62008 · doi:10.1016/S0019-9958(66)80018-9
[17] Robins, J. M., van der Vaart, A., & Ventura, V. (2000). Asymptotic distribution of P values for composite null models (with Bayarri and Berger (2000) and discussion). Journal of the American Statistical Association, 95, 1127–1172. · Zbl 1004.62022 · doi:10.2307/2669749
[18] Seillier-Moiseiwitsch, F., Sweeting, T. J., & Dawid, A. P. (1992). Prequential tests of model fit. Scandinavian Journal of Statistics, 19, 45–60. · Zbl 0755.62032
[19] Seillier-Moiseiwitsch, F., & Dawid, A. P. (1993). On testing the validity of sequential probability forecasts. Journal of the American Statistical Association, 88, 355–359. · Zbl 0771.62058 · doi:10.2307/2290731
[20] Shafer, G. (1982). Lindley’s paradox (with discussion). Journal of the American Statistical Association, 77, 325–351. · Zbl 0491.62004 · doi:10.2307/2287244
[21] Shafer, G., & Vovk, V. G. (2001). Probability and finance: It’s only a game. New York: Wiley. · Zbl 0985.91024
[22] Ville, J. (1939). Étude critique de la notion de collectif. Paris: Gauthier-Villars. · Zbl 0021.14601
[23] von Mises, R. (1919). Grundlagen der Wahrscheinlichkeitsrechnung. Mathematische Zeitschrift, 5, 52–99. · JFM 47.0483.01 · doi:10.1007/BF01203155
[24] Weaver, W. (1948). Probability, rarity, interest and surprise. Scientific Monthly, 67, 390–392.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.