A Bayesian \(\chi^2\) test for goodness-of-fit. (English) Zbl 1068.62028

Summary: This article describes an extension of classical \(\chi^2\) goodness-of-fit tests to Bayesian model assessment. The extension essentially involves evaluating Pearson's goodness-of-fit statistic at a parameter value drawn from the posterior distribution. Its key property is that the resulting statistic is asymptotically distributed as a \(\chi^2\) random variable with \(K-1\) degrees of freedom, where \(K\) is the number of cells into which the data are grouped, independently of the dimension of the underlying parameter vector.
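For concreteness, one way to write such a statistic is sketched here. The notation is illustrative and not taken from the summary: it assumes continuous observations \(y_1,\dots,y_n\) with model distribution function \(F(\cdot\mid\theta)\), a single posterior draw \(\tilde\theta\), and fixed cut points \(0=a_0<a_1<\dots<a_K=1\) with cell probabilities \(p_k=a_k-a_{k-1}\).

\[
m_k(\tilde\theta)=\#\bigl\{j : F(y_j\mid\tilde\theta)\in(a_{k-1},a_k]\bigr\},
\qquad
R^B(\tilde\theta)=\sum_{k=1}^{K}\frac{\bigl(m_k(\tilde\theta)-n\,p_k\bigr)^2}{n\,p_k}.
\]

Under this reading, the cell counts depend on \(\tilde\theta\) while the cell probabilities do not, and \(R^B(\tilde\theta)\) is the quantity compared with the \(\chi^2_{K-1}\) reference distribution.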
By examining the posterior distribution of this statistic, global goodness-of-fit diagnostics are obtained. Advantages of these diagnostics include ease of interpretation, computational convenience and favorable power properties. The proposed diagnostics can be used to assess the adequacy of a broad class of Bayesian models, essentially requiring only a finite-dimensional parameter vector and conditionally independent observations.
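The idea can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration rather than part of the article: a normal model with known standard deviation, a conjugate normal prior on the mean, equiprobable cells, and the hypothetical helper names `pearson_at_draw` and `exceedance_share`. It evaluates Pearson's statistic at each posterior draw and reports the share of draws whose statistic exceeds the 95% point of the \(\chi^2_{K-1}\) distribution.

```python
import numpy as np
from scipy import stats


def pearson_at_draw(y, mu, sigma, n_bins):
    """Pearson statistic at one parameter value, using equiprobable cells.

    Observations are mapped to (0, 1) through the model CDF evaluated at the
    given parameter value, then counted in n_bins equal-width cells.
    """
    y = np.asarray(y, dtype=float)
    u = stats.norm.cdf(y, loc=mu, scale=sigma)
    counts, _ = np.histogram(u, bins=np.linspace(0.0, 1.0, n_bins + 1))
    expected = y.size / n_bins  # the same expected count in every cell
    return float(np.sum((counts - expected) ** 2 / expected))


def exceedance_share(y, sigma, prior_mean, prior_sd, n_bins, n_draws, rng,
                     level=0.95):
    """Share of posterior draws whose statistic exceeds a chi-square quantile.

    Assumed model (illustrative only): y_j ~ N(mu, sigma^2) with sigma known
    and mu ~ N(prior_mean, prior_sd^2), so the posterior of mu is normal.
    """
    y = np.asarray(y, dtype=float)
    post_prec = 1.0 / prior_sd**2 + y.size / sigma**2
    post_mean = (prior_mean / prior_sd**2 + y.sum() / sigma**2) / post_prec
    draws = rng.normal(post_mean, post_prec**-0.5, size=n_draws)
    values = np.array([pearson_at_draw(y, mu, sigma, n_bins) for mu in draws])
    crit = stats.chi2.ppf(level, df=n_bins - 1)
    return float(np.mean(values > crit))


rng = np.random.default_rng(0)
y_good = rng.normal(1.0, 1.0, size=200)  # data consistent with sigma = 1
y_bad = rng.normal(1.0, 2.0, size=200)   # overdispersed relative to sigma = 1

share_good = exceedance_share(y_good, 1.0, 0.0, 10.0, 5, 500, rng)
share_bad = exceedance_share(y_bad, 1.0, 0.0, 10.0, 5, 500, rng)
print("share above the 95% point, well-specified model:", share_good)
print("share above the 95% point, misspecified model:  ", share_bad)
```

In this sketch, a share far above the nominal 5% is read as a sign of lack of fit; the choice of five cells and of the 95% point is arbitrary and would need to be adapted to the problem at hand.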


MSC:
62F15 Bayesian inference
62G10 Nonparametric hypothesis testing
62C10 Bayesian problems; characterization of Bayes procedures
62E20 Asymptotic distribution theory in statistics


Software: WinBUGS; Gibbsit

