
zbMATH — the first resource for mathematics

A survey of Bayesian predictive methods for model assessment, selection and comparison. (English) Zbl 1302.62011
Stat. Surv. 6, 142-228 (2012); errata ibid. 8, 1 (2014).
Summary: To date, several methods exist in the statistical literature for model assessment that present themselves specifically as Bayesian predictive methods. The decision-theoretic assumptions on which these methods rest, however, are not always clearly stated in the original articles. The aim of this survey is to provide a unified review of Bayesian predictive methods for model assessment and selection, together with methods closely related to them. We review the various assumptions made in this context and discuss the connections between the different approaches, with an emphasis on how each method approximates the expected utility of using a Bayesian model for the purpose of predicting future data.
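The expected-utility viewpoint described in the summary can be made concrete with a small sketch (not taken from the survey itself; the model, function names, and toy data below are invented for illustration). Under the logarithmic utility, a standard estimate of a model's expected predictive utility is the leave-one-out mean log predictive density, shown here for a conjugate normal model with known observation noise:

```python
import math

def posterior(ys, mu0=0.0, tau0=10.0, sigma=1.0):
    # Conjugate normal update: y_i ~ N(theta, sigma^2), prior theta ~ N(mu0, tau0^2).
    prec = 1.0 / tau0**2 + len(ys) / sigma**2
    mean = (mu0 / tau0**2 + sum(ys) / sigma**2) / prec
    return mean, math.sqrt(1.0 / prec)

def log_predictive(y, ys, sigma=1.0):
    # log p(y | ys): the posterior predictive is N(post_mean, sigma^2 + post_sd^2).
    m, s = posterior(ys, sigma=sigma)
    var = sigma**2 + s**2
    return -0.5 * math.log(2 * math.pi * var) - (y - m) ** 2 / (2 * var)

def loo_elpd(ys, sigma=1.0):
    # Leave-one-out estimate of the expected log predictive density:
    # the average over i of log p(y_i | y_{-i}).
    return sum(
        log_predictive(y, ys[:i] + ys[i + 1:], sigma=sigma)
        for i, y in enumerate(ys)
    ) / len(ys)

data = [0.3, -0.1, 0.5, 0.2, -0.4, 0.1]
print(loo_elpd(data))
```

Comparing two candidate models amounts to computing `loo_elpd` under each and preferring the larger value; the survey's point is that many assessment criteria can be read as different approximations of this same expected utility.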

MSC:
62C10 Bayesian problems; characterization of Bayes procedures
62-02 Research exposition (monographs, survey articles) pertaining to statistics
References:
[1] Aitkin, M. (1991). Posterior Bayes Factors (with discussion). Journal of the Royal Statistical Society. Series B (Methodological) 53 111-142. · Zbl 0800.62167
[2] Akaike, H. (1973). Information Theory and an Extension of the Maximum Likelihood Principle. In Second International Symposium on Information Theory ( B. N. Petrov and F. Csaki, eds.) 267-281. Academiai Kiado, Budapest. Reprinted in Kotz, S. and Johnson, N. L., editors, (1992). Breakthroughs in Statistics Volume I: Foundations and Basic Theory , pp. 610-624. Springer-Verlag. · Zbl 0283.62006
[3] Akaike, H. (1974). A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control AC-19 716-723. · Zbl 0314.62039 · doi:10.1109/TAC.1974.1100705
[4] Akaike, H. (1979). A Bayesian Extension of the Minimum AIC Procedure of Autoregressive Model Fitting. Biometrika 66 237-242. · Zbl 0407.62064 · doi:10.1093/biomet/66.2.237
[5] Ando, T. (2007). Bayesian predictive information criterion for the evaluation of hierarchical Bayesian and empirical Bayes models. Biometrika 94 443-458. · Zbl 1132.62005 · doi:10.1093/biomet/asm017
[6] Ando, T. and Tsay, R. (2010). Predictive likelihood for Bayesian model selection and averaging. International Journal of Forecasting 26 744-763.
[7] Arlot, S. and Celisse, A. (2010). A survey of cross-validation procedures for model selection. Statistics Surveys 4 40-79. · Zbl 1190.62080 · doi:10.1214/09-SS054
[8] Barbieri, M. M. and Berger, J. O. (2004). Optimal Predictive Model Selection. The Annals of Statistics 32 870-897. · Zbl 1092.62033 · doi:10.1214/009053604000000238
[9] Bayarri, M. J. (1987). Comment on J. O. Berger and M. Delampady. Statistical Science 3 342-344.
[10] Bayarri, M. J. (2003). Which ‘base’ distribution for model criticism? In Highly Structured Stochastic Systems ( P. J. Green, N. L. Hjort and S. Richardson, eds.) 445-453. Oxford University Press.
[11] Bayarri, M. J. and Berger, J. O. (1999). Quantifying Surprise in the Data and Model Verification. In Bayesian Statistics 6 ( J. M. Bernardo, J. O. Berger and A. P. Dawid, eds.) 53-82. Oxford University Press. · Zbl 0974.62021
[12] Bayarri, M. J. and Berger, J. O. (2000). P Values for Composite Null Models. Journal of the American Statistical Association 95 1127-1142. · Zbl 1004.62022 · doi:10.2307/2669749
[13] Berger, J. O. (1985). Statistical Decision Theory and Bayesian Analysis , 2nd ed. Springer-Verlag. · Zbl 0572.62008
[14] Berger, J. O. and Bernardo, J. M. (1992). On the Development of Reference Priors. In Bayesian Statistics 4 ( J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 35-60. Oxford University Press.
[15] Berger, J. and Pericchi, L. (1996). The intrinsic Bayes factor for model selection and prediction. Journal of the American Statistical Association 91 109-122. · Zbl 0870.62021 · doi:10.2307/2291387
[16] Bernardo, J. M. (1979). Expected Information as Expected Utility. Annals of Statistics 7 686-690. · Zbl 0407.62002 · doi:10.1214/aos/1176344689
[17] Bernardo, J. M. (1999). Nested Hypothesis Testing: The Bayesian Reference Criterion. In Bayesian Statistics 6 ( J. M. Bernardo, J. O. Berger and A. P. Dawid, eds.) 101-130. Oxford University Press. · Zbl 0973.62019
[18] Bernardo, J. M. (2005a). Reference Analysis. In Handbook of Statistics , ( D. Dey and C. R. Rao, eds.) 25 Elsevier 17-90. · Zbl 0682.62018 · doi:10.2307/2289864
[19] Bernardo, J. M. (2005b). Intrinsic credible regions: An objective Bayesian approach to interval estimation. Test 14 317-384. · Zbl 1087.62036 · doi:10.1007/BF02595408
[20] Bernardo, J. M. and Bayarri, M. J. (1985). Bayesian model criticism. In Model choice: proceedings of the 4th Franco-Belgian meeting of statisticians ( J. P. Florens, M. Mouchart, J. P. Raoult and L. Simar, eds.). Facultés universitaires Saint-Louis, Bruxelles.
[21] Bernardo, J. M. and Bermúdez, J. D. (1985). The Choice of Variables in Probabilistic Classification. In Bayesian Statistics 2 ( J. M. Bernardo, M. H. deGroot, D. V. Lindley and A. F. M. Smith, eds.) 67-82. Elsevier Science Publishers. · Zbl 0671.62059
[22] Bernardo, J. M. and Juárez, M. A. (2003). Intrinsic Estimation. In Bayesian Statistics 7 ( J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West, eds.) 456-476. Oxford University Press. · Zbl 1044.62002
[23] Bernardo, J. M. and Rueda, R. (2002). Bayesian hypothesis testing: a reference approach. International Statistical Review 70 351-372. · Zbl 1211.62011 · doi:10.2307/1403862
[24] Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory . John Wiley & Sons. · Zbl 0796.62002
[25] Bhattacharya, S. and Haslett, J. (2007). Importance Re-sampling MCMC for Cross-Validation in Inverse Problems. Bayesian Analysis 2 385-408. · Zbl 1331.86025 · doi:10.1214/07-BA217
[26] Birgé, L. and Massart, P. (2007). Minimal Penalties for Gaussian Model Selection. Probability Theory and Related Fields 138 33-73. · Zbl 1112.62082 · doi:10.1007/s00440-006-0011-8
[27] Bornn, L., Doucet, A. and Gottardo, R. (2010). An efficient computational approach for prior sensitivity analysis and cross-validation. The Canadian Journal of Statistics 38 47-64. · Zbl 1190.62046
[28] Box, G. E. P. (1980). Sampling and Bayes’ Inference in Scientific Modelling and Robustness. Journal of the Royal Statistical Society. Series A (General) 143 383-430. · Zbl 0471.62036 · doi:10.2307/2982063
[29] Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and Regression Trees . Chapman and Hall. · Zbl 0541.62042
[30] Brown, P. J., Fearn, T. and Vannucci, M. (1999). The choice of variables in multivariate regression: a non-conjugate Bayesian decision theory approach. Biometrika 86 635-648. · Zbl 1072.62510 · doi:10.1093/biomet/86.3.635
[31] Brown, P. J., Vannucci, M. and Fearn, T. (1998). Multivariate Bayesian variable selection and prediction. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 60 627-641. · Zbl 0909.62022 · doi:10.1111/1467-9868.00144
[32] Brown, P. J., Vannucci, M. and Fearn, T. (2002). Bayes model averaging with selection of regressors. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 64 519-536. · Zbl 1073.62004 · doi:10.1111/1467-9868.00348
[33] Burman, P. (1989). A Comparative Study of Ordinary Cross-Validation, \(v\)-Fold Cross-Validation and the Repeated Learning-Testing Methods. Biometrika 76 503-514. · Zbl 0677.62065 · doi:10.1093/biomet/76.3.503
[34] Burman, P., Chow, E. and Nolan, D. (1994). A Cross-Validatory Method for Dependent Data. Biometrika 81 351-358. · Zbl 0825.62669 · doi:10.1093/biomet/81.2.351
[35] Burman, P. and Nolan, D. (1992). Data dependent estimation of prediction functions. Journal of Time Series Analysis 13 189-207. · Zbl 0754.62018 · doi:10.1111/j.1467-9892.1992.tb00102.x
[36] Burnham, K. P. and Anderson, D. R. (1998). Model selection and inference . Springer. · Zbl 0920.62006
[37] Burnham, K. P. and Anderson, D. R. (2002). Model Selection and Multi-Model Inference: A Practical Information-Theoretic Approach , 2nd ed. Springer. · Zbl 1005.62007 · doi:10.1007/b97636
[38] Carlin, B. P. and Louis, T. A. (1996). Bayes and Empirical Bayes Methods for Data Analysis 69. Chapman & Hall. · Zbl 0871.62012
[39] Carlin, B. P. and Spiegelhalter, D. J. (2007). Discussion to ‘Estimating the Integrated Likelihood via Posterior Simulation Using the Harmonic Mean Identity’. In Bayesian Statistics 8 ( J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West, eds.) 33-36. Oxford University Press.
[40] Cawley, G. C. and Talbot, N. L. C. (2010). On Over-fitting in Model Selection and Subsequent Selection Bias in Performance Evaluation. Journal of Machine Learning Research 11 2079-2107. · Zbl 1242.62051
[41] Celeux, G., Forbes, F., Robert, C. P. and Titterington, D. M. (2006). Deviance Information Criteria for Missing Data Models. Bayesian Analysis 1 651-674. · Zbl 1331.62329 · doi:10.1214/06-BA122
[42] Chakrabarti, A. and Ghosh, J. K. (2007). Some Aspects of Bayesian Model Selection for Prediction. In Bayesian Statistics 8 ( J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West, eds.) 51-90. Oxford University Press. · Zbl 1252.62014
[43] Chen, M.-H., Dey, D. K. and Ibrahim, J. G. (2004). Bayesian criterion based model assessment for categorical data. Biometrika 91 45-63. · Zbl 1132.62301 · doi:10.1093/biomet/91.1.45
[44] Chen, M.-H., Shao, Q.-M. and Ibrahim, J. Q. (2000). Monte Carlo Methods in Bayesian Computation . Springer-Verlag. · Zbl 0949.65005 · doi:10.1007/978-1-4612-1276-8
[45] Chow, G. C. (1981). A comparison of the information and posterior probability criteria for model selection. Journal of Econometrics 16 21-33. · Zbl 0457.62033 · doi:10.1016/0304-4076(81)90073-7
[46] Corander, J. and Marttinen, P. (2006). Bayesian Model Learning Based on Predictive Entropy. Journal of Logic, Language, and Information 15 5-20. · Zbl 1100.62031 · doi:10.1007/s10849-005-9004-8
[47] Dietterich, T. G. (1998). Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Neural Computation 10 1895-1924.
[48] Draper, D. and Fouskakis, D. (2000). A Case Study of Stochastic Optimization in Health Policy: Problem Formulation and Preliminary Results. Journal of Global Optimization 18 399-416. · Zbl 1179.90244 · doi:10.1023/A:1026504402220
[49] Dupuis, J. A. and Robert, C. P. (1997). Bayesian Variable Selection in Qualitative Models by Kullback-Leibler Projections. Working paper, Centre de Recherche en Economie et Statistique.
[50] Dupuis, J. A. and Robert, C. P. (2003). Variable selection in qualitative models via an entropic explanatory power. Journal of Statistical Planning and Inference 111 77-94. · Zbl 1033.62066 · doi:10.1016/S0378-3758(02)00286-0
[51] Efron, B. and Tibshirani, R. J. (1993). An Introduction to the Bootstrap 57. Chapman & Hall. · Zbl 0835.62038
[52] Epifani, I., MacEachern, S. N. and Peruggia, M. (2008). Case-deletion importance sampling estimators: Central limit theorems and related results. Electronic Journal of Statistics 2 774-806. · Zbl 1320.62046 · doi:10.1214/08-EJS259
[53] Fearn, T., Brown, P. J. and Besbeas, P. (2002). A Bayesian decision theory approach to variable selection for discrimination. Statistics and Computing 12 253-260. · doi:10.1023/A:1020702927247
[54] Fouskakis, D. and Draper, D. (2008). Comparing stochastic optimization methods for variable selection in binary outcome prediction with application to health policy. Journal of the American Statistical Association 103 1367-1381. · Zbl 1286.62065 · doi:10.1198/016214508000001048
[55] Fouskakis, D., Ntzoufras, I. and Draper, D. (2009). Population-based reversible-jump Markov chain Monte Carlo for Bayesian variable selection and evaluation under cost limit restrictions. Journal of the Royal Statistical Society, Series C: Applied Statistics 58 383-403. · doi:10.1111/j.1467-9876.2008.00658.x
[56] Friel, N. and Wyse, J. (2012). Estimating the evidence - a review. Statistica Neerlandica. Early view online. · doi:10.1111/j.1467-9574.2011.00515.x
[57] Fushiki, T. (2011). Estimation of prediction error by using K-fold cross-validation. Statistics and Computing 21 137-146. · Zbl 1254.62099 · doi:10.1007/s11222-009-9153-8
[58] Geisser, S. (1975). The Predictive Sample Reuse Method with Applications. Journal of the American Statistical Association 70 320-328. · Zbl 0321.62077 · doi:10.2307/2285815
[59] Geisser, S. and Eddy, W. F. (1979). A Predictive Approach to Model Selection. Journal of the American Statistical Association 74 153-160. · Zbl 0401.62036 · doi:10.2307/2286745
[60] Gelfand, A. E. (1996). Model determination using sampling-based methods. In Markov Chain Monte Carlo in Practice ( W. R. Gilks, S. Richardson and D. J. Spiegelhalter, eds.) 145-162. Chapman & Hall. · Zbl 0840.62003
[61] Gelfand, A. E. (2003). Some comments on model criticism. In Highly Structured Stochastic Systems ( P. J. Green, N. L. Hjort and S. Richardson, eds.) 449-453. Oxford University Press.
[62] Gelfand, A. E., Dey, D. K. and Chang, H. (1992). Model Determination using Predictive Distributions with Implementation via Sampling-Based Methods (with discussion). In Bayesian Statistics 4 ( J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 147-167. Oxford University Press.
[63] Gelfand, A. E. and Dey, D. K. (1994). Bayesian Model Choice: Asymptotics and Exact Calculations. Journal of the Royal Statistical Society. Series B (Methodological) 56 501-514. · Zbl 0800.62170
[64] Gelfand, A. E. and Ghosh, S. K. (1998). Model Choice: A Minimum Posterior Predictive Loss Approach. Biometrika 85 1-11. · Zbl 0904.62036 · doi:10.1093/biomet/85.1.1
[65] Gelman, A., Meng, X.-L. and Stern, H. (1996). Posterior Predictive Assessment of Model Fitness via Realized Discrepancies (with discussion). Statistica Sinica 6 733-807. · Zbl 0859.62028
[66] Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. R. (1995). Bayesian Data Analysis . Chapman & Hall. · Zbl 1279.62004
[67] Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. R. (2003). Bayesian Data Analysis , 2nd ed. Chapman & Hall. · Zbl 1279.62004
[68] George, E. I. and McCulloch, R. E. (1993). Variable Selection Via Gibbs Sampling. Journal of the American Statistical Association 88 881-889.
[69] Geweke, J. (1989). Bayesian Inference in Econometric Models Using Monte Carlo Integration. Econometrica 57 1317-1339. · Zbl 0683.62068 · doi:10.2307/1913710
[70] Gneiting, T. (2011). Making and Evaluating Point Forecasts. Journal of the American Statistical Association 106 746-762. · Zbl 1232.62028 · doi:10.1198/jasa.2011.r10138
[71] Gneiting, T., Balabdaoui, F. and Raftery, A. E. (2007). Probabilistic forecasts, calibration and sharpness. Journal of the Royal Statistical Society Series B: Statistical Methodology 69 243-268. · Zbl 1120.62074 · doi:10.1111/j.1467-9868.2007.00587.x
[72] Gneiting, T. and Raftery, A. E. (2007). Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the American Statistical Association 102 359-378. · Zbl 1284.62093 · doi:10.1198/016214506000001437
[73] Good, I. J. (1952). Rational Decisions. Journal of the Royal Statistical Society. Series B (Methodological) 14 107-114.
[74] Goutis, C. and Robert, C. P. (1998). Model choice in generalised linear models: A Bayesian approach via Kullback-Leibler projections. Biometrika 85 29-37. · Zbl 0903.62061 · doi:10.1093/biomet/85.1.29
[75] Grünwald, P. D. and Dawid, A. P. (2004). Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory. Annals of Statistics 32 1367-1433. · Zbl 1048.62008 · doi:10.1214/009053604000000553
[76] Gutiérrez-Peña, E. (1992). Expected logarithmic divergence for exponential families. In Bayesian Statistics 4 ( J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 669-674. Oxford University Press.
[77] Gutiérrez-Peña, E. (1997). A Bayesian Predictive Semiparametric Approach to Variable Selection and Model Comparison in Regression. In Bulletin of the International Statistical Institute, Tome LVII. (Proceedings of the 51st Session of the ISI, Invited Papers, Book 1.) 17-29. · Zbl 0917.62021
[78] Gutiérrez-Peña, E. and Walker, S. G. (2001). A Bayesian predictive approach to model selection. Journal of Statistical Planning and Inference 93 259-276. · Zbl 1072.62537 · doi:10.1016/S0378-3758(00)00172-5
[79] Gutiérrez-Peña, E. and Walker, S. G. (2005). Statistical decision problems and Bayesian nonparametric methods. International Statistical Review 73 309-330. · Zbl 1105.62006 · doi:10.1111/j.1751-5823.2005.tb00151.x
[80] Guttman, I. (1967). The Use of the Concept of a Future Observation in Goodness-of-Fit Problems. Journal of the Royal Statistical Society. Series B (Methodological) 29 83-100. · Zbl 0158.37305
[81] Han, C. and Carlin, B. P. (2000). MCMC methods for computing Bayes factors: A comparative review. Research Report No. 2000-001, Division of Biostatistics, University of Minnesota.
[82] Held, L., Schrödle, B. and Rue, H. (2010). Posterior and Cross-validatory Predictive Checks: A Comparison of MCMC and INLA. In Statistical Modelling and Regression Structures ( T. Kneib and G. Tutz, eds.) 91-110. Springer. · doi:10.1007/978-3-7908-2413-1_6
[83] Hoeting, J., Madigan, D., Raftery, A. and Volinsky, C. (1999). Bayesian Model Averaging: A Tutorial. Statistical Science 14 382-401. · Zbl 1059.62525 · doi:10.1214/ss/1009212519
[84] Hurvich, C. M. and Tsai, C.-L. (1989). Regression and Time Series Model Selection in Small Samples. Biometrika 76 297-307. · Zbl 0669.62085 · doi:10.1093/biomet/76.2.297
[85] Hurvich, C. M. and Tsai, C.-L. (1991). Bias of the Corrected AIC Criterion for Underfitted Regression and time Series Models. Biometrika 78 499-509. · Zbl 1193.62159 · doi:10.1093/biomet/78.3.499
[86] Ibrahim, J. G. and Chen, M.-H. (1997). Predictive Variable Selection for the Multivariate Linear Model. Biometrics 53 465-478. · Zbl 0878.62022 · doi:10.2307/2533950
[87] Ibrahim, J. G., Chen, M.-H. and Sinha, D. (2001). Criterion-based methods for Bayesian model assessment. Statistica Sinica 11 419-443. · Zbl 1037.62017
[88] Ibrahim, J. G. and Laud, P. W. (1994). A Predictive Approach to the Analysis of Designed Experiments. Journal of the American Statistical Association 89 309-319. · Zbl 0791.62080 · doi:10.2307/2291227
[89] Jaakkola, T. S. (2001). Tutorial on variational approximation methods. In Advanced Mean Field Methods ( M. Opper and D. Saad, eds.) 129-160. The MIT Press.
[90] Jeffreys, H. (1961). Theory of Probability , 3rd ed. Oxford University Press (1st edition 1939). · Zbl 0116.34904
[91] Jonathan, P., Krzanowski, W. J. and McCarthy, W. V. (2000). On the use of cross-validation to assess performance in multivariate prediction. Statistics and Computing 10 209-229.
[92] Jordan, M. I., Ghahramani, Z., Jaakkola, T. S. and Saul, L. K. (1999). An introduction to variational methods for graphical models. Machine Learning 37 183-233. · Zbl 1033.68081 · doi:10.1023/A:1020281327116
[93] Jylänki, P., Vanhatalo, J. and Vehtari, A. (2011). Gaussian Process Regression with a Student-t Likelihood. Journal of Machine Learning Research 12 3227-3257. · Zbl 1280.60025
[94] Karabatsos, G. (2006). Bayesian nonparametric model selection and model testing. Journal of Mathematical Psychology 50. · Zbl 1099.62005 · doi:10.1016/j.jmp.2005.07.003
[95] Kass, R. E. and Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association 90 773-795. · Zbl 0846.62028
[96] Key, J. T., Pericchi, L. R. and Smith, A. F. M. (1999). Bayesian Model Choice: What and Why? In Bayesian Statistics 6 ( J. M. Bernardo, J. O. Berger and A. P. Dawid, eds.) 343-370. Oxford University Press. · Zbl 0956.62007
[97] Kullback, S. and Leibler, R. A. (1951). On Information and Sufficiency. Annals of Mathematical Statistics 22 79-86. · Zbl 0042.38403 · doi:10.1214/aoms/1177729694
[98] Lacoste-Julien, S., Huszár, F. and Ghahramani, Z. (2011). Approximate inference for the loss-calibrated Bayesian. Journal of Machine Learning Research: Workshop and Conference Proceedings 15 416-424. AISTATS 2011 special issue.
[99] Laud, P. and Ibrahim, J. (1995). Predictive model selection. Journal of the Royal Statistical Society. Series B (Methodological) 57 247-262. · Zbl 0809.62024
[100] Leamer, E. E. (1979). Information Criteria for Choice of Regression Models: A Comment. Econometrica 47 507-510. · Zbl 0446.62071 · doi:10.2307/1914197
[101] Leung, D. H.-Y. (2005). Cross-validation in nonparametric regression with outliers. Annals of Statistics 33 2291-2310. · Zbl 1086.62055 · doi:10.1214/009053605000000499
[102] Lindley, D. V. (1968). The Choice of Variables in Multiple Regression. Journal of the Royal Statistical Society. Series B (Methodological) 30 31-66. · Zbl 0155.26702
[103] Lo, A. Y. (1987). A Large Sample Study of the Bayesian Bootstrap. Annals of Statistics 15 360-375. · Zbl 0617.62032 · doi:10.1214/aos/1176350271
[104] MacKay, D. J. C. (1992). A Practical Bayesian Framework for Backpropagation Networks. Neural Computation 4 448-472.
[105] Marin, J.-M. and Robert, C. P. (2010). Importance sampling methods for Bayesian discrimination between embedded models. In Frontiers of Statistical Decision Making and Bayesian Analysis ( M. H. Chen, D. K. Dey, P. Müller, D. Sun and K. Ye, eds.) 14, 513-553. Springer.
[106] Marriott, J. M., Spencer, N. M. and Pettitt, A. N. (2001). A Bayesian Approach to Selecting Covariates for Prediction. Scandinavian Journal of Statistics 28 87-97. · Zbl 0965.62024 · doi:10.1111/1467-9469.00225
[107] Marshall, E. C. and Spiegelhalter, D. J. (2003). Approximate cross-validatory predictive checks in disease mapping models. Statistics in Medicine 22 1649-1660.
[108] San Martini, A. and Spezzaferri, F. (1984). A Predictive Model Selection Criterion. Journal of the Royal Statistical Society. Series B (Methodological) 46 296-303. · Zbl 0566.62004
[109] Mason, D. M. and Newton, M. A. (1992). A Rank Statistics Approach to the Consistency of a General Bootstrap. Annals of Statistics 20 1611-1624. · Zbl 0777.62045 · doi:10.1214/aos/1176348787
[110] McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models 37, 2nd ed. Chapman & Hall. · Zbl 0744.62098
[111] McCulloch, R. E. (1989). Local Model Influence. Journal of the American Statistical Association 84 473-478.
[112] Meng, X.-L. (1994). Posterior Predictive \(p\)-Values. Annals of Statistics 22 1142-1160. · Zbl 0820.62027 · doi:10.1214/aos/1176325622
[113] Meyer, M. C. and Laud, P. W. (2002). Predictive Variable Selection in Generalized Linear Models. Journal of the American Statistical Association 97 859-871. · Zbl 1048.62071 · doi:10.1198/016214502388618654
[114] Minka, T. (2001). A Family of Algorithms for Approximate Bayesian Inference PhD thesis, Massachusetts Institute of Technology.
[115] Mitchell, T. J. and Beauchamp, J. J. (1988). Bayesian Variable Selection in Linear Regression (with discussion). Journal of the American Statistical Association 83. · Zbl 0673.62051 · doi:10.2307/2290129
[116] Miyamoto, J. M. (1999). Quality-Adjusted Life Years (QALY) Utility Models under Expected Utility and Rank Dependent Utility Assumptions. Journal of Mathematical Psychology 43 201-237. · Zbl 0942.91032 · doi:10.1006/jmps.1999.1256
[117] Moody, J. E. (1992). The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems. In Advances in Neural Information Processing Systems 4 ( J. E. Moody, S. J. Hanson and R. P. Lippmann, eds.) 847-854. Morgan Kaufmann Publishers.
[118] Murata, N., Yoshizawa, S. and Amari, S.-I. (1994). Network Information Criterion-Determining the number of hidden units for an Artificial Neural Network model. IEEE Transactions on Neural Networks 5 865-872.
[119] Nadeau, C. and Bengio, S. (2000). Inference for the Generalization Error. In Advances in Neural Information Processing Systems 12 ( S. A. Solla, T. K. Leen and K.-R. Müller, eds.) 307-313. MIT Press. · Zbl 1039.68104
[120] Neal, R. M. (1998). Assessing Relevance Determination Methods Using DELVE. In Neural Networks and Machine Learning ( C. M. Bishop, ed.) 97-129. Springer-Verlag. · Zbl 0928.68092
[121] Newton, M. A. and Raftery, A. E. (1994). Approximate Bayesian Inference with the Weighted Likelihood Bootstrap (with discussion). Journal of the Royal Statistical Society. Series B (Methodological) 56 3-48. · Zbl 0788.62026
[122] Nickisch, H. and Rasmussen, C. E. (2008). Approximations for Binary Gaussian Process Classification. Journal of Machine Learning Research 9 2035-2078. · Zbl 1225.62087
[123] Nott, D. J. and Leng, C. (2010). Bayesian projection approaches to variable selection in generalized linear models. Computational Statistics & Data Analysis 54 3227-3241. · Zbl 1284.62461
[124] O’Hagan, A. (1995). Fractional Bayes Factors for Model Comparison (with discussion). Journal of the Royal Statistical Society. Series B (Methodological) 57 99-138. · Zbl 0813.62026
[125] O’Hagan, A. (2003). HSSS model criticism. In Highly Structured Stochastic Systems ( P. J. Green, N. L. Hjort and S. Richardson, eds.) 423-444. Oxford University Press.
[126] O’Hagan, A. and Forster, J. (2004). Bayesian Inference , 2nd ed. Kendalls’s Advanced Theory of Statistics 2B . Arnold.
[127] Opper, M. and Winther, O. (2000). Gaussian Processes for Classification: Mean-Field Algorithms. Neural Computation 12 2655-2684.
[128] Orr, M. J. L. (1996). Introduction to Radial Basis Function Networks [online]. Technical Report, Centre for Cognitive Science, University of Edinburgh, April 1996. Available at www.anc.ed.ac.uk.
[129] Peruggia, M. (1997). On the Variability of Case-Deletion Importance Sampling Weights in the Bayesian Linear Model. Journal of the American Statistical Association 92 199-207. · Zbl 0889.62020 · doi:10.2307/2291464
[130] Plummer, M. (2008). Penalized loss functions for Bayesian model comparison. Biostatistics (Oxford, England) 9 523-39. · Zbl 1143.62003 · doi:10.1093/biostatistics/kxm049
[131] Raftery, A. E. and Zheng, Y. (2003). Discussion: Performance of Bayesian Model Averaging. Journal of the American Statistical Association 98 931-938.
[132] Raftery, A. E., Newton, M. A., Satagopan, J. M. and Krivitsky, P. (2007). Estimating the Integrated Likelihood via Posterior Simulation Using the Harmonic Mean Identity (with discussion). In Bayesian Statistics 8 ( J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West, eds.) 1-45. Oxford University Press. · Zbl 1252.62038
[133] Raiffa, H. and Schlaifer, R. (2000). Applied Statistical Decision Theory . John Wiley & Sons. · Zbl 0952.62008
[134] Rasmussen, C. E. and Ghahramani, Z. (2003). Bayesian Monte Carlo. In Advances in Neural Information Processing Systems 15 ( S. Becker, S. Thrun and K. Obermayer, eds.) 489-496. MIT Press, Cambridge, MA.
[135] Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning . The MIT Press. · Zbl 1177.68165
[136] Rasmussen, C. E., Neal, R. M., Hinton, G. E., van Camp, D., Revow, M., Ghahramani, Z., Kustra, R. and Tibshirani, R. (1996). The DELVE Manual [online]. Version 1.1. Available at ftp://ftp.cs.utoronto.ca/pub/neuron/delve/doc/manual.ps.gz.
[137] Rencher, A. C. and Pun, F. C. (1980). Inflation of \(R^{2}\) in Best Subset Regression. Technometrics 22 49-53. · Zbl 0438.62058 · doi:10.2307/1268382
[138] Reunanen, J. (2003). Overfitting in Making Comparisons Between Variable Selection Methods. Journal of Machine Learning Research 3 1371-1382. · Zbl 1102.68635 · doi:10.1162/153244303322753715
[139] Richardson, S. (2002). Discussion of ‘Bayesian measures of model complexity and fit’ by Spiegelhalter et al. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 64 626-627. · Zbl 1067.62010 · doi:10.1111/1467-9868.00353
[140] Robert, C. P. (1996). Intrinsic losses. Theory and decision 40 191-214. · Zbl 0848.90010 · doi:10.1007/BF00133173
[141] Robert, C. P. (2001). The Bayesian Choice: from Decision-Theoretic Motivations to Computational Implementation , 2nd ed. Springer. · Zbl 0980.62005
[142] Robert, C. P. and Wraith, D. (2009). Computational methods for Bayesian model choice. In The 29th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering . AIP Proceedings 1193 251-262.
[143] Robins, J. M., van Der Vaart, A. and Ventura, V. (2000). Asymptotic Distribution of P Values in Composite Null Models. Journal of the American Statistical Association 95 1143-1156. · Zbl 1072.62522 · doi:10.2307/2669750
[144] Rubin, D. B. (1981). The Bayesian Bootstrap. Annals of Statistics 9 130-134. · doi:10.1214/aos/1176345338
[145] Rubin, D. B. (1984). Bayesianly Justifiable and Relevant Frequency Calculations for the Applied Statistician. Annals of Statistics 12 1151-1172. · Zbl 0555.62010 · doi:10.1214/aos/1176346785
[146] Rueda, R. (1992). A Bayesian alternative to parametric hypothesis testing. Test 1 61-67. · Zbl 0764.62003 · doi:10.1007/BF02562662
[147] Sawa, T. (1978). Information Criteria for Discriminating Among Alternative Regression Models. Econometrica 46 1273-1291. · Zbl 0393.62025 · doi:10.2307/1913828
[148] Shao, J. (1993). Linear Model Selection by Cross-Validation. Journal of the American Statistical Association 88 486-494. · Zbl 0773.62051 · doi:10.2307/2290328
[149] Shen, X., Huang, H.-C. and Ye, J. (2004). Inference after Model Selection. Journal of the American Statistical Association 99 751-762. · Zbl 1117.62423 · doi:10.1198/016214504000001097
[150] Shibata, R. (1989). Statistical aspects of model selection. In From data to model ( J. C. Willems, ed.) 215-240. Springer-Verlag.
[151] Shimodaira, H. (2000). Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference 90 227-244. · Zbl 0958.62011 · doi:10.1016/S0378-3758(00)00115-4
[152] Sinha, D., Chen, M.-H. and Ghosh, S. K. (1999). Bayesian Analysis and Model Selection for Interval-Censored Survival Data. Biometrics 55 585-590. · Zbl 1059.62715 · doi:10.1111/j.0006-341X.1999.00585.x
[153] Skare, Ø., Bølviken, E. and Holden, L. (2003). Improved sampling-importance resampling and reduced bias importance sampling. Scandinavian Journal of Statistics 30 719-737. · Zbl 1055.65019 · doi:10.1111/1467-9469.00360
[154] Spiegelhalter, D. J., Best, N. G., Carlin, B. P. and van der Linde, A. (2002). Bayesian measures of model complexity and fit (with discussion). Journal of the Royal Statistical Society. Series B (Statistical Methodology) 64 583-639. · Zbl 1067.62010 · doi:10.1111/1467-9868.00353
[155] Stern, H. S. and Cressie, N. (2000). Posterior predictive model checks for disease mapping models. Statistics in Medicine 19 2377-2397.
[156] Stone, M. (1974). Cross-Validatory Choice and Assessment of Statistical Predictions (with discussion). Journal of the Royal Statistical Society. Series B (Methodological) 36 111-147. · Zbl 0308.62063
[157] Stone, M. (1977). An Asymptotic Equivalence of Choice of Model by Cross-Validation and Akaike’s Criterion. Journal of the Royal Statistical Society. Series B (Methodological) 39 44-47. · Zbl 0355.62002
[158] Sugiyama, M. and Müller, K.-R. (2005). Input-dependent estimation of generalization error under covariate shift. Statistics & Decisions 23 249-279. · Zbl 1117.62069
[159] Sugiyama, M., Krauledat, M. and Müller, K.-R. (2007). Covariate Shift Adaptation by Importance Weighted Cross Validation. Journal of Machine Learning Research 8 985-1005. · Zbl 1222.68313
[160] Sundararajan, S. and Keerthi, S. S. (2001). Predictive Approaches for Choosing Hyperparameters in Gaussian Processes. Neural Computation 13 1103-1118. · Zbl 1108.62327 · doi:10.1162/08997660151134343
[161] Takeuchi, K. (1976). Distribution of Informational Statistics and a Criterion of Model Fitting (in Japanese). Suri-Kagaku (Mathematical Sciences) 153 12-18.
[162] Tibshirani, R. (1996). Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological) 58 267-288. · Zbl 0850.62538
[163] Tibshirani, R. J. and Tibshirani, R. (2009). A bias correction for the minimum error rate in cross-validation. Annals of Applied Statistics 3 822-829. · Zbl 1166.62311 · doi:10.1214/08-AOAS224
[164] Tierney, L. and Kadane, J. B. (1986). Accurate Approximations for Posterior Moments and Marginal Densities. Journal of the American Statistical Association 81 82-86. · Zbl 0587.62067 · doi:10.2307/2287970
[165] Tran, M.-N., Nott, D. J. and Leng, C. (2011). The predictive Lasso. Statistics and Computing 1-16. · Zbl 1252.62075 · doi:10.1007/s11222-011-9279-3
[166] Trottini, M. and Spezzaferri, F. (2002). A generalized predictive criterion for model selection. The Canadian Journal of Statistics 30 79-96. · Zbl 1125.62306 · doi:10.2307/3315866
[167] Vanhatalo, J., Pietiläinen, V. and Vehtari, A. (2010). Approximate inference for disease mapping with sparse Gaussian processes. Statistics in Medicine 29 1580-1607.
[168] Vannucci, M., Brown, P. J. and Fearn, T. (2003). A decision theoretical approach to wavelet regression on curves with a high number of regressors. Journal of Statistical Planning and Inference 112 195-212. · Zbl 1032.62004 · doi:10.1016/S0378-3758(02)00333-6
[169] Varma, S. and Simon, R. (2006). Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics 7 91. · www.ncbi.nlm.nih.gov
[170] Vehtari, A. (2002). Discussion of “Bayesian measures of model complexity and fit” by Spiegelhalter et al. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 64 620. · Zbl 1067.62010 · doi:10.1111/1467-9868.00353
[171] Vehtari, A. and Lampinen, J. (2002). Bayesian Model Assessment and Comparison Using Cross-Validation Predictive Densities. Neural Computation 14 2439-2468. · Zbl 1002.62029 · doi:10.1162/08997660260293292
[172] Vehtari, A. and Lampinen, J. (2004). Model Selection via Predictive Explanatory Power. Technical Report No. B38, Helsinki University of Technology, Laboratory of Computational Engineering. · Zbl 1002.62029
[173] Vlachos, P. K. and Gelfand, A. E. (2003). On the Calibration of Bayesian Model Choice Criteria. Journal of Statistical Planning and Inference 111 223-234. · Zbl 1033.62028 · doi:10.1016/S0378-3758(02)00304-X
[174] Watanabe, S. (2009). Algebraic Geometry and Statistical Learning Theory . Cambridge University Press. · Zbl 1180.93108 · doi:10.1017/CBO9780511800474
[175] Watanabe, S. (2010a). Asymptotic Equivalence of Bayes Cross Validation and Widely Applicable Information Criterion in Singular Learning Theory. Journal of Machine Learning Research 11 3571-3594. · Zbl 1242.62024 · www.jmlr.org
[176] Watanabe, S. (2010b). Equations of states in singular statistical estimation. Neural Networks 23 20-34.
[177] Watanabe, S. (2010c). A limit theorem in singular regression problem. Advanced Studies of Pure Mathematics 57 473-492. · Zbl 1210.62102
[178] Weng, C.-S. (1989). On a Second-Order Asymptotic Property of the Bayesian Bootstrap Mean. Annals of Statistics 17 705-710. · Zbl 0672.62027 · doi:10.1214/aos/1176347136
[179] Yang, Y. (2005). Can the Strengths of AIC and BIC Be Shared? A Conflict between Model Identification and Regression Estimation. Biometrika 92 937-950. · Zbl 1151.62301 · doi:10.1093/biomet/92.4.937
[180] Yang, Y. (2007). Consistency of Cross Validation for Comparing Regression Procedures. The Annals of Statistics 35 2450-2473. · Zbl 1129.62039 · doi:10.1214/009053607000000514 · euclid:aos/1201012968
[181] Young, A. S. (1987a). On a Bayesian criterion for choosing predictive sub-models in linear regression. Metrika 34 325-339. · doi:10.1007/BF02613164
[182] Young, A. S. (1987b). On the information criterion for selecting regressors. Metrika 34 185-194. · doi:10.1007/BF02613148
[183] Zhu, L. and Carlin, B. P. (2000). Comparing hierarchical models for spatio-temporally misaligned data using the deviance information criterion. Statistics in Medicine 19 2265-2278.