
Training samples in objective Bayesian model selection. (English) Zbl 1092.62034

Summary: Central to several objective approaches to Bayesian model selection is the use of training samples (subsets of the data), so as to allow utilization of improper objective priors. The most common prescription for choosing training samples is to choose them to be as small as possible, subject to yielding proper posteriors; these are called minimal training samples. When data can vary widely in terms of either information content or impact on the improper priors, use of minimal training samples can be inadequate. Important examples include certain cases of discrete data, the presence of censored observations, and certain situations involving linear models and explanatory variables. Such situations require more sophisticated methods of choosing training samples. A variety of such methods are developed in this paper, and successfully applied in challenging situations.
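To make the idea of a minimal training sample concrete, here is a small sketch of the arithmetic intrinsic Bayes factor in the form introduced by Berger and Pericchi (1996a). It is not one of the methods developed in the paper. The setting is an assumption chosen for illustration: M1 says x_i ~ N(0, 1), and M2 says x_i ~ N(mu, 1) with the improper prior pi(mu) = 1. In this toy case a single observation already yields a proper posterior for mu, so every single observation is a minimal training sample. The function names are hypothetical.

```python
import math

def log_m1(xs):
    # Log marginal likelihood under M1: x_i ~ N(0, 1), no free parameters.
    n = len(xs)
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * sum(x * x for x in xs)

def log_m2(xs):
    # Log "marginal" under M2: x_i ~ N(mu, 1) with improper prior pi(mu) = 1.
    # Integrating mu out analytically gives
    # (2*pi)^(-(n-1)/2) * n^(-1/2) * exp(-SS/2), with SS the centred sum of squares.
    n = len(xs)
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    return -0.5 * (n - 1) * math.log(2 * math.pi) - 0.5 * math.log(n) - 0.5 * ss

def arithmetic_ibf_21(xs):
    # Bayes factor of M2 to M1 from the full data, using the improper prior.
    # On its own it is defined only up to an arbitrary constant.
    b21_full = math.exp(log_m2(xs) - log_m1(xs))
    # Correction factors B12 computed on each minimal training sample
    # (here: each single observation), then averaged arithmetically.
    corrections = [math.exp(log_m1([x]) - log_m2([x])) for x in xs]
    return b21_full * sum(corrections) / len(corrections)

if __name__ == "__main__":
    data = [3.0, 3.2, 2.8, 3.1]  # made-up values, far from zero
    print(arithmetic_ibf_21(data))
```

The averaging step treats all minimal training samples alike. This is where the summary's concern applies: when observations differ widely in information content (censored or discrete data, for example), a fixed minimal size may not be adequate.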

MSC:

62F15 Bayesian inference
62J05 Linear regression; mixed models
62C10 Bayesian problems; characterization of Bayes procedures

References:

[1] Alqallaf, F. and Gustafson, P. (2001). On cross-validation of Bayesian models. Canad. J. Statist. 29 333–340. · Zbl 0974.62019 · doi:10.2307/3316081
[2] Beattie, S. D., Fong, D. K. H. and Lin, D. K. J. (2002). A two-stage Bayesian model selection strategy for supersaturated designs. Technometrics 44 55–63. · doi:10.1198/004017002753398326
[3] Berger, J. and Bernardo, J. (1992). On the development of reference priors. In Bayesian Statistics 4 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 35–60. Oxford Univ. Press.
[4] Berger, J. and Mortera, J. (1999). Default Bayes factors for nonnested hypothesis testing. J. Amer. Statist. Assoc. 94 542–554. · Zbl 0996.62018 · doi:10.2307/2670175
[5] Berger, J. and Pericchi, L. (1996a). The intrinsic Bayes factor for model selection and prediction. J. Amer. Statist. Assoc. 91 109–122. · Zbl 0870.62021 · doi:10.2307/2291387
[6] Berger, J. and Pericchi, L. (1996b). The intrinsic Bayes factor for linear models. In Bayesian Statistics 5 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 25–44. Oxford Univ. Press. · Zbl 0870.62021
[7] Berger, J. and Pericchi, L. (1996c). On the justification of default and intrinsic Bayes factors. In Modelling and Prediction (J. C. Lee, W. O. Johnson and A. Zellner, eds.) 276–293. Springer, New York. · Zbl 0895.62026
[8] Berger, J. and Pericchi, L. (1998). Accurate and stable Bayesian model selection: The median intrinsic Bayes factor. Sankhyā Ser. B 60 1–18. · Zbl 1081.62517
[9] Berger, J. and Pericchi, L. (2001). Objective Bayesian methods for model selection: Introduction and comparison (with discussion). In Model Selection (P. Lahiri, ed.). 135–207. IMS, Beachwood, OH. · doi:10.1214/lnms/1215540968
[10] Berger, J., Pericchi, L. and Varshavsky, J. (1998). Bayes factors and marginal distributions in invariant situations. Sankhyā Ser. A 60 307–321. · Zbl 0973.62017
[11] Bertolino, F. and Racugno, W. (1996). Is the intrinsic Bayes factor intrinsic? Metron 54 5–15. · Zbl 0881.62026
[12] Bertolino, F., Racugno, W. and Moreno, E. (2000). Bayesian model selection approach to analysis of variance under heteroscedasticity. The Statistician 49 503–517.
[13] Cano, J. A., Kessler, M. and Moreno, E. (2002). On intrinsic priors for nonnested models. Technical report, Univ. Granada. · Zbl 1069.62022
[14] Cox, D. R. and Oakes, D. (1984). Analysis of Survival Data. Chapman and Hall, London.
[15] De Santis, F., Mortera, J. and Nardi, A. (2001). Jeffreys priors for survival models with censored data. J. Statist. Plann. Inference 99 193–209. · Zbl 0989.62016 · doi:10.1016/S0378-3758(01)00080-5
[16] De Santis, F. and Spezzaferri, F. (1997). Alternative Bayes factors for model selection. Canad. J. Statist. 25 503–515. · Zbl 0894.62031 · doi:10.2307/3315344
[17] De Santis, F. and Spezzaferri, F. (1998a). Consistent fractional Bayes factor for linear models. Pubblicazioni Scientifiche del Dipartimento di Statistica, Probab. e Stat. Appl., Univ. di Roma, “La Sapienza,” Ser. A n. 19. · Zbl 0945.62030
[18] De Santis, F. and Spezzaferri, F. (1998b). Bayes factors and hierarchical models. J. Statist. Plann. Inference 74 323–342. · Zbl 0945.62030 · doi:10.1016/S0378-3758(98)00109-8
[19] De Santis, F. and Spezzaferri, F. (1999). Methods for default and robust Bayesian model comparison: The fractional Bayes factor approach. Internat. Statist. Rev. 67 267–286. · Zbl 0944.62027 · doi:10.2307/1403706
[20] de Vos, A. F. (1993). A fair comparison between regression models of different dimension. Technical report, The Free University, Amsterdam.
[21] Findley, D. F. (1991). Counterexamples to parsimony and BIC. Ann. Inst. Statist. Math. 43 505–514. · Zbl 0850.62648 · doi:10.1007/BF00053369
[22] Gehan, E. A. (1965). A generalized Wilcoxon test for comparing arbitrarily singly-censored samples. Biometrika 52 203–223. · Zbl 0133.41901 · doi:10.1093/biomet/52.1-2.203
[23] Gelfand, A. E., Dey, D. K. and Chang, H. (1992). Model determination using predictive distributions with implementation via sampling-based methods. In Bayesian Statistics 4 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 147–167. Oxford Univ. Press.
[24] Ghosh, J. K. and Samanta, T. (2002). Nonsubjective Bayes testing—an overview. J. Statist. Plann. Inference 103 205–223. · Zbl 0989.62017 · doi:10.1016/S0378-3758(01)00222-1
[25] Girón, F., Martínez, M. L. and Moreno, E. (2003). Bayesian analysis of matched pairs. J. Statist. Plann. Inference 113 49–66. · Zbl 1031.62022 · doi:10.1016/S0378-3758(01)00299-3
[26] Good, I. J. (1950). Probability and the Weighing of Evidence. Hafner, New York. · Zbl 0036.08402
[27] Iwaki, K. (1997). Posterior expected marginal likelihood for testing hypotheses. J. Econ. Asia Univ. 21 105–134.
[28] Iwaki, K. (1999). Noninformative priors for model comparison. Discussion Paper No. 53, Institute of Economic and Social Research, Asia Univ.
[29] Key, J. T., Pericchi, L. R. and Smith, A. F. M. (1999). Bayesian model choice: What and why? In Bayesian Statistics 6 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 343–370. Oxford Univ. Press. · Zbl 0956.62007
[30] Kim, S. and Sun, D. (2000). Intrinsic priors for model selection using an encompassing model with applications to censored failure time data. Lifetime Data Anal. 6 251–269. · Zbl 1079.62506 · doi:10.1023/A:1009641709382
[31] Lingham, R. and Sivaganesan, S. (1997). Testing hypotheses about the power law process under failure truncation using intrinsic Bayes factors. Ann. Inst. Statist. Math. 49 693–710. · Zbl 0908.62026 · doi:10.1023/A:1003218410136
[32] Lingham, R. and Sivaganesan, S. (1999). Intrinsic Bayes factor approach to a test for the power law process. J. Statist. Plann. Inference 77 195–220. · Zbl 1054.62530 · doi:10.1016/S0378-3758(98)00181-5
[33] Moreno, E., Bertolino, F. and Racugno, W. (1998). An intrinsic limiting procedure for model selection and hypothesis testing. J. Amer. Statist. Assoc. 93 1451–1460. · Zbl 1064.62513 · doi:10.2307/2670059
[34] Moreno, E., Bertolino, F. and Racugno, W. (1999). Default Bayesian analysis of the Behrens–Fisher problem. J. Statist. Plann. Inference 81 323–333. · Zbl 0941.62030 · doi:10.1016/S0378-3758(99)00070-1
[35] Moreno, E., Bertolino, F. and Racugno, W. (2001). Inference under partial prior information. Technical report, Univ. Granada. · Zbl 1034.62021
[36] Moreno, E., Girón, F. and Torres, F. (2004). Intrinsic priors for hypothesis testing in normal regression models. Rev. R. Acad. Cienc. Exactas Fís. Nat. (Esp.). · Zbl 1046.62026
[37] Moreno, E. and Liseo, B. (2003). A default Bayesian test for the number of components in a mixture. J. Statist. Plann. Inference 111 129–142. · Zbl 1033.62025 · doi:10.1016/S0378-3758(02)00294-X
[38] Moreno, E., Torres, F. and Casella, G. (2002). Testing equality of regression coefficients in heteroscedastic normal regression models. Technical report, Univ. Granada. · Zbl 1062.62047
[39] Neal, R. (2001). Transferring prior information between models using imaginary data. Technical Report 0108, Dept. Statistics, Univ. Toronto.
[40] O’Hagan, A. (1995). Fractional Bayes factors for model comparison (with discussion). J. Roy. Statist. Soc. Ser. B 57 99–138. · Zbl 0813.62026
[41] O’Hagan, A. (1997). Properties of intrinsic and fractional Bayes factors. Test 6 101–118. · Zbl 0891.62001 · doi:10.1007/BF02564428
[42] Paulo, R. (2002). Conditional frequentist testing and model validation. Ph.D. dissertation, Duke Univ.
[43] Pérez, J. M. (1998). Development of conventional prior distributions for model comparisons. Ph.D. dissertation, Purdue Univ.
[44] Pérez, J. M. and Berger, J. (2001). Analysis of mixture models using expected posterior priors, with application to classification of gamma ray bursts. In Bayesian Methods, with Applications to Science, Policy and Official Statistics (E. George and P. Nanopoulos, eds.) 401–410. Eurostat, Luxembourg.
[45] Pérez, J. M. and Berger, J. (2002). Expected posterior prior distributions for model selection. Biometrika 89 491–511. · Zbl 1036.62026 · doi:10.1093/biomet/89.3.491
[46] Pericchi, L. R., Fiteni, A. and Presa, E. (1993). The intrinsic Bayes factor explained by examples. Technical report, Dept. Estadística y Econometría, Universidad Carlos III, Madrid.
[47] Rodriguez, A. and Pericchi, L. R. (2001). Intrinsic Bayes factors for dynamic models. In Bayesian Methods, with Applications to Science, Policy and Official Statistics (E. George and P. Nanopoulos, eds.) 459–468. Eurostat, Luxembourg.
[48] Sivaganesan, S. and Lingham, R. (1999). Bayes factors for model selection for some diffusion processes under improper priors. Technical report, Dept. Mathematical Sciences, Univ. Cincinnati. · Zbl 1054.62530
[49] Smith, A. F. M. and Spiegelhalter, D. J. (1980). Bayes factors and choice criteria for linear models. J. Roy. Statist. Soc. Ser. B 42 213–220. · Zbl 0433.62045
[50] Sun, D. and Kim, S. (1997). Intrinsic priors for testing ordered exponential means. Technical report, Dept. Statistics, Univ. Missouri.
[51] Varshavsky, J. (1995). On the development of intrinsic Bayes factors. Ph.D. dissertation, Purdue Univ.
[52] Zellner, A. and Siow, A. (1980). Posterior odds ratios for selected regression hypotheses. In Bayesian Statistics 1 (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 585–603. Valencia Univ. Press. · Zbl 0435.00013
[53] Zellner, A. (1986). On assessing prior distributions and Bayesian regression analysis with \(g\)-prior distributions. In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti (P. K. Goel and A. Zellner, eds.) 233–243. North-Holland, Amsterdam. · Zbl 0655.62071