zbMATH — the first resource for mathematics

Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem. (English) Zbl 1200.62020
Summary: This paper studies the multiplicity-correction effect of standard Bayesian variable-selection priors in linear regression. Our first goal is to clarify when, and how, multiplicity correction happens automatically in Bayesian analysis, and to distinguish this correction from the Bayesian Ockham’s-razor effect. Our second goal is to contrast empirical-Bayes and fully Bayesian approaches to variable selection through examples, theoretical results and simulations. Considerable differences between the two approaches are found. In particular, we prove a theorem that characterizes a surprising asymptotic discrepancy between fully Bayes and empirical Bayes. This discrepancy arises from a different source than the failure to account for hyperparameter uncertainty in the empirical-Bayes estimate. Indeed, even at the extreme, when the empirical-Bayes estimate converges asymptotically to the true variable-inclusion probability, the potential for a serious difference remains.
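
A minimal sketch of one standard way to see automatic multiplicity correction, under assumptions that go beyond the summary itself: each of m candidate variables is included independently with probability p, and p is either fixed at 1/2 or given a uniform prior and integrated out. The function names are hypothetical and this is not code from the paper. With p fixed, the prior odds against adding one more variable do not depend on m. With p uniform, a specific model containing k of m variables has prior probability 1/((m+1) C(m,k)), so the odds against moving from k to k+1 variables are (m-k)/(k+1), which grows with the number of candidates.

```python
from fractions import Fraction
from math import comb


def prior_fixed(m, k, p=Fraction(1, 2)):
    """Prior probability of one specific model with k of m variables,
    assuming a fixed inclusion probability p (illustrative assumption)."""
    return p**k * (1 - p) ** (m - k)


def prior_uniform(m, k):
    """Same quantity with p ~ Uniform(0, 1) integrated out:
    the integral of p^k (1-p)^(m-k) over [0, 1] equals 1 / ((m+1) * C(m, k))."""
    return Fraction(1, (m + 1) * comb(m, k))


def odds_against_adding(prior, m, k):
    """Prior odds of a specific k-variable model versus a specific
    (k+1)-variable model among m candidate variables."""
    return prior(m, k) / prior(m, k + 1)


if __name__ == "__main__":
    for m in (10, 100, 1000):
        fixed = odds_against_adding(prior_fixed, m, 1)
        uniform = odds_against_adding(prior_uniform, m, 1)
        print(f"m={m}: fixed p=1/2 odds={fixed}, uniform p odds={uniform}")
```

Exact rational arithmetic (`Fraction`) is used so that the odds can be compared without floating-point underflow for large m.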

MSC:
62F15 Bayesian inference
62C12 Empirical decision procedures; empirical Bayes procedures
62J05 Linear regression; mixed models
62J15 Paired and multiple comparisons; multiple testing
65C60 Computational problems in statistics (MSC2010)
Software:
EBayesThresh
References:
[1] Barbieri, M. and Berger, J. O. (2004). Optimal predictive model selection. Ann. Statist. 32 870-897. · Zbl 1092.62033
[2] Berger, J., Pericchi, L. and Varshavsky, J. (1998). Bayes factors and marginal distributions in invariant situations. Sankhyā Ser. A 60 307-321. · Zbl 0973.62017
[3] Berger, J. O. (1985). Statistical Decision Theory and Bayesian Analysis, 2nd ed. Springer, New York. · Zbl 0572.62008
[4] Berger, J. O. and Molina, G. (2005). Posterior model probabilities via path-based pairwise priors. Statist. Neerlandica 59 3-15. · Zbl 1069.62021
[5] Berry, D. (1988). Multiple comparisons, multiple tests, and data dredging: A Bayesian perspective. In Bayesian Statistics 3 (J. Bernardo, M. DeGroot, D. Lindley and A. Smith, eds.) 79-94. Oxford Univ. Press, New York. · Zbl 0706.62033
[6] Berry, D. and Hochberg, Y. (1999). Bayesian perspectives on multiple comparisons. J. Statist. Plann. Inference 82 215-277. · Zbl 1063.62527
[7] Bogdan, M., Ghosh, J. K. and Zak-Szatkowska, M. (2008). Selecting explanatory variables with the modified version of the Bayesian information criterion. Quality and Reliability Engineering International 24 627-641.
[8] Bogdan, M., Chakrabarti, A. and Ghosh, J. K. (2008). Optimal rules for multiple testing and sparse multiple regression. Technical Report I-18/08/P-003, Wrocław Univ. Technology.
[9] Bogdan, M., Ghosh, J. K. and Tokdar, S. T. (2008). A comparison of the Benjamini-Hochberg procedure with some Bayesian rules for multiple testing. In Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen 211-230. IMS, Beachwood, OH.
[10] Carlin, B. and Louis, T. (2000). Empirical Bayes: Past, present and future. J. Amer. Statist. Assoc. 95 1286-1289. · Zbl 1072.62511
[11] Carvalho, C. M. and Scott, J. G. (2009). Objective Bayesian model selection in Gaussian graphical models. Biometrika 96 497-512. · Zbl 1170.62020
[12] Casella, G. and Moreno, E. (2002). Objective Bayes variable selection. Technical Report 023, Univ. Florida.
[13] Cui, W. and George, E. I. (2008). Empirical Bayes vs. fully Bayes variable selection. J. Statist. Plann. Inference 138 888-900. · Zbl 1130.62007
[14] Do, K.-A., Müller, P. and Tang, F. (2005). A Bayesian mixture model for differential gene expression. J. Roy. Statist. Soc. Ser. C 54 627-644. · Zbl 05188702
[15] Eaton, M. (1989). Group Invariance Applications in Statistics. IMS, Hayward, CA. · Zbl 0749.62005
[16] Efron, B., Tibshirani, R., Storey, J. and Tusher, V. (2001). Empirical Bayes analysis of a microarray experiment. J. Amer. Statist. Assoc. 96 1151-1160. · Zbl 1036.62045
[17] Fernandez, C., Ley, E. and Steel, M. (2001). Model uncertainty in cross-country growth regressions. J. Appl. Econometrics 16 563-576.
[18] George, E. I. and Foster, D. P. (2000). Calibration and empirical Bayes variable selection. Biometrika 87 731-747. · Zbl 1029.62008
[19] Gopalan, R. and Berry, D. (1998). Bayesian multiple comparisons using Dirichlet process priors. J. Amer. Statist. Assoc. 93 1130-1139. · Zbl 1063.62530
[20] Gould, H. (1964). Sums of logarithms of binomial coefficients. Amer. Math. Monthly 71 55-58. · Zbl 0123.00201
[21] Jefferys, W. and Berger, J. (1992). Ockham’s razor and Bayesian analysis. American Scientist 80 64-72.
[22] Jeffreys, H. (1961). Theory of Probability, 3rd ed. Clarendon Press, Oxford. · Zbl 0116.34904
[23] Johnstone, I. and Silverman, B. W. (2004). Needles and straw in haystacks: Empirical-Bayes estimates of possibly sparse sequences. Ann. Statist. 32 1594-1649. · Zbl 1047.62008
[24] Ley, E. and Steel, M. F. (2009). On the effect of prior assumptions in Bayesian model averaging with applications to growth regression. J. Appl. Econometrics 24 651-674.
[25] Liang, F., Paulo, R., Molina, G., Clyde, M. and Berger, J. (2008). Mixtures of g-priors for Bayesian variable selection. J. Amer. Statist. Assoc. 103 410-423. · Zbl 1335.62026
[26] Meng, C. and Dempster, A. (1987). A Bayesian approach to the multiplicity problem for significance testing with binomial data. Biometrics 43 301-311.
[27] Sala-i-Martin, X., Doppelhofer, G. and Miller, R. I. (2004). Determinants of long-term growth: A Bayesian averaging of classical estimates (BACE) approach. American Economic Review 94 813-835.
[28] Scott, J. G. (2009). Nonparametric Bayesian multiple testing for longitudinal performance stratification. Ann. Appl. Statist. 3 1655-1674. · Zbl 1184.62156
[29] Scott, J. G. and Berger, J. O. (2006). An exploration of aspects of Bayesian multiple testing. J. Statist. Plann. Inference 136 2144-2162. · Zbl 1087.62039
[30] Scott, J. G. and Carvalho, C. M. (2008). Feature-inclusion stochastic search for Gaussian graphical models. J. Comput. Graph. Statist. 17 790-808.
[31] Waller, R. and Duncan, D. (1969). A Bayes rule for the symmetric multiple comparison problem. J. Amer. Statist. Assoc. 64 1484-1503.
[32] Westfall, P. H., Johnson, W. O. and Utts, J. M. (1997). A Bayesian perspective on the Bonferroni adjustment. Biometrika 84 419-427. · Zbl 0882.62025
[33] Zellner, A. (1986). On assessing prior distributions and Bayesian regression analysis with g-prior distributions. In Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti (P. Goel and A. Zellner, eds.) 233-243. North-Holland, Amsterdam. · Zbl 0655.62071
[34] Zellner, A. and Siow, A. (1980). Posterior odds ratios for selected regression hypotheses. In Bayesian Statistics: Proceedings of the First International Meeting held in Valencia (Spain) (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.) 585-603. Univ. Press, Valencia. · Zbl 0457.62004
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.