Vovk, Vladimir; Wang, Ruodu
Confidence and discoveries with \(e\)-values. (English) Zbl 07708435
Stat. Sci. 38, No. 2, 329-354 (2023).

Summary: We discuss systematically two versions of confidence regions: those based on \(p\)-values and those based on \(e\)-values, a recent alternative to \(p\)-values. Both versions can be applied to multiple hypothesis testing, and in this paper we are interested in procedures that control the number of false discoveries under arbitrary dependence between the base \(p\)- or \(e\)-values. We introduce a procedure that is based on \(e\)-values and show that it is efficient both computationally and statistically using simulated and real-world data sets. Comparison with the corresponding standard procedure based on \(p\)-values is not straightforward, but there are indications that the new one performs significantly better in some situations.

Cited in 2 Documents
MSC: 62-XX Statistics
Keywords: Bayes factor; discovery matrix; discovery vector; hypothesis testing; multiple hypothesis testing
Software: CRAN; hommel; qvalue

References: [1] Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 289-300. · Zbl 0809.62014 [2] BENJAMINI, Y. and YEKUTIELI, D. (2001). The control of the false discovery rate in multiple testing under dependency. Ann. Statist. 29 1165-1188. · Zbl 1041.62061 · doi:10.1214/aos/1013699998 [3] BENJAMINI, Y., VEAUX, R. D. D., EFRON, B., EVANS, S., GLICKMAN, M., GRAUBARD, B. I., HE, X., MENG, X.-L., REID, N. et al. (2021). The ASA president’s task force statement on statistical significance and replicability. Ann. Appl. Stat. 15 1084-1085. · Zbl 1478.62008 [4] BERNARDO, J. M. and SMITH, A. F. M. (2000). Bayesian Theory. Wiley, Chichester. · Zbl 0943.62009 [5] Bernoulli, J. (1713). Ars Conjectandi. Thurnisius, Basel. [6] CASELLA, G. and BERGER, R. L. 
(2002). Statistical Inference, 2nd ed. Duxbury, Pacific Grove, CA. [7] Cournot, A.-A. (1843). Exposition de la Théorie des Chances et des Probabilités. Hachette, Paris. · Zbl 0174.51702 [8] COX, D. R. and HINKLEY, D. V. (1974). Theoretical Statistics. CRC Press, London. · Zbl 0334.62003 [9] DE FINETTI, B. (2017). Theory of Probability. Wiley Series in Probability and Statistics. Wiley, Chichester. · Zbl 1375.60008 · doi:10.1002/9781119286387 [10] DUBOIS, D. and PRADE, H. (1988). Possibility Theory. Plenum Press, New York. · Zbl 0703.68004 [11] FISHER, R. A. (1973). Statistical Methods and Scientific Inference, 3rd ed. Hafner, New York. · Zbl 0281.62002 [12] GÁCS, P. (2005). Uniform test of algorithmic randomness over a general space. Theoret. Comput. Sci. 341 91-137. · Zbl 1077.68038 · doi:10.1016/j.tcs.2005.03.054 [13] GENOVESE, C. R., ROEDER, K. and WASSERMAN, L. (2006). False discovery control with \(p\)-value weighting. Biometrika 93 509-524. · Zbl 1108.62070 · doi:10.1093/biomet/93.3.509 [14] Genovese, C. and Wasserman, L. (2004). A stochastic process approach to false discovery control. Ann. Statist. 32 1035-1061. · Zbl 1092.62065 · doi:10.1214/009053604000000283 [15] GOEMAN, J. J., HEMERIK, J. and SOLARI, A. (2021). Only closed testing procedures are admissible for controlling false discovery proportions. Ann. Statist. 49 1218-1238. · Zbl 1468.62244 · doi:10.1214/20-aos1999 [16] GOEMAN, J. J., MEIJER, R. and KREBS, T. (2019). \( \mathtt{hommel} \): Methods for closed testing with Simes inequality, in particular Hommel’s method. R package version 1.5, available on CRAN. [17] GOEMAN, J. J., ROSENBLATT, J. D. and NICHOLS, T. E. (2019). The harmonic mean p-value: Strong versus weak control, and the assumption of independence. Proc. Natl. Acad. Sci. USA 116 23382-23383. [18] GOEMAN, J. J. and SOLARI, A. (2011a). Multiple testing for exploratory research. Statist. Sci. 26 584-597. Correction: 28 464. · Zbl 1331.62369 · doi:10.1214/11-STS356 [19] GOEMAN, J. J. 
and SOLARI, A. (2011b). Multiple testing for exploratory research: Rejoinder. Statist. Sci. 26 608-612. · Zbl 1331.62369 [20] Goeman, J. J., Meijer, R. J., Krebs, T. J. P. and Solari, A. (2019). Simultaneous control of all false discovery proportions in large-scale multiple hypothesis testing. Biometrika 106 841-856. · Zbl 1435.62446 · doi:10.1093/biomet/asz041 [21] GÖNEN, M., JOHNSON, W. O., LU, Y. and WESTFALL, P. H. (2005). The Bayesian two-sample \(t\) test. Amer. Statist. 59 252-257. · doi:10.1198/000313005X55233 [22] GÖNEN, M., JOHNSON, W. O., LU, Y. and WESTFALL, P. H. (2019). Comparing objective and subjective Bayes factors for the two-sample comparison: The classification theorem in action. Amer. Statist. 73 22-31. · Zbl 07588119 · doi:10.1080/00031305.2017.1322142 [23] GOOD, I. J. (1958). Significance tests in parallel and in series. J. Amer. Statist. Assoc. 53 799-813. · Zbl 0092.36205 [24] GRÜNWALD, P., DE HEIDE, R. and KOOLEN, W. M. (2020). Safe testing. Technical Report. Available at arXiv:1906.07801 [math.ST]. [25] GRÜNWALD, P. and VAN OMMEN, T. (2017). Inconsistency of Bayesian inference for misspecified linear models, and a proposal for repairing it. Bayesian Anal. 12 1069-1103. · Zbl 1384.62088 · doi:10.1214/17-BA1085 [26] GUINDANI, M., MÜLLER, P. and ZHANG, S. (2009). A Bayesian discovery procedure. J. R. Stat. Soc. Ser. B. Stat. Methodol. 71 905-925. · Zbl 1411.62224 · doi:10.1111/j.1467-9868.2009.00714.x [27] HEDENFALK, I., DUGGAN, D., CHEN, Y., RADMACHER, M., BITTNER, M., SIMON, R., MELTZER, P., GUSTERSON, B., ESTELLER, M. et al. (2001). Gene-expression profiles in hereditary breast cancer. N. Engl. J. Med. 344 539-548. [28] HELD, L. (2019). On the Bayesian interpretation of the harmonic mean \(p\)-value. Proc. Natl. Acad. Sci. USA 116 5855-5856. · Zbl 1431.62238 · doi:10.1073/pnas.1900671116 [29] HEMERIK, J. and GOEMAN, J. (2018). Exact testing with random permutations. TEST 27 811-825. 
· Zbl 1420.62188 · doi:10.1007/s11749-017-0571-1 [30] Hemerik, J., Solari, A. and Goeman, J. J. (2019). Permutation-based simultaneous confidence bounds for the false discovery proportion. Biometrika 106 635-649. · Zbl 1464.62276 · doi:10.1093/biomet/asz021 [31] Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6 65-70. · Zbl 0402.62058 [32] HOMMEL, G. (1986). Multiple test procedures for arbitrary dependence structures. Metrika 33 321-336. · Zbl 0603.62026 · doi:10.1007/BF01894765 [33] Jeffreys, H. (1961). Theory of Probability, 3rd ed. Clarendon Press, Oxford. · Zbl 0116.34904 [34] KLEENE, S. C. (1967). Mathematical Logic. Wiley, New York. · Zbl 0149.24309 [35] LEHMANN, E. L. (2011). Fisher, Neyman, and the Creation of Classical Statistics. Springer, New York. · Zbl 1277.62028 · doi:10.1007/978-1-4419-9500-1 [36] LEHMANN, E. L. and ROMANO, J. P. (2022). Testing Statistical Hypotheses, 4th ed. Springer, Cham. · Zbl 1491.62003 [37] LEVIN, L. A. (1976). Uniform tests of randomness. Sov. Math., Dokl. 17 337-340. · Zbl 0341.94013 [38] LY, A., VERHAGEN, J. and WAGENMAKERS, E.-J. (2016). Harold Jeffreys’s default Bayes factor hypothesis tests: Explanation, extension, and application in psychology. J. Math. Psych. 72 19-32. · Zbl 1357.62117 · doi:10.1016/j.jmp.2015.06.004 [39] NEYMAN, J. (1934). On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection (with discussion). J. R. Stat. Soc. 97 558-625. · Zbl 0010.07201 [40] NEYMAN, J. (1937). Outline of a theory of statistical estimation based on the classical theory of probability. Philos. Trans. R. Soc. Lond. A 236 333-380. · Zbl 0017.12403 [41] NEYMAN, J. (1941). Fiducial argument and the theory of confidence intervals. Biometrika 32 128-150. · Zbl 0063.05944 · doi:10.1093/biomet/32.2.128 [42] ROMANO, J. P. and WOLF, M. (2007). Control of generalized error rates in multiple testing. Ann. Statist. 35 1378-1408. 
· Zbl 1127.62063 · doi:10.1214/009053606000001622 [43] ROUDER, J. N., SPECKMAN, P. L., SUN, D. and MOREY, R. D. (2009). Bayesian \(t\) tests for accepting and rejecting the null hypothesis. Psychon. Bull. Rev. 16 225-237. [44] SARKAR, S. K. (2011). Simes’ test in multiple testing. In International Encyclopedia of Statistical Science (M. Lovric, ed.) 1325-1327. Springer, Berlin. [45] Schervish, M. J. (1995). Theory of Statistics. Springer Series in Statistics. Springer, New York. · Zbl 0834.62002 · doi:10.1007/978-1-4612-4250-5 [46] Sellke, T., Bayarri, M. J. and Berger, J. O. (2001). Calibration of \(p\) values for testing precise null hypotheses. Amer. Statist. 55 62-71. · doi:10.1198/000313001300339950 [47] Shafer, G. (1976). A Mathematical Theory of Evidence. Princeton Univ. Press, Princeton, NJ. · Zbl 0359.62002 [48] SHAFER, G. (2007). From Cournot’s principle to market efficiency. In Augustin Cournot: Modelling Economics (J.-P. Touffut, ed.) 55-95. Edward Elgar, Cheltenham. [49] SHAFER, G. (2021). The language of betting as a strategy for statistical and scientific communication (with discussion). J. R. Stat. Soc. Ser. A 184 407-478. [50] SHAFER, G. (2022). Bayesian, fiducial, frequentist. In Handbook on Bayesian, Fiducial and Frequentist (BFF) Inferences (J. Berger, X.-L. Meng, N. Reid and M. Xie, eds.) CRC Press, Boca Raton. (to appear). [51] Shafer, G. and Vovk, V. (2019). Game-Theoretic Foundations for Probability and Finance. Wiley, Hoboken, NJ. · Zbl 1422.91026 [52] Simes, R. J. (1986). An improved Bonferroni procedure for multiple tests of significance. Biometrika 73 751-754. · Zbl 0613.62067 · doi:10.1093/biomet/73.3.751 [53] STOREY, J. D., DAI, J. Y. and LEEK, J. T. (2007). The optimal discovery procedure for large-scale significance testing, with applications to comparative microarray experiments. Biostatistics 8 414-432. · Zbl 1213.62175 [54] Storey, J. D. and Tibshirani, R. (2003). Statistical significance for genomewide studies. Proc. Natl. Acad. 
Sci. USA 100 9440-9445. · Zbl 1130.62385 · doi:10.1073/pnas.1530509100 [55] STOREY, J. D., BASS, A. J., DABNEY, A. and ROBINSON, D. (2019). \( \mathtt{qvalue} \): Q-value estimation for false discovery rate control. R package version 2.18.0, available on Bioconductor. [56] STUART, A., ORD, J. K. and ARNOLD, S. (1999). Kendall’s Advanced Theory of Statistics. Vol. 2A: Classical Inference and the Linear Model, 6th ed. Arnold, London. [57] TIAN, J., CHEN, X., KATSEVICH, E., GOEMAN, J. and RAMDAS, A. (2021). Large-scale simultaneous inference under dependence. Technical Report. Scandinavian Journal of Statistics. Available at arXiv:2102.11253. [58] Vovk, V., Gammerman, A. and Shafer, G. (2005). Algorithmic Learning in a Random World. Springer, New York. · Zbl 1105.68052 [59] Vovk, V. G. and V’yugin, V. V. (1993). On the empirical validity of the Bayesian method. J. Roy. Statist. Soc. Ser. B 55 253-266. · Zbl 0788.62006 [60] VOVK, V. and WANG, R. (2020a). Combining \(p\)-values via averaging. Biometrika 107 791-808. · Zbl 1457.62078 · doi:10.1093/biomet/asaa027 [61] VOVK, V. and WANG, R. (2020b). True and false discoveries with independent e-values. Technical Report. Available at arXiv:2003.00593 [stat.ME]. [62] VOVK, V. and WANG, R. (2021). E-values: Calibration, combination and applications. Ann. Statist. 49 1736-1754. · Zbl 1475.62087 · doi:10.1214/20-AOS2020 [63] VOVK, V., WANG, B. and WANG, R. (2022). Admissible ways of merging p-values under arbitrary dependence. Ann. Statist. 50 351-375. · Zbl 1486.62057 · doi:10.1214/21-aos2109 [64] Wald, A. (1950). Statistical Decision Functions. Wiley, New York. · Zbl 0040.36402 [65] WANG, M. and LIU, G. (2016). A simple two-sample Bayesian \(t\)-test for hypothesis testing. Amer. Statist. 70 195-201. · Zbl 07665872 · doi:10.1080/00031305.2015.1093027 [66] WANG, R. and RAMDAS, A. (2022). False discovery rate control with e-values. J. R. Stat. Soc. Ser. B. Stat. Methodol. 84 822-852. · Zbl 07909597 [67] WIGGINS, G. A. 
R., WALKER, L. C. and PEARSON, J. F. (2020). Genome-wide gene expression analyses of BRCA1- and BRCA2-associated breast and ovarian tumours. Cancers 12 3015. [68] WILSON, D. J. (2019). The harmonic mean \(p\)-value for combining dependent tests. Proc. Natl. Acad. Sci. USA 116 1195-1200. · Zbl 1416.62303 · doi:10.1073/pnas.1814092116

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or perfect matching.