
QuickMMCTest: quick multiple Monte Carlo testing. (English) Zbl 06737699

Summary: Multiple hypothesis testing is widely used to evaluate scientific studies involving statistical tests. However, for many of these tests, \(p\) values are not available and are thus often approximated using Monte Carlo tests such as permutation tests or bootstrap tests. This article presents a simple algorithm based on Thompson Sampling to test multiple hypotheses. It works with arbitrary multiple testing procedures, in particular with step-up and step-down procedures. Its main feature is to sequentially allocate Monte Carlo effort, generating more Monte Carlo samples for tests whose decisions are so far less certain. A simulation study demonstrates that for a low computational effort, the new approach yields a higher power and a higher degree of reproducibility of its results than previously suggested methods.
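The sequential allocation idea in the summary can be illustrated with a small sketch. This is not the authors' QuickMMCTest implementation; it is a simplified illustration in the same spirit, under several assumptions of ours: each hypothesis has a sampler returning whether a simulated statistic exceeds the observed one, each unknown \(p\) value gets a Beta(1 + exceedances, 1 + non-exceedances) posterior, the multiple testing procedure is Benjamini-Hochberg, and the allocation weight of a hypothesis is the number of posterior draws on which its decision disagrees with the majority. The names `benjamini_hochberg`, `allocate_and_test`, `samplers`, `batch`, `posterior_draws`, and `init` are hypothetical.

```python
import random


def benjamini_hochberg(pvals, alpha=0.1):
    """Return a list of booleans: True where the BH step-up procedure rejects."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            cutoff = rank
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


def allocate_and_test(samplers, total_samples, alpha=0.1, batch=100,
                      posterior_draws=20, init=10, rng=None):
    """Spend roughly total_samples Monte Carlo draws across hypotheses.

    samplers[i](rng) must return True when a simulated statistic for
    hypothesis i is at least as extreme as the observed one (assumption).
    At least init draws per hypothesis are always spent.
    """
    rng = rng or random.Random(0)
    m = len(samplers)
    exceed = [0] * m  # exceedance counts per hypothesis
    drawn = [0] * m   # Monte Carlo samples drawn per hypothesis

    def draw(i, n):
        for _ in range(n):
            if samplers[i](rng):
                exceed[i] += 1
        drawn[i] += n

    for i in range(m):
        draw(i, init)
    used = m * init

    while used < total_samples:
        # Draw p-value vectors from the Beta posteriors and count how often
        # each hypothesis is rejected by the multiple testing procedure.
        votes = [0] * m
        for _ in range(posterior_draws):
            p = [rng.betavariate(1 + exceed[i], 1 + drawn[i] - exceed[i])
                 for i in range(m)]
            for i, r in enumerate(benjamini_hochberg(p, alpha)):
                votes[i] += r
        # A decision is uncertain when the votes are split; weight by the
        # size of the minority vote (our assumed weighting rule).
        weights = [min(v, posterior_draws - v) for v in votes]
        if sum(weights) == 0:
            weights = [1] * m  # all decisions look stable: spread evenly
        n = min(batch, total_samples - used)
        for i in rng.choices(range(m), weights=weights, k=n):
            draw(i, 1)
        used += n

    # Final decisions from the usual (exceed + 1) / (drawn + 1) estimates.
    p_hat = [(exceed[i] + 1) / (drawn[i] + 1) for i in range(m)]
    return benjamini_hochberg(p_hat, alpha), drawn
```

A call such as `allocate_and_test([lambda rng: rng.random() < 0.5, lambda rng: rng.random() < 0.001], total_samples=2000)` returns the rejection decisions together with the number of samples spent on each hypothesis, so one can inspect where the effort was concentrated.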

MSC:

62-XX Statistics
