Asymptotic optimality of the Westfall-Young permutation procedure for multiple testing under dependence. (English) Zbl 1246.62124

Summary: Test statistics are often strongly dependent in large-scale multiple testing applications. Most corrections for multiplicity are unduly conservative for correlated test statistics, resulting in a loss of power to detect true positives. We show that the permutation method of P. H. Westfall and S. S. Young [see: Resampling-based multiple testing: examples and methods for p-value adjustment. New York, NY: Wiley (1992; Zbl 0850.62368)] has asymptotically optimal power for a broad class of testing problems with a block-dependence and sparsity structure among the tests, when the number of tests tends to infinity.
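For orientation, the permutation idea behind the method can be sketched in a few lines of Python. This is only an illustrative sketch of one common single-step max-statistic ("maxT") variant, not the exact procedure analysed in the paper. The two-group design, the Welch-type statistic, and the names `welch_abs_t` and `maxt_adjusted_pvalues` are assumptions introduced for the example. The essential point is that every permutation relabels all tests jointly, so the dependence among the test statistics carries over into the null distribution of the maximum.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_abs_t(col, labels):
    """Absolute Welch-type two-sample statistic for one test (assumed statistic)."""
    a = [x for x, g in zip(col, labels) if g]
    b = [x for x, g in zip(col, labels) if not g]
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se


def maxt_adjusted_pvalues(columns, labels, n_perm=1000, seed=0):
    """Single-step maxT adjusted p-values (hypothetical helper).

    columns: one list of observations per test, all in the same sample order.
    labels:  one boolean per sample marking group membership.
    """
    rng = random.Random(seed)
    observed = [welch_abs_t(col, labels) for col in columns]
    # Start at 1 so the observed labelling counts as one permutation.
    exceed = [1] * len(columns)
    perm = list(labels)
    for _ in range(n_perm):
        # One shuffle is shared by all tests, which preserves their dependence.
        rng.shuffle(perm)
        max_stat = max(welch_abs_t(col, perm) for col in columns)
        for j, t in enumerate(observed):
            if max_stat >= t:
                exceed[j] += 1
    return [e / (n_perm + 1) for e in exceed]
```

Under these assumptions, rejecting every hypothesis whose adjusted p-value is at most a chosen level alpha is the usual way such adjusted p-values are used to control the familywise error rate. Because the threshold comes from the permutation distribution of the maximum, strongly correlated tests are penalised less than under a Bonferroni-type correction.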


62G10 Nonparametric hypothesis testing
62J15 Paired and multiple comparisons; multiple testing




[1] Becker, T. and Knapp, M. (2004). A powerful strategy to account for multiple testing in the context of haplotype analysis. The American Journal of Human Genetics 75 561-570.
[2] Benjamini, Y., Krieger, A. M. and Yekutieli, D. (2006). Adaptive linear step-up procedures that control the false discovery rate. Biometrika 93 491-507. · Zbl 1108.62069 · doi:10.1093/biomet/93.3.491
[3] Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Ann. Statist. 29 1165-1188. · Zbl 1041.62061 · doi:10.1214/aos/1013699998
[4] Blanchard, G. and Roquain, É. (2009). Adaptive false discovery rate control under independence and dependence. J. Mach. Learn. Res. 10 2837-2871. · Zbl 1235.62093
[5] Bond, G. L., Hu, W. and Levine, A. (2005). A single nucleotide polymorphism in the MDM2 gene: From a molecular and cellular explanation to clinical effect. Cancer Research 65 5481-5484.
[6] Cheung, V. G., Spielman, R. S., Ewens, K. G., Weber, T. M., Morley, M. and Burdick, J. T. (2005). Mapping determinants of human gene expression by regional and genome-wide association. Nature 437 1365-1369.
[7] Clarke, S. and Hall, P. (2009). Robustness of multiple testing procedures against dependence. Ann. Statist. 37 332-358. · Zbl 1155.62031 · doi:10.1214/07-AOS557
[8] Dudoit, S., Shaffer, J. P. and Boldrick, J. C. (2003). Multiple hypothesis testing in microarray experiments. Statist. Sci. 18 71-103. · Zbl 1048.62099 · doi:10.1214/ss/1056397487
[9] Dudoit, S. and van der Laan, M. J. (2008). Multiple Testing Procedures with Applications to Genomics. Springer, New York. · Zbl 1261.62014
[10] Efron, B. (2007). Correlation and large-scale simultaneous significance testing. J. Amer. Statist. Assoc. 102 93-103. · Zbl 1284.62340
[11] Ge, Y., Dudoit, S. and Speed, T. P. (2003). Resampling-based multiple testing for microarray data analysis. Test 12 1-77. · Zbl 1056.62117 · doi:10.1007/BF02595811
[12] Genovese, C. R., Roeder, K. and Wasserman, L. (2006). False discovery control with p-value weighting. Biometrika 93 509-524. · Zbl 1108.62070 · doi:10.1093/biomet/93.3.509
[13] Goeman, J. J. and Solari, A. (2010). The sequential rejection principle of familywise error control. Ann. Statist. 38 3782-3810. · Zbl 1204.62140 · doi:10.1214/10-AOS829
[14] Good, P. I. (2011). Permutation tests. In Analyzing the Large Number of Variables in Biomedical and Satellite Imagery 5-20. Wiley, Hoboken, NJ.
[15] Goode, E. L., Dunning, A. M., Kuschel, B., Healey, C. S., Day, N. E., Ponder, B. A. J., Easton, D. F. and Pharoah, P. P. D. (2002). Effect of germ-line genetic variation on breast cancer survival in a population-based study. Cancer Research 62 3052-3057.
[16] Hall, P. and Jin, J. (2008). Properties of higher criticism under strong dependence. Ann. Statist. 36 381-402. · Zbl 1139.62049 · doi:10.1214/009053607000000767
[17] Hall, P. and Jin, J. (2010). Innovated higher criticism for detecting sparse signals in correlated noise. Ann. Statist. 38 1686-1732. · Zbl 1189.62080 · doi:10.1214/09-AOS764
[18] Hirschhorn, J. N. and Daly, M. J. (2005). Genome-wide association studies for common diseases and complex traits. Nature Reviews Genetics 6 95-108.
[19] Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6 65-70. · Zbl 0402.62058
[20] Kruglyak, L. (1999). Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nature Genetics 22 139-144.
[21] Liang, C.-L., Rice, J. A., de Pater, I., Alcock, C., Axelrod, T., Wang, A. and Marshall, S. (2004). Statistical methods for detecting stellar occultations by Kuiper belt objects: The Taiwanese-American occultation survey. Statist. Sci. 19 265-274. · Zbl 1100.85501 · doi:10.1214/088342304000000378
[22] Ludbrook, J. and Dudley, H. (1998). Why permutation tests are superior to t and F tests in biomedical research. Amer. Statist. 52 127-132.
[23] Marchini, J., Donnelly, P. and Cardon, L. R. (2005). Genome-wide strategies for detecting multiple loci that influence complex diseases. Nature Genetics 37 413-417.
[24] McCarthy, M. I., Abecasis, G. R., Cardon, L. R., Goldstein, D. B., Little, J., Ioannidis, J. P. A. and Hirschhorn, J. N. (2008). Genome-wide association studies for complex traits: Consensus, uncertainty and challenges. Nature Reviews Genetics 9 356-369.
[25] Meinshausen, N. (2006). False discovery control for multiple tests of association under general dependence. Scand. J. Stat. 33 227-237. · Zbl 1125.62077 · doi:10.1111/j.1467-9469.2005.00488.x
[26] Meinshausen, N. and Rice, J. (2006). Estimating the proportion of false null hypotheses among a large number of independently tested hypotheses. Ann. Statist. 34 373-393. · Zbl 1091.62059 · doi:10.1214/009053605000000741
[27] Reiner, A., Yekutieli, D. and Benjamini, Y. (2003). Identifying differentially expressed genes using false discovery rate controlling procedures. Bioinformatics 19 368-375.
[28] Roeder, K. and Wasserman, L. (2009). Genome-wide significance levels and weighted hypothesis testing. Statist. Sci. 24 398-413. · Zbl 1329.62435
[29] Romano, J. P. and Wolf, M. (2005). Exact and approximate stepdown methods for multiple hypothesis testing. J. Amer. Statist. Assoc. 100 94-108. · Zbl 1117.62416 · doi:10.1198/016214504000000539
[30] Sun, W. and Cai, T. T. (2009). Large-scale multiple testing under dependence. J. R. Stat. Soc. Ser. B Stat. Methodol. 71 393-424. · Zbl 1248.62005
[31] Westfall, P. H. and Troendle, J. F. (2008). Multiple testing with minimal assumptions. Biom. J. 50 745-755.
[32] Westfall, P. H. and Young, S. S. (1989). p-value adjustments for multiple tests in multivariate binomial models. J. Amer. Statist. Assoc. 84 780-786.
[33] Westfall, P. H. and Young, S. S. (1993). Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment. Wiley, New York. · Zbl 0850.62368
[34] Westfall, P. H., Zaykin, D. V. and Young, S. S. (2002). Multiple tests for genetic effects in association studies. In Biostatistical Methods: Methods in Molecular Biology (S. Looney, ed.) 184 143-168. Humana Press, Totowa, NJ.
[35] Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin 1 80-83.
[36] Winkelmann, J., Schormair, B., Lichtner, P., Ripke, S., Xiong, L., Jalilzadeh, S., Fulda, S., Pütz, B., Eckstein, G. and Hauk, S. et al. (2007). Genome-wide association study of restless legs syndrome identifies common variants in three genomic regions. Nature Genetics 39 1000-1006.
[37] Yekutieli, D. and Benjamini, Y. (1999). Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. J. Statist. Plann. Inference 82 171-196. · Zbl 1063.62563 · doi:10.1016/S0378-3758(99)00041-5