zbMATH — the first resource for mathematics

Robust tests in genome-wide scans under incomplete linkage disequilibrium. (English) Zbl 1329.62440
Summary: Under complete linkage disequilibrium (LD), robust tests often have greater power than Pearson’s chi-square test and trend tests for the analysis of case-control genetic association studies. Robust statistics have been used in candidate-gene and genome-wide association studies (GWAS) when the genetic model is unknown. We consider here a more general incomplete LD model, and examine the impact of penetrances at the marker locus when the genetic models are defined at the disease locus. Robust statistics are then reviewed and their efficiency and robustness are compared through simulations in GWAS of 300,000 markers under the incomplete LD model. Applications of several robust tests to the data of the Wellcome Trust Case-Control Consortium [“Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls”, Nature 447, 661–678 (2007)] are presented.
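For orientation, the kind of statistic the summary refers to can be sketched as follows. This is only an illustration, not the authors' code: the record does not say which robust tests the paper reviews, so the example assumes one that is common in the cited literature, the maximum of three Cochran-Armitage trend statistics computed with scores (0, x, 1) for x = 0 (recessive), 0.5 (additive) and 1 (dominant), often called MAX3. The function names `trend_z` and `max3` and the genotype counts are hypothetical.

```python
import math


def trend_z(cases, controls, x):
    """Cochran-Armitage trend statistic for a 2x3 genotype table.

    cases, controls: counts for genotypes with 0, 1 and 2 risk alleles.
    x: score of the heterozygote; scores are (0, x, 1).
    Returns a Z value that is approximately N(0, 1) under no association.
    """
    scores = (0.0, x, 1.0)
    n_cases = sum(cases)
    n_controls = sum(controls)
    n_total = n_cases + n_controls
    genotype_totals = [r + s for r, s in zip(cases, controls)]

    # Score-weighted contrast between case and control genotype counts.
    u = sum(w * (n_controls * r - n_cases * s)
            for w, r, s in zip(scores, cases, controls))

    # Null variance of u, conditional on the table margins.
    sum_w2 = sum(w * w * n for w, n in zip(scores, genotype_totals))
    sum_w = sum(w * n for w, n in zip(scores, genotype_totals))
    var = n_cases * n_controls * (n_total * sum_w2 - sum_w ** 2) / n_total
    return u / math.sqrt(var)


def max3(cases, controls):
    """Largest absolute trend statistic over the three genetic models."""
    return max(abs(trend_z(cases, controls, x)) for x in (0.0, 0.5, 1.0))


# Hypothetical counts, for illustration only.
cases = (10, 20, 30)
controls = (30, 20, 10)
for label, x in (("recessive", 0.0), ("additive", 0.5), ("dominant", 1.0)):
    print(label, round(trend_z(cases, controls, x), 3))
print("MAX3", round(max3(cases, controls), 3))
```

Because the three trend statistics are correlated, MAX3 is not standard normal under the null hypothesis; its P-value has to be obtained by simulation or by an approximation that accounts for the correlation, which is the topic of several entries in the reference list (for example [3], [20] and [39]).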

62P10 Applications of statistics to biology and medical sciences; meta analysis
62G10 Nonparametric hypothesis testing
92D10 Genetics and epigenetics
62G35 Nonparametric robustness
[1] Andrews, D. F., Bickel, P. J., Hampel, F. R., Huber, P. J., Rogers, W. H. and Tukey, J. W. (1965). Robust Estimation of Location. Princeton Univ. Press, Princeton, NJ. · Zbl 0254.62001
[2] Balding, D. J. (2006). A tutorial on statistical methods for population association studies. Nat. Rev. Genet. 7 781-791.
[3] Conneely, K. N. and Boehnke, M. (2007). So many correlated tests, so little time! Rapid adjustment of P values for multiple correlated tests. Am. J. Hum. Genet. 81 1158-1168.
[4] Davies, R. B. (1977). Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64 247-254. · Zbl 0362.62026
[5] Davies, R. B. (1987). Hypothesis-testing when a nuisance parameter is present only under the alternative. Biometrika 74 33-43. · Zbl 0612.62023
[6] Elston, R. C., Lin, D. Y. and Zheng, G. (2007). Multistage sampling for genetic studies. Ann. Rev. Gen. Hum. Genet. 8 327-342.
[7] Freidlin, B., Zheng, G., Li, Z. and Gastwirth, J. L. (2002). Trend tests for case-control studies of genetic markers: Power, sample size and robustness. Hum. Hered. 53 146-152 (Erratum 68 (2009) 220).
[8] Gail, M. H., Pfeiffer, R. M., Wheeler, W. and Pee, D. (2008). Probability of detecting disease-associated single nucleotide polymorphisms in case-control genome-wide association studies. Biostatistics 9 201-215. · Zbl 1143.62348
[9] Gastwirth, J. L. (1966). On robust procedures. J. Amer. Statist. Assoc. 61 929-948. · Zbl 0144.19004
[10] Gastwirth, J. L. (1985). The use of maximin efficiency robust tests in combining contingency tables and survival analysis. J. Amer. Statist. Assoc. 80 380-384. · Zbl 0573.62042
[11] González, J. R., Carrasco, J. L., Dudbridge, F., Armengol, L., Estivill, X. and Moreno, V. (2008). Maximizing association statistics over genetic models. Genet. Epidemiol. 32 246-254.
[12] Guedj, M., Nuel, G. and Prum, B. (2008). A note on allelic tests in case-control association studies. Ann. Hum. Genet. 72 407-409.
[13] Hanson, R. L., Looker, H. C., Ma, L., Muller, Y. L., Baier, L. J. and Knowler, W. C. (2006). Design and analysis of genetic association studies to finely map a locus identified by linkage analysis: Sample size and power calculations. Ann. Hum. Genet. 70 332-349.
[14] Hoh, J. and Ott, J. (2003). Mathematical multi-locus approaches to localizing complex human trait genes. Nat. Rev. Genet. 4 701-709.
[15] Joo, J., Kwak, M., Ahn, K. and Zheng, G. (2009). A robust genome-wide scan statistic of the Wellcome Trust Case-Control Consortium. Biometrics 65 1115-1122. · Zbl 1180.62171
[16] Klein, R. J., Zeiss, C., Chew, E. Y., Tsai, J. Y., Sackler, R. S., Haynes, C., Henning, A. K., SanGiovanni, J. P., Mane, S. M., Mayne, S. T., Bracken, M. B., Ferris, F. L., Ott, J., Barnstable, C. and Hoh, J. (2005). Complement factor H polymorphism in age-related macular degeneration. Science 308 385-389.
[17] Kraft, P., Zeggini, E. and Ioannidis, J. P. A. (2009). Replication in genome-wide association studies. Statist. Sci. 24 561-573. · Zbl 1329.62429
[18] Lettre, G., Lange, C. and Hirschhorn, J. N. (2007). Genetic model testing and statistical power in population-based association studies of quantitative traits. Genet. Epidemiol. 31 358-362.
[19] Lewontin, R. C. (1964). The interaction of selection and linkage. I. General considerations; heterotic models. Genetics 49 49-67.
[20] Li, Q., Zheng, G., Li, Z. and Yu, K. (2008a). Efficient approximation of P-value of the maximum of correlated tests, with applications to genome-wide association studies. Ann. Hum. Genet. 72 397-406.
[21] Li, Q., Yu, K., Li, Z. and Zheng, G. (2008b). MAX-rank: A simple and robust genome-wide scan for case-control association studies. Hum. Genet. 123 617-623.
[22] Marchini, J., Donnelly, P. and Cardon, L. R. (2005). Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat. Genet. 37 413-417.
[23] Nielsen, D. M., Ehm, M. G. and Weir, B. S. (1998). Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus. Am. J. Hum. Genet. 63 1531-1540.
[24] Nielsen, D. M. and Weir, B. S. (1999). A classical setting for associations between markers and loci affecting quantitative traits. Genet. Res. 74 271-277.
[25] Pfeiffer, R. M., Gail, M. H. and Pee, D. (2009). On combining data from genome-wide association studies to discover disease-associated SNPs. Statist. Sci. 24 547-560. · Zbl 1329.62434
[26] Roeder, K. and Wasserman, L. (2009). Genome-wide significance levels and weighted hypothesis testing. Statist. Sci. 24 398-413. · Zbl 1329.62435
[27] Sasieni, P. D. (1997). From genotypes to genes: Doubling the sample size. Biometrics 53 1253-1261. · Zbl 0931.62099
[28] Schaid, D. J., McDonnell, S. K., Hebbring, S. J., Cunningham, J. M. and Thibodeau, S. N. (2005). Nonparametric tests of association of multiple genes with human diseases. Am. J. Hum. Genet. 76 780-793.
[29] Sladek, R., Rocheleau, G., Rung, J., Dina, C., Shen, L., Serre, D., Boutin, P., Vincent, D., Belisle, A., Hadjadj, S., Balkau, B., Heude, B. et al. (2007). A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature 445 881-885.
[30] Song, K. and Elston, R. C. (2006). A powerful method of combining measures of association and Hardy-Weinberg disequilibrium for fine-mapping in case-control studies. Stat. Med. 25 105-126.
[31] Thomas, D. C., Casey, G., Conti, D., Haile, R. W., Lewinger, J. P. and Stram, D. O. (2009). Methodological issues in multistage genome-wide association studies. Statist. Sci. 24 414-429. · Zbl 1329.62439
[32] Tukey, J. W. (1965). Which part of the sample contains the information? Proc. Natl. Acad. Sci. USA 53 127-134. · Zbl 0168.40205
[33] Van Steen, K., McQueen, M. B., Herbert, A., Raby, B., Lyon, H., DeMeo, D. L., Murphy, A., Su, J., Datta, S., Rosenow, C., Christman, M., Silverman, E. K., Laird, N. M., Weiss, S. T. and Lange, C. (2005). Genomic screening and replication using the same data set in family-based association testing. Nat. Genet. 37 683-691.
[34] Wang, K. and Sheffield, V. C. (2005). A constrained-likelihood approach to marker-trait association studies. Am. J. Hum. Genet. 77 768-780.
[35] Weir, B. S. (1996). Genetic Data Analysis II: Methods for Discrete Population Genetic Data. Sinauer, Sunderland, MA.
[36] Wittke-Thompson, J. K., Pluzhnikov, A. and Cox, N. J. (2005). Rational inferences about departure from Hardy-Weinberg equilibrium. Am. J. Hum. Genet. 76 967-986.
[37] The Wellcome Trust Case Control Consortium (WTCCC) (2007). Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447 661-678.
[38] Yamada, R. and Okada, Y. (2009). An optimal dose-effect mode trend test for SNP genotype tables. Genet. Epidemiol. 33 114-127.
[39] Zang, Y., Fung, W. K. and Zheng, G. (2010). Simple algorithms to calculate asymptotic null distributions for robust tests in case-control genetic association studies in R. J. Stat. Software 33 1-24.
[40] Zaykin, D. V. and Nielsen, D. M. (2000). Hardy-Weinberg disequilibrium (HWD) fine mapping for case-control samples. Am. J. Hum. Genet. 67 1238 Suppl.
[41] Zaykin, D. V. and Zhivotovsky, L. A. (2005). Ranks of genuine associations in whole-genome scans. Genetics 171 813-823.
[42] Zheng, G. and Ng, H. K. T. (2008). Genetic model selection in two-phase analysis for case-control association studies. Biostatistics 9 391-399. · Zbl 1143.62088
[43] Zheng, G., Joo, J. and Yang, Y. (2009). Pearson’s test, trend tests and MAX are all trend tests with different types of scores. Ann. Hum. Genet. 73 133-140.
[44] Zheng, G., Song, K. and Elston, R. C. (2007). Adaptive two-stage analysis of genetic association in case-control designs. Hum. Hered. 63 175-186.
[45] Zheng, G., Meyer, M., Li, W. and Yang, Y. (2008). Comparison of two-phase analyses for case-control genetic association studies. Stat. Med. 27 5054-5075.
[46] Zheng, G., Joo, J., Tian, X., Wu, C. O., Lin, J.-P., Stylianou, M., Waclawiw, M. A. and Geller, N. L. (2009). Robust genome-wide scans with genetic model selection using case-control design. Stat. Interface (A special issue in honor of Joseph Gastwirth) 2 145-151. · Zbl 1245.62167
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.