zbMATH — the first resource for mathematics

AGGrEGATOr: A Gene-based GEne-Gene interActTiOn test for case-control association studies. (English) Zbl 1343.92010
Summary: Among the large number of statistical methods that have been proposed to identify gene-gene interactions in case-control genome-wide association studies (GWAS), gene-based methods have recently grown in popularity as they confer advantages in both statistical power and biological interpretation. All of the gene-based methods jointly model the distribution of single nucleotide polymorphism (SNP) sets prior to the statistical test, leading to limited power to detect sums of SNP-SNP signals. In this paper, we instead propose a gene-based method that first performs SNP-SNP interaction tests before aggregating the obtained \(p\)-values into a test at the gene level. Our method, called AGGrEGATOr, is based on a minP procedure that tests the significance of the minimum of a set of \(p\)-values. We use simulations to assess the capacity of AGGrEGATOr to correctly control the type-I error. The benefits of our approach in terms of statistical power and robustness to SNP set characteristics are evaluated in a wide range of disease models by comparing it to previous methods. We also apply our method to detect gene pairs associated with rheumatoid arthritis (RA) on the GSE39428 dataset. We identify 13 potential gene-gene interactions and replicate one gene pair in the Wellcome Trust Case Control Consortium dataset at the 5% level. We further test 15 gene pairs, previously reported as being statistically associated with RA, Crohn’s disease (CD) or coronary artery disease (CAD), for replication in the Wellcome Trust Case Control Consortium dataset. We show that AGGrEGATOr is the only method able to successfully replicate seven gene pairs.
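The minP idea in the summary, testing the significance of the smallest of a set of \(p\)-values, can be illustrated with a minimal Monte Carlo sketch. This is not the authors' implementation. It assumes that each SNP-SNP test yields a two-sided \(p\)-value from a statistic that is standard normal under the null, and that the correlation matrix of those statistics is known. The names `cholesky` and `minp_gene_level` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def minp_gene_level(p_values, corr, n_draws=50_000, seed=0):
    """Monte Carlo estimate of P(min p-value <= observed minimum) under the null.

    Assumption (not from the source): the test statistics are jointly normal
    with correlation matrix `corr`, and the p-values are two-sided.
    """
    m = len(p_values)
    p_min = min(p_values)
    # |Z| threshold corresponding to the smallest observed two-sided p-value.
    q = NormalDist().inv_cdf(1.0 - p_min / 2.0)
    low = cholesky(corr)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(m)]
        # z = L e has correlation matrix corr; count draws where any |z_i| >= q.
        if any(abs(sum(low[i][k] * e[k] for k in range(i + 1))) >= q
               for i in range(m)):
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_draws + 1)
```

For independent tests the estimate should be close to the Šidák value \(1 - (1 - p_{\min})^m\), while positively correlated tests give a smaller gene-level \(p\)-value, because fewer effectively independent tests are being corrected for.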
92B15 General biostatistics
62P10 Applications of statistics to biology and medical sciences; meta analysis
92D10 Genetics and epigenetics