Bayesian methods for genetic association analysis with heterogeneous subgroups: from meta-analyses to gene-environment interactions. (English) Zbl 1454.62418

Summary: Genetic association analyses often involve data from multiple potentially heterogeneous subgroups. The expected amount of heterogeneity can vary from modest (e.g., a typical meta-analysis) to large (e.g., a strong gene-environment interaction). However, existing statistical tools are limited in their ability to address such heterogeneity. Indeed, most genetic association meta-analyses use a “fixed effects” analysis, which assumes no heterogeneity. Here we develop and apply Bayesian association methods to address this problem. These methods are easy to apply (in the simplest case, requiring only a point estimate for the genetic effect and its standard error, from each subgroup) and effectively include standard frequentist meta-analysis methods, including the usual “fixed effects” analysis, as special cases. We apply these tools to two large genetic association studies: the first is a meta-analysis of genome-wide association studies from the Global Lipids consortium, and the second is a cross-population analysis for expression quantitative trait loci (eQTLs). In the Global Lipids data we find, perhaps surprisingly, that effects are generally quite homogeneous across studies. In the eQTL study we find that eQTLs are generally shared among different continental groups, and we discuss the consequences of this for study design.
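The summary notes that, in the simplest case, the inputs are just a point estimate and a standard error from each subgroup, and that the usual “fixed effects” analysis arises as a special case. As a rough illustration of what such summary-level inputs allow, the Python sketch below computes the standard inverse-variance fixed-effects estimate, Cochran's Q heterogeneity statistic, and a Wakefield-style approximate Bayes factor for the pooled estimate. This is not the authors' method: the function names, the prior standard deviation `omega`, and the choice of a normal prior on a single shared effect are all assumptions made for the example.

```python
import math


def fixed_effects(betas, ses):
    """Inverse-variance fixed-effects pooling of per-subgroup estimates.

    betas: effect estimates, one per subgroup
    ses:   their standard errors
    Returns (pooled estimate, pooled standard error, z-score).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se, pooled / pooled_se


def cochran_q(betas, ses):
    """Cochran's Q: weighted squared deviations from the pooled estimate."""
    pooled, _, _ = fixed_effects(betas, ses)
    return sum((b - pooled) ** 2 / se ** 2 for b, se in zip(betas, ses))


def approx_bayes_factor(betas, ses, omega=0.2):
    """Wakefield-style approximate Bayes factor (alternative vs. null).

    Assumes one shared effect with prior N(0, omega^2); omega is a
    hypothetical value chosen for illustration, not taken from the paper.
    """
    _, pooled_se, z = fixed_effects(betas, ses)
    v = pooled_se ** 2   # sampling variance of the pooled estimate
    w = omega ** 2       # prior variance of the shared effect
    shrink = w / (v + w)
    return math.sqrt(1.0 - shrink) * math.exp(0.5 * z ** 2 * shrink)


if __name__ == "__main__":
    # Hypothetical summary statistics from two subgroups.
    betas, ses = [0.1, 0.2], [0.1, 0.1]
    print(fixed_effects(betas, ses))
    print(cochran_q(betas, ses))
    print(approx_bayes_factor(betas, ses))
```

A large Q relative to the number of subgroups minus one suggests heterogeneity that a fixed-effects analysis ignores, which is the situation the methods described in the summary are designed to handle.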

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62F03 Parametric hypothesis testing
62F15 Bayesian inference
92D10 Genetics and epigenetics

Software:

METASOFT; METAL
