Causal graphical models in systems genetics: a unified framework for joint inference of causal network and genetic architecture for correlated phenotypes. (English) Zbl 1189.62172

Summary: Causal inference approaches in systems genetics exploit quantitative trait loci (QTL) genotypes to infer causal relationships among phenotypes. The genetic architecture of each phenotype may be complex, and poorly estimated genetic architectures may compromise the inference of causal relationships among phenotypes. Existing methods either assume that QTLs are known or infer them without regard to the phenotype network structure. We develop a QTL-driven phenotype network method (QTLnet) to jointly infer a causal phenotype network and associated genetic architecture for sets of correlated phenotypes. Randomization of alleles during meiosis and the unidirectional influence of genotype on phenotype allow the inference of QTLs causal to phenotypes. Causal relationships among phenotypes can be inferred using these QTL nodes, enabling us to distinguish among phenotype networks that would otherwise be distribution equivalent. We jointly model phenotypes and QTLs using homogeneous conditional Gaussian regression models, and we derive a graphical criterion for distribution equivalence. We validate the QTLnet approach in a simulation study. Finally, we illustrate with simulated data and a real example how QTLnet can be used to infer both direct and indirect effects of QTLs and phenotypes that co-map to a genomic region.
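The role of QTL nodes in breaking distribution equivalence can be illustrated with a small simulation. The following is a minimal sketch, not the QTLnet procedure itself: it assumes a single QTL with an additive genotype coded as -1, 0, 1, two phenotypes generated as Q -> Y1 -> Y2, and arbitrary effect sizes, and it compares maximized Gaussian regression log-likelihoods of two candidate graphs. All names (`gaussian_loglik`, `simulate`, the effect sizes) are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_loglik(y, predictors):
    """Maximized log-likelihood of a linear Gaussian regression of y on predictors."""
    n = len(y)
    # Design matrix: intercept plus one column per parent node.
    X = np.column_stack([np.ones(n)] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = np.sum((y - X @ beta) ** 2) / n  # ML estimate of residual variance
    return -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)


def simulate(n=500, seed=0):
    """Simulate Q -> Y1 -> Y2 (assumed effect sizes, F2-like genotype frequencies)."""
    rng = np.random.default_rng(seed)
    q = rng.choice([-1.0, 0.0, 1.0], size=n, p=[0.25, 0.5, 0.25])
    y1 = 1.0 * q + rng.normal(size=n)
    y2 = 0.8 * y1 + rng.normal(size=n)
    return q, y1, y2


q, y1, y2 = simulate()

# Without the QTL node, Y1 -> Y2 and Y2 -> Y1 fit the data equally well.
forward_no_qtl = gaussian_loglik(y1, []) + gaussian_loglik(y2, [y1])
reverse_no_qtl = gaussian_loglik(y2, []) + gaussian_loglik(y1, [y2])

# With the QTL node as a parent of Y1, the two directions imply different
# conditional independencies and no longer have the same likelihood.
forward_qtl = gaussian_loglik(y1, [q]) + gaussian_loglik(y2, [y1])      # Q -> Y1 -> Y2
reverse_qtl = gaussian_loglik(y2, []) + gaussian_loglik(y1, [q, y2])    # Q -> Y1 <- Y2

print("no QTL  : forward - reverse =", forward_no_qtl - reverse_no_qtl)
print("with QTL: forward - reverse =", forward_qtl - reverse_qtl)
```

In this sketch both graphs that include the QTL have the same number of parameters, so the likelihood gap reflects the differing independence structure (Y2 independent of Q given Y1 in the forward graph, Y2 marginally independent of Q in the reverse graph) rather than model size.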


62P10 Applications of statistics to biology and medical sciences; meta analysis
92D10 Genetics and epigenetics
05C90 Applications of graph theory
65C60 Computational problems in statistics (MSC2010)
62J99 Linear inference, regression



