Objective Bayes model selection of Gaussian interventional essential graphs for the identification of signaling pathways. (English) Zbl 1435.62435

Summary: A signalling pathway is a sequence of chemical reactions initiated by a stimulus which in turn affects a receptor, and then through some intermediate steps cascades down to the final cell response. Based on the technique of flow cytometry, samples of cell-by-cell measurements are collected under each experimental condition, resulting in a collection of interventional data (assuming no latent variables are involved). Usually several external interventions are applied at different points of the pathway, the ultimate aim being the structural recovery of the underlying signalling network, which we model as a causal Directed Acyclic Graph (DAG) using intervention calculus. The advantage of using interventional data, rather than purely observational data, is that identifiability of the true data-generating DAG is enhanced. More technically, a Markov equivalence class of DAGs, whose members are statistically indistinguishable based on observational data alone, can be further decomposed, using additional interventional data, into smaller distinct Interventional Markov equivalence classes. We present a Bayesian methodology for structural learning of Interventional Markov equivalence classes based on observational and interventional samples of multivariate Gaussian observations. Our approach is objective, meaning that it is based on default parameter priors requiring no personal elicitation; some flexibility is, however, allowed through a tuning parameter which regulates sparsity in the prior on model space. Based on an analytical expression for the marginal likelihood of a given Interventional Essential Graph, and a suitable MCMC scheme, our analysis produces an approximate posterior distribution on the space of Interventional Markov equivalence classes, which can be used to provide uncertainty quantification for features of substantive scientific interest, such as the posterior probability of inclusion of selected edges, or paths.
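
To illustrate how interventional data can split a Markov equivalence class, the following minimal Python sketch checks equivalence of small DAGs. It is not the authors' method or code. It assumes the graphical criterion of Hauser and Bühlmann (2012): for a family of intervention targets that includes the empty (observational) target, two DAGs are interventionally Markov equivalent when they share the same skeleton and v-structures and, for every target, the graphs obtained by deleting the edges pointing into the intervened nodes share the same skeleton. All function names and the three-node example are hypothetical.

```python
from itertools import combinations

# A DAG is represented as a set of directed edges (parent, child).

def skeleton(edges):
    """Undirected version of the edge set."""
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """Triples (a, child, b) with a -> child <- b and a, b non-adjacent."""
    skel = skeleton(edges)
    parents = {}
    for a, b in edges:
        parents.setdefault(b, set()).add(a)
    found = set()
    for child, pa in parents.items():
        for a, b in combinations(sorted(pa), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def intervene(edges, target):
    """Intervention graph: drop every edge pointing into a target node."""
    return {(a, b) for a, b in edges if b not in target}

def markov_equivalent(d1, d2):
    """Observational criterion: same skeleton and same v-structures."""
    return skeleton(d1) == skeleton(d2) and v_structures(d1) == v_structures(d2)

def interventional_markov_equivalent(d1, d2, targets):
    """Assumed criterion for a target family containing the empty target."""
    return markov_equivalent(d1, d2) and all(
        skeleton(intervene(d1, t)) == skeleton(intervene(d2, t))
        for t in targets
    )

# Three DAGs on X, Y, Z that are indistinguishable observationally.
chain = {("X", "Y"), ("Y", "Z")}      # X -> Y -> Z
reverse = {("Z", "Y"), ("Y", "X")}    # X <- Y <- Z
fork = {("Y", "X"), ("Y", "Z")}       # X <- Y -> Z

obs_only = [set()]
with_x = [set(), {"X"}]               # observational data plus intervention on X
with_y = [set(), {"Y"}]               # observational data plus intervention on Y

print(interventional_markov_equivalent(chain, fork, obs_only))   # True
print(interventional_markov_equivalent(chain, fork, with_x))     # False
print(interventional_markov_equivalent(reverse, fork, with_x))   # True
print(interventional_markov_equivalent(reverse, fork, with_y))   # False
```

In this toy case, intervening on X separates the chain from the other two graphs but leaves the reversed chain and the fork in one class, while intervening on Y separates all three. This is the sense in which the choice of intervention targets governs how finely the observational class is decomposed.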


62P30 Applications of statistics in engineering and industry; control charts
62H22 Probabilistic graphical models
62P10 Applications of statistics to biology and medical sciences; meta analysis
62H12 Estimation in multivariate analysis


glasso; TETRAD

