## Nested Markov properties for acyclic directed mixed graphs
*(English)*
Zbl 07684015

Summary: Conditional independence models associated with directed acyclic graphs (DAGs) may be characterized in at least three different ways: via a factorization, the global Markov property (given by the d-separation criterion), and the local Markov property. Marginals of DAG models also imply equality constraints that are not conditional independences; the well-known “Verma constraint” is an example. Constraints of this type are used for testing edges, and in a computationally efficient marginalization scheme via variable elimination.

We show that equality constraints like the “Verma constraint” can be viewed as conditional independences in kernel objects obtained from joint distributions via a fixing operation that generalizes conditioning and marginalization. We use these constraints to define, via ordered local and global Markov properties, and a factorization, a graphical model associated with acyclic directed mixed graphs (ADMGs). We prove that marginal distributions of DAG models lie in this model, and that a set of these constraints given by Tian provides an alternative definition of the model. Finally, we show that the fixing operation used to define the model leads to a particularly simple characterization of identifiable causal effects in hidden variable causal DAG models.
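As a rough numerical illustration of the kind of constraint meant here, the sketch below checks the classical form of the Verma constraint on one hidden-variable DAG. The graph (A → B → C → D with a hidden U pointing into B and D), the binary state spaces, the random tables and all variable names are assumptions chosen for this example and are not taken from the paper. In this graph the quantity q(d | a, c) = Σ_b p(d | a, b, c) p(b | a) should not depend on a, even though D is in general not conditionally independent of A given C.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_cpt(*shape):
    """Random strictly positive table whose last axis sums to one."""
    table = rng.random(shape) + 0.1
    return table / table.sum(axis=-1, keepdims=True)


# Assumed hidden-variable DAG: A -> B -> C -> D, with hidden U -> B and U -> D.
p_u = random_cpt(2)        # p(u)
p_a = random_cpt(2)        # p(a)
p_b = random_cpt(2, 2, 2)  # p(b | a, u), indexed [a, u, b]
p_c = random_cpt(2, 2)     # p(c | b),    indexed [b, c]
p_d = random_cpt(2, 2, 2)  # p(d | c, u), indexed [c, u, d]

# Observed margin p(a, b, c, d), obtained by summing out the hidden u.
joint = np.einsum("u,a,aub,bc,cud->abcd", p_u, p_a, p_b, p_c, p_d)

# Conditionals computed from the observed margin only.
p_abc = joint.sum(axis=3)                              # p(a, b, c)
p_d_given_abc = joint / p_abc[..., None]               # p(d | a, b, c)
p_ab = joint.sum(axis=(2, 3))                          # p(a, b)
p_b_given_a = p_ab / p_ab.sum(axis=1, keepdims=True)   # p(b | a)

# q(d | a, c) = sum_b p(d | a, b, c) p(b | a), indexed [a, c, d].
q = np.einsum("abcd,ab->acd", p_d_given_abc, p_b_given_a)

# Verma constraint: q should be the same for a = 0 and a = 1.
verma_gap = float(np.abs(q[0] - q[1]).max())

# For contrast: the ordinary conditional p(d | a, c) generally varies with a.
p_acd = joint.sum(axis=1)                              # p(a, c, d)
p_d_given_ac = p_acd / p_acd.sum(axis=2, keepdims=True)
ci_gap = float(np.abs(p_d_given_ac[0] - p_d_given_ac[1]).max())

print(f"Verma gap: {verma_gap:.2e}")
print(f"gap for D independent of A given C: {ci_gap:.2e}")
```

The first gap should vanish up to floating-point error for any choice of tables, since the sum over b collapses to Σ_u p(u) p(d | c, u); the second is generically nonzero because conditioning on C opens the path through the collider at B.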

### Software:

TETRAD

*T. S. Richardson* et al., Ann. Stat. 51, No. 1, 334-361 (2023; Zbl 07684015)

### References:

[1] | ALI, R. A., RICHARDSON, T. S. and SPIRTES, P. (2009). Markov equivalence for ancestral graphs. Ann. Statist. 37 2808-2837. · Zbl 1178.68574 · doi:10.1214/08-AOS626 |

[2] | ALLMAN, E. S., RHODES, J. A., STANGHELLINI, E. and VALTORTA, M. (2015). Parameter identifiability of discrete Bayesian networks with hidden variables. J. Causal Inference 3 189-205. · doi:10.1515/jci-2014-0021 |

[3] | BELL, J. S. (1964). On the Einstein Podolsky Rosen paradox. Phys. Phys. Fiz. 1 195-200. · doi:10.1103/PhysicsPhysiqueFizika.1.195 |

[4] | BHATTACHARYA, R., NAGARAJAN, T., MALINSKY, D. and SHPITSER, I. (2021). Differentiable causal discovery under unmeasured confounding. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2021). |

[5] | BONET, B. (2001). Instrumentality tests revisited. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence 48-55. Morgan Kaufmann Publishers Inc. |

[6] | CANIGLIA, E. C., ROBINS, J. M., CAIN, L. E. et al. (2019). Emulating a trial of joint dynamic strategies: An application to monitoring and treatment of HIV-positive individuals. Stat. Med. 38 2428-2446. · doi:10.1002/sim.8120 |

[7] | CLAUSER, J. F., HORNE, M. A., SHIMONY, A. and HOLT, R. A. (1969). Proposed experiment to test local hidden-variable theories. Phys. Rev. Lett. 23 880. · Zbl 1371.81014 |

[8] | CONSTANTINOU, P. (2013). Conditional Independence and Applications in Statistical Causality. Ph.D. thesis, Dept. Pure Mathematics and Mathematical Statistics, Univ. Cambridge. |

[9] | Dawid, A. P. (1979). Conditional independence in statistical theory. J. Roy. Statist. Soc. Ser. B 41 1-31. · Zbl 0408.62004 |

[10] | DAWID, A. (2002). Influence diagrams for causal modelling and inference. Int. Stat. Rev. 70 161-189. · Zbl 1215.62002 |

[11] | EVANS, R. J. (2012). Graphical methods for inequality constraints in marginalized DAGs. In Machine Learning for Signal Processing (MLSP). |

[12] | Evans, R. J. (2016). Graphs for margins of Bayesian networks. Scand. J. Stat. 43 625-648. · Zbl 1468.62300 · doi:10.1111/sjos.12194 |

[13] | Evans, R. J. (2018). Margins of discrete Bayesian networks. Ann. Statist. 46 2623-2656. · Zbl 1408.62044 · doi:10.1214/17-AOS1631 |

[14] | EVANS, R. J. and RICHARDSON, T. S. (2010). Maximum likelihood fitting of acyclic directed mixed graphs to binary data. In Proceedings of the Twenty Sixth Conference on Uncertainty in Artificial Intelligence 26. |

[15] | EVANS, R. J. and RICHARDSON, T. S. (2019). Smooth, identifiable supermodels of discrete DAG models with latent variables. Bernoulli 25 848-876. · Zbl 1431.62230 · doi:10.3150/17-bej1005 |

[16] | HUANG, Y. and VALTORTA, M. (2006). Pearl’s calculus of interventions is complete. In Twenty Second Conference on Uncertainty in Artificial Intelligence 217-224. |

[17] | KÉDAGNI, D. and MOURIFIÉ, I. (2020). Generalized instrumental inequalities: Testing the instrumental variable independence assumption. Biometrika 107 661-675. · Zbl 1451.62046 · doi:10.1093/biomet/asaa003 |

[18] | KREIF, N., SOFRYGIN, O., SCHMITTDIEL, J. A., ADAMS, A. S., GRANT, R. W., ZHU, Z., VAN DER LAAN, M. J. and NEUGEBAUER, R. (2021). Exploiting nonsystematic covariate monitoring to broaden the scope of evidence about the causal effects of adaptive treatment strategies. Biometrics 77 329-342. · Zbl 1520.62252 · doi:10.1111/biom.13271 |

[19] | Lauritzen, S. L. (1996). Graphical Models. Oxford Statistical Science Series 17. The Clarendon Press, Oxford University Press, New York. · Zbl 0907.62001 |

[20] | NAVASCUÉS, M. and WOLFE, E. (2020). The inflation technique completely solves the causal compatibility problem. J. Causal Inference 8 70-91. · doi:10.1515/jci-2017-0020 |

[21] | NEUGEBAUER, R., SCHMITTDIEL, J. A., ADAMS, A. S., GRANT, R. W. and VAN DER LAAN, M. J. (2017). Identification of the joint effect of a dynamic treatment intervention and a stochastic monitoring intervention under the no direct effect assumption. J. Causal Inference 5 Art. No. 20160015. · doi:10.1515/jci-2016-0015 |

[22] | Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. The Morgan Kaufmann Series in Representation and Reasoning. Morgan Kaufmann, San Mateo, CA. |

[23] | PEARL, J. (1995). On the testability of causal models with latent and instrumental variables. In Uncertainty in Artificial Intelligence (Montreal, PQ, 1995) 435-443. Morgan Kaufmann, San Francisco, CA. |

[24] | Pearl, J. (2000). Causality: Models, reasoning, and inference. Cambridge Univ. Press, Cambridge. · Zbl 0959.68116 |

[25] | PEARL, J. and VERMA, T. S. (1991). A theory of inferred causation. In Principles of Knowledge Representation and Reasoning (Cambridge, MA, 1991). Morgan Kaufmann Ser. Represent. Reason. 441-452. Morgan Kaufmann, San Mateo, CA. · Zbl 0765.68177 |

[26] | PERKOVIĆ, E., TEXTOR, J., KALISCH, M. and MAATHUIS, M. H. (2015). A complete generalized adjustment criterion. In Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence (UAI). |

[27] | Richardson, T. (2003). Markov properties for acyclic directed mixed graphs. Scand. J. Stat. 30 145-157. · Zbl 1035.60005 · doi:10.1111/1467-9469.00323 |

[28] | Richardson, T. and Spirtes, P. (2002). Ancestral graph Markov models. Ann. Statist. 30 962-1030. · Zbl 1033.60008 · doi:10.1214/aos/1031689015 |

[29] | RICHARDSON, T. S., EVANS, R. J., ROBINS, J. M. and SHPITSER, I. (2023). Supplement to “Nested Markov properties for acyclic directed mixed graphs.” https://doi.org/10.1214/22-AOS2253SUPP |

[30] | ROBINS, J. (1986). A new approach to causal inference in mortality studies with sustained exposure periods—application to control of the healthy worker survivor effect. Math. Model. 7 1393-1512. · Zbl 0614.62136 |

[31] | ROBINS, J. M. (1999). Testing and estimation of direct effects by reparameterizing directed acyclic graphs with structural nested models. In Computation, Causation, and Discovery (C. Glymour and G. Cooper, eds.) 349-405. AAAI Press, Menlo Park, CA. |

[32] | ROBINS, J. M. and WASSERMAN, L. (1997). Estimation of effects of sequential treatments by reparameterizing directed acyclic graphs. In Proceedings of the 13th Conference on Uncertainty in Artificial Intelligence 409-420. Morgan Kaufmann, San Mateo, CA. |

[33] | SADEGHI, K. and LAURITZEN, S. (2014). Markov properties for mixed graphs. Bernoulli 20 676-696. · Zbl 1303.60064 · doi:10.3150/12-BEJ502 |

[34] | SHPITSER, I., EVANS, R. J. and RICHARDSON, T. S. (2018). Acyclic linear SEMs obey the nested Markov property. In Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI). |

[35] | SHPITSER, I. and PEARL, J. (2006). Identification of joint interventional distributions in recursive semi-Markovian causal models. In Twenty-First National Conference on Artificial Intelligence 2 1219-1226. AAAI Press, Washington, DC. |

[36] | SHPITSER, I. and PEARL, J. (2008). Dormant independence. In Proceedings of the Twenty Third Conference on Artificial Intelligence (AAAI 2008) 1081-1087. AAAI Press, Washington, DC. |

[37] | SHPITSER, I., RICHARDSON, T. S. and ROBINS, J. M. (2009). Testing edges by truncations. In International Joint Conference on Artificial Intelligence 21 1957-1963. |

[38] | SHPITSER, I., RICHARDSON, T. S. and ROBINS, J. M. (2011). An efficient algorithm for computing interventional distributions in latent variable causal models. In 27th Conference on Uncertainty in Artificial Intelligence (UAI-11) AUAI Press. |

[39] | SHPITSER, I., EVANS, R. J., RICHARDSON, T. S. and ROBINS, J. (2014). An introduction to nested Markov models. Behaviormetrika 41 3-39. |

[40] | SPIRTES, P., GLYMOUR, C. and SCHEINES, R. (1993). Causation, Prediction, and Search. Lecture Notes in Statistics 81. Springer, New York. · Zbl 0806.62001 · doi:10.1007/978-1-4612-2748-9 |

[41] | STROTZ, R. H. and WOLD, H. O. A. (1960). Recursive vs. nonrecursive systems: An attempt at synthesis. Econometrica 28 417-427. · doi:10.2307/1907731 |

[42] | STUDENÝ, M. (1992). Conditional independence relations have no finite complete characterization. In Information Theory, Statistical Decision Functions and Random Processes. Transactions of the 11th Prague Conference Vol. B 377-396. Kluwer, Dordrecht. · Zbl 0764.60004 |

[43] | TIAN, J. and PEARL, J. (2002). On the testable implications of causal models with hidden variables. In Proceedings of UAI-02 519-527. |

[44] | VERMA, T. S. and PEARL, J. (1990). Equivalence and synthesis of causal models. Technical Report R-150, Department of Computer Science, Univ. California, Los Angeles. |

[45] | WERMUTH, N. (2011). Probability distributions with summary graph structure. Bernoulli 17 845-879. · Zbl 1245.62062 · doi:10.3150/10-BEJ309 |

[46] | WERMUTH, N. and COX, D. R. (2008). Distortion of effects caused by indirect confounding. Biometrika 95 17-33. · Zbl 1437.62654 · doi:10.1093/biomet/asm092 |

[47] | WERMUTH, N., COX, D. and PEARL, J. (1996). Explanations for multivariate structures derived from univariate recursive regressions. Ber. Stoch. Verw. Geb., Univ. Mainz 94 |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.