
Identifiability of Gaussian linear structural equation models with homogeneous and heterogeneous error variances. (English) Zbl 1485.62073

Summary: In this work, we consider the identifiability assumption of Gaussian linear structural equation models (SEMs), in which each variable is determined by a linear function of its parents plus a normally distributed error. It has been shown that linear Gaussian SEMs are fully identifiable if all error variances are equal or known. Extending this result, this work proves the identifiability of Gaussian SEMs with both homogeneous and heterogeneous unknown error variances. Our new identifiability assumption exploits not only the error variances but also the edge weights; hence, it is strictly milder than the assumptions underlying prior identifiability results. Based on the new assumption, we further provide a structure learning algorithm that is statistically consistent and computationally feasible. The proposed algorithm requires all relevant variables to be observed, but it assumes neither causal minimality nor faithfulness. We verify our theoretical findings through simulations and real multivariate data, and compare our algorithm to the state-of-the-art PC, GES and GDS algorithms.
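In the model considered here, each variable satisfies \(X_j = \sum_{k \in \mathrm{pa}(j)} \beta_{jk} X_k + \varepsilon_j\) with \(\varepsilon_j \sim N(0, \sigma_j^2)\). In the classical equal-variance case (\(\sigma_1^2 = \dots = \sigma_p^2\)), a causal ordering can be recovered by iteratively selecting the remaining variable with the smallest conditional variance given the variables already ordered. The following Python sketch illustrates only this well-known special case, not the authors' algorithm for heterogeneous variances; the chain graph, edge weights, and sample size are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Toy data from the chain X0 -> X1 -> X2 with homogeneous error
# variances (the classical identifiable case); weights are illustrative.
n = 50_000
e = rng.normal(size=(n, 3))          # sigma_j^2 = 1 for every node j
X = np.empty((n, 3))
X[:, 0] = e[:, 0]
X[:, 1] = 0.8 * X[:, 0] + e[:, 1]
X[:, 2] = -0.6 * X[:, 1] + e[:, 2]

def cond_var(X, j, S):
    """Residual variance of X_j after OLS regression on the columns in S."""
    if not S:
        return X[:, j].var()
    beta, *_ = np.linalg.lstsq(X[:, S], X[:, j], rcond=None)
    return (X[:, j] - X[:, S] @ beta).var()

# Forward selection: under equal error variances, a source among the
# remaining variables attains the smallest conditional variance given
# the already-ordered set.
order, remaining = [], list(range(3))
while remaining:
    j = min(remaining, key=lambda v: cond_var(X, v, order))
    order.append(j)
    remaining.remove(j)

print(order)  # [0, 1, 2], the true causal ordering

The paper's contribution is to relax the equal-variance requirement: its identifiability condition involves the error variances together with the edge weights, so the sketch above covers only the homogeneous special case.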

MSC:

62H22 Probabilistic graphical models
62-08 Computational methods for problems pertaining to statistics
68T05 Learning and adaptive systems in artificial intelligence

Software:

DirectLiNGAM; TETRAD; MIM

References:

[1] Bühlmann, P.; Peters, J.; Ernest, J., CAM: Causal additive models, high-dimensional order search and penalized regression, The Annals of Statistics, 42, 2526-2556 (2014) · Zbl 1309.62063 · doi:10.1214/14-AOS1260
[2] Chickering, D. M. (1996). Learning Bayesian networks is NP-complete. In Learning from data. Springer, pp. 121-130.
[3] Chickering, D. M., Optimal structure identification with greedy search, The Journal of Machine Learning Research, 3, 507-554 (2003) · Zbl 1084.68519
[4] Chickering, D. M., Geiger, D., & Heckerman, D., et al. (1994). Learning Bayesian networks is NP-hard. Technical Report. Citeseer. · Zbl 0831.68096
[5] Doya, K., Bayesian brain: Probabilistic approaches to neural coding (2007), Cambridge: MIT Press, Cambridge · Zbl 1137.91015
[6] Edwards, D., Introduction to graphical modelling (2012), New York: Springer, New York · Zbl 0952.62003
[7] Friedman, N.; Linial, M.; Nachman, I.; Pe’er, D., Using Bayesian networks to analyze expression data, Journal of Computational Biology, 7, 601-620 (2000) · doi:10.1089/106652700750050961
[8] Ghoshal, A., & Honorio, J. (2017). Learning identifiable Gaussian Bayesian networks in polynomial time and sample complexity. In Advances in Neural Information Processing Systems, pp. 6457-6466.
[9] Harary, F. (1973). New directions in the theory of graphs. Technical Report. Michigan Univ. Ann Arbor Dept. of Mathematics. · Zbl 0253.00004
[10] Hoyer, P. O., Janzing, D., Mooij, J. M., Peters, J., & Schölkopf, B. (2009). Nonlinear causal discovery with additive noise models. In Advances in Neural Information Processing Systems, pp. 689-696.
[11] Kalisch, M.; Bühlmann, P., Estimating high-dimensional directed acyclic graphs with the PC-algorithm, Journal of Machine Learning Research, 8, 613-636 (2007) · Zbl 1222.68229
[12] Kephart, J. O., & White, S. R. (1991). Directed-graph epidemiological models of computer viruses. In Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy, IEEE, pp. 343-359.
[13] Lauritzen, S. L., Graphical models (1996), Oxford: Oxford University Press, Oxford · Zbl 0907.62001
[14] Loh, P. L.; Bühlmann, P., High-dimensional learning of linear causal networks via inverse covariance estimation, The Journal of Machine Learning Research, 15, 3065-3105 (2014) · Zbl 1318.68148
[15] Mooij, J., Janzing, D., Peters, J., & Schölkopf, B. (2009). Regression by dependence minimization and its application to causal inference in additive noise models. In Proceedings of the 26th Annual International Conference on Machine Learning, ACM, pp. 745-752.
[16] Park, G., & Park, H. (2019). Identifiability of generalized hypergeometric distribution (GHD) directed acyclic graphical models. In Proceedings of Machine Learning Research, PMLR, pp. 158-166.
[17] Park, G., & Raskutti, G. (2015). Learning large-scale Poisson DAG models based on overdispersion scoring. In Advances in Neural Information Processing Systems, pp. 631-639.
[18] Park, G.; Raskutti, G., Learning quadratic variance function (QVF) DAG models via overdispersion scoring (ODS), Journal of Machine Learning Research, 18, 1-44 (2018) · Zbl 1470.68156
[19] Pearl, J., Probabilistic reasoning in intelligent systems: Networks of plausible inference (2014), Amsterdam: Elsevier, Amsterdam · Zbl 0746.68089
[20] Peters, J.; Bühlmann, P., Identifiability of Gaussian structural equation models with equal error variances, Biometrika, 101, 219-228 (2014) · Zbl 1285.62005 · doi:10.1093/biomet/ast043
[21] Peters, J., Mooij, J., Janzing, D., & Schölkopf, B. (2012). Identifiability of causal graphs using functional models. arXiv preprint arXiv:1202.3757. · Zbl 1318.68151
[22] Shimizu, S.; Hoyer, P. O.; Hyvärinen, A.; Kerminen, A., A linear non-Gaussian acyclic model for causal discovery, The Journal of Machine Learning Research, 7, 2003-2030 (2006) · Zbl 1222.68304
[23] Shimizu, S.; Inazumi, T.; Sogawa, Y.; Hyvärinen, A.; Kawahara, Y.; Washio, T.; Hoyer, P. O.; Bollen, K., DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model, Journal of Machine Learning Research, 12, 1225-1248 (2011) · Zbl 1280.68195
[24] Spirtes, P. (1995). Directed cyclic graphical representations of feedback models. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, pp. 491-498.
[25] Spirtes, P.; Glymour, C. N.; Scheines, R., Causation, prediction, and search (2000), Cambridge: MIT Press, Cambridge · Zbl 0806.62001
[26] Tsamardinos, I., & Aliferis, C. F. (2003). Towards principled feature selection: Relevancy, filters and wrappers. In Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, Key West, FL, USA: Morgan Kaufmann.
[27] Tsamardinos, I.; Brown, L. E.; Aliferis, C. F., The max-min hill-climbing Bayesian network structure learning algorithm, Machine Learning, 65, 31-78 (2006) · Zbl 1470.68192 · doi:10.1007/s10994-006-6889-7
[28] Uhler, C., Raskutti, G., Bühlmann, P., & Yu, B. (2013). Geometry of the faithfulness assumption in causal inference. The Annals of Statistics, pp. 436-463. · Zbl 1267.62068
[29] Zhang, J.; Spirtes, P., The three faces of faithfulness, Synthese, 193, 1011-1027 (2016) · Zbl 1528.03141 · doi:10.1007/s11229-015-0673-9
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases, these data have been complemented or enhanced by data from zbMATH Open. This is an attempt to reflect the references listed in the original paper as accurately as possible, without claiming completeness or perfect matching.