An overview of recent advancements in causal studies. (English) Zbl 1364.05032

Summary: In causal studies, we are interested in finding the graphical structure of a causal model in the form of a directed acyclic graph (DAG). A DAG describes the direction and strength of the connections between variables, which are represented by nodes. Various methods have been developed to estimate the appropriate structure of the causal model and to explain many of its features. Our review aims to provide a complete and systematic analysis of selected articles from the past few decades that present powerful methods for inference in this area. In this article, we categorize the selected articles into three groups according to the techniques they use to construct the causal model. To provide a full comparative study across the probabilistic, statistical and algebraic approaches, we discuss the underlying difficulties, limitations, merits and disadvantages of applying these techniques. The review should help the reader choose and apply an appropriate method.

MSC:

05C20 Directed graphs (digraphs), tournaments
05C90 Applications of graph theory
05-02 Research exposition (monographs, survey articles) pertaining to combinatorics
62-02 Research exposition (monographs, survey articles) pertaining to statistics
62A99 Foundational topics in statistics

Software:

TETRAD; DirectLiNGAM

References:

[1] Bach FR, Jordan MI (2002) Kernel independent component analysis. J Mach Learn Res 3:1-48 · Zbl 1088.68689
[2] Beal MJ, Ghahramani Z (2004) Variational Bayesian learning of directed graphical models with hidden variables. Bayesian Anal 1:1-44
[3] Bollen KA (1989) Structural equations with latent variables. Wiley, New York · Zbl 0731.62159 · doi:10.1002/9781118619179
[4] Borgelt C (2010) A conditional independence algorithm for learning undirected graphical models. J Comput Syst Sci 76:21-33 · Zbl 1186.68356 · doi:10.1016/j.jcss.2009.05.003
[5] Cichocki A, Amari SI (2002) Adaptive blind signal and image processing. Wiley, New York · doi:10.1002/0470845899
[6] Chwialkowski K, Gretton A (2014) A kernel independence test for random processes. In: Proceedings of the 31st international conference on machine learning. arXiv:1402.4501 · Zbl 1186.68356
[7] Elidan G, Nachman I, Friedman N (2007) “Ideal Parent” structure learning for continuous variable Bayesian networks. J Mach Learn Res 8:1799-1833 · Zbl 1222.68191
[8] Friedman N, Koller D (2003) Being Bayesian about network structure. A Bayesian approach to structure discovery in Bayesian networks. Mach Learn 50:95-125 · Zbl 1033.68104 · doi:10.1023/A:1020249912095
[9] Fukumizu K, Bach FR, Gretton A (2007) Statistical consistency of kernel canonical correlation analysis. J Mach Learn Res 8:361-383 · Zbl 1222.62063
[10] Geng Z, He YB, Wang XL, Zhao Q (2003) Bayesian method for learning graphical models with incompletely categorical data. Comput Stat Data Anal 44:175-192 · Zbl 1429.62037 · doi:10.1016/S0167-9473(03)00066-5
[11] Gretton A, Bousquet O, Smola AJ, Schölkopf B (2005) Measuring statistical dependence with Hilbert-Schmidt norms. In: Algorithmic learning theory: 16th international conference (ALT2005), vol 3734, pp 63-77 · Zbl 1168.62354
[12] Gretton A, Herbrich R, Smola A (2003) The kernel mutual information. In: Proceedings of IEEE international conference on acoustics, speech and signal processing (ICASSP 2003), pp 880-883 · Zbl 1151.30007
[13] Gretton A, Herbrich R, Smola A, Bousquet O, Schölkopf B (2005) Kernel methods for measuring independence. J Mach Learn Res 6:2075-2129 · Zbl 1222.68208
[14] Gretton A, Smola A, Bousquet O, Herbrich R, Belitski A, Augath M, Murayama Y, Pauls J, Schölkopf B, Logothetis N (2005) Kernel constrained covariance for dependence measurement. In: AISTATS 10, pp 112-119 · Zbl 1225.68184
[15] Gretton A, Fukumizu K, Teo CH, Song L, Schölkopf B, Smola AJ, Koller D, Singer Y, Roweis S (2007) A kernel statistical test of independence. In: Twenty-First Annual Conference on Neural Information Processing Systems (NIPS 2007), Curran, pp 585-592 · Zbl 1248.68490
[16] He YB, Geng Z (2008) Active learning of causal networks with intervention experiments and optimal designs. J Mach Learn Res 9:2523-2547 · Zbl 1225.68184
[17] Hofmann T, Schölkopf B, Smola AJ (2008) Kernel methods in machine learning. Ann Stat 36(3):1171-1220 · Zbl 1151.30007 · doi:10.1214/009053607000000677
[18] Hoyer PO, Janzing D, Mooij J, Peters J, Schölkopf B (2009) Nonlinear causal discovery with additive noise models. In: Koller D, Schuurmans D, Bengio Y, Bottou L (eds) Advances in neural information processing systems 21: 22nd Annual Conference on Neural Information Processing Systems 2008, Red Hook, NY, Curran, pp 689-696 · Zbl 1225.68205
[19] Hyvärinen A, Smith SM (2013) Pairwise likelihood ratios for estimation of non-Gaussian structural equation models. J Mach Learn Res 14:111-152 · Zbl 1307.68069
[20] Janzing D, Mooij J, Zhang K, Lemeire J, Zscheischler J, Daniušis P, Steudel B, Schölkopf B (2012) Information-geometric approach to inferring causal directions. Artif Intell 182-183:1-31 · Zbl 1248.68490 · doi:10.1016/j.artint.2012.01.002
[21] Gilks WR, Richardson S, Spiegelhalter DJ (1996) Introducing Markov chain Monte Carlo. Markov Chain Monte Carlo Pract 1:19 · Zbl 0845.60072
[22] Pearl J (1998) Probabilistic reasoning in intelligent systems. Morgan Kaufmann, San Francisco. ISBN 0-934613-73-7 · Zbl 0746.68089
[23] Pearl J (2000) Causality: models, reasoning, and inference, 2nd edn. Cambridge University Press, Cambridge · Zbl 0959.68116
[24] Pearl J (2009) Causal inference in statistics: an overview. Stat Surv 3:96-146. ISSN: 1935-7516. DOI:10.1214/09-SS057 · Zbl 1300.62013
[25] Pellet JP, Elisseeff A (2008) Using Markov blankets for causal structure learning. J Mach Learn Res 9:1295-1342 · Zbl 1225.68205
[26] Petrović L, Dimitrijević S (2011) Invariance of statistical causality under convergence. Stat Probab Lett 81(9):1445-1448 · Zbl 1218.62005 · doi:10.1016/j.spl.2011.04.021
[27] Roy S, Lane T, Washburne MW (2009) Learning structurally consistent undirected probabilistic graphical models. In: Proceedings of the 26th annual international conference on machine learning, pp 905-912. ACM · Zbl 1280.68195
[28] Shimizu S, Hoyer PO, Hyvärinen A, Kerminen A (2006) A linear non-Gaussian acyclic model for causal discovery. J Mach Learn Res 7:2003-2030 · Zbl 1222.68304
[29] Shimizu S, Inazumi T, Sogawa Y, Hyvärinen A, Kawahara Y, Washio T, Hoyer PO, Bollen K (2011) DirectLiNGAM: a direct method for learning a linear non-Gaussian structural equation model. J Mach Learn Res 12:1225-1248 · Zbl 1280.68195
[30] Shpitser I, Pearl J (2008) Complete identification methods for the causal hierarchy. J Mach Learn Res 9:1941-1979 · Zbl 1225.68216
[31] Song L, Smola A, Gretton A, Bedo J, Borgwardt K (2012) Feature selection via dependence maximization. J Mach Learn Res 13(1):1393-1434 · Zbl 1303.68110
[32] Spirtes P, Glymour C, Scheines R (1993) Causation, prediction, and search, 2nd edn. Springer, New York · Zbl 0806.62001 · doi:10.1007/978-1-4612-2748-9
[33] Sun X, Janzing D, Schölkopf B (2006) Causal inference by choosing graphs with most plausible Markov kernels. In: Proceedings of the 9th international symposium on artificial intelligence and mathematics, Fort Lauderdale, Florida
[34] Sun X, Janzing D, Schölkopf B, Fukumizu K (2007) A kernel-based causal learning algorithm. In: Proceedings of the 24th international conference on machine learning, pp 855-862
[35] Uhler C, Raskutti G, Bühlmann P, Yu B (2013) Geometry of the faithfulness assumption in causal inference. Ann Stat 41(2):436-463 · Zbl 1267.62068 · doi:10.1214/12-AOS1080
[36] Zhang J (2008) Causal reasoning with ancestral graphs. J Mach Learn Res 9:1437-1474 · Zbl 1225.68254
[37] Zhang K, Hyvärinen A (2008) Distinguishing causes from effects using nonlinear acyclic causal models. In: Journal of machine learning research, workshop and conference proceedings (NIPS 2008 causality workshop), vol 6, pp 157-164 · Zbl 1225.68254
[38] Zhang K, Hyvärinen A (2009) On the identifiability of the post-nonlinear causal model. In: Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence, pp 647-655. AUAI Press
[39] Zhang X, Song L, Gretton A, Smola AJ (2009) Kernel measures of independence for non-iid data. Adv Neural Inf Process Syst 21:1937-1944 · Zbl 1255.83082
[40] Zhang K, Peters J, Janzing D, Schölkopf B (2012) Kernel-based conditional independence test and application in causal discovery. arXiv:1202.3775 · Zbl 1222.68304
[41] Zwald L, Bousquet O, Blanchard G (2004) Statistical properties of kernel principal component analysis. In: Proceedings of 17th annual conference on learning theory (COLT 2004), pp 594-608 · Zbl 1078.68133
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.