A new causal discovery heuristic. (English) Zbl 1395.68239
Summary: Probabilistic methods for causal discovery are based on the detection of patterns of correlation between variables. They are based on statistical theory and have revolutionised the study of causality. However, when correlation itself is unreliable, so are probabilistic methods: unusual data can lead to spurious causal links, while nonmonotonic functional relationships between variables can prevent the detection of causal links. We describe a new heuristic method for inferring causality between two continuous variables, based on randomness and unimodality tests and making few assumptions about the data. We evaluate the method against probabilistic and additive noise algorithms on real and artificial datasets, and show that it performs competitively.
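The summary's point that nonmonotonic relationships can hide a dependence from correlation can be illustrated with a toy example. The sketch below is not the paper's heuristic, and the summary does not say which randomness test the authors use. It assumes the textbook Wald-Wolfowitz runs test as a stand-in, and the function names `pearson` and `runs_test_z` are hypothetical. With y = x^2 on a symmetric grid, the Pearson correlation is essentially zero, yet the sequence of y values taken in order of x is far from random.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def runs_test_z(seq):
    """z-score of the Wald-Wolfowitz runs test (normal approximation).

    The sequence is dichotomised at its median; values equal to the
    median are dropped. A large negative z means far fewer runs than
    expected under randomness.
    """
    s = sorted(seq)
    n = len(s)
    median = (s[(n - 1) // 2] + s[n // 2]) / 2
    signs = [v > median for v in seq if v != median]
    n1 = sum(signs)               # values above the median
    n2 = len(signs) - n1          # values below the median
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    var = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (runs - mean) / math.sqrt(var)

# Illustrative data (an assumption, not from the paper): a symmetric
# grid on [-1, 1] and a nonmonotonic, noise-free relationship y = x^2.
xs = [(i - 50) / 50 for i in range(101)]
ys = [x * x for x in xs]

r = pearson(xs, ys)       # close to zero: correlation misses the link
z = runs_test_z(ys)       # ys is already ordered by x; strongly negative

print(f"Pearson r = {r:.3g}, runs-test z = {z:.3g}")
```

Here the correlation coefficient gives no hint of the dependence, while the runs statistic rejects randomness decisively. This only shows why a randomness-based criterion might succeed where correlation fails; how the paper combines randomness and unimodality tests to infer causal direction is not described in the summary.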
68T05 Learning and adaptive systems in artificial intelligence
62A01 Foundations and philosophical topics in statistics
68T20 Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.)