Peters, Jonas; Bühlmann, Peter; Meinshausen, Nicolai [Thwaites, Peter A.; Didelez, Vanessa; Silva, Ricardo; Dawid, Philip; Foster, Adam; Lauritzen, Steffen; Ding, Peng; Feller, Avi; VanderWeele, Tyler J.; Crudu, Federico; López, Freddy; Porcu, Emilio; Pan, Wenliang; Wen, Canhong; Bareinboim, Elias; Bhattacharya, Debopam; Linton, Oliver; Davison, Andrew; Fine, Jason P.; Hudgens, Michael G.; Hansen, Niels Richard; Kumar, Kuldeep; Lee, Kuang-Yao; Liu, Tianqi; Zhao, Hongyu; Lu, Zudi; Mateu, Jorge; Mooij, Joris M.; Oates, Chris. J.; Kasza, Jessica; Mukherjee, Sach; Richardson, T. S.; Robins, J. M.; Stehlik, Milan; Stehlíková, Silvia; Wang, Linbo; Chen, Shizhe; Shojaie, Ali; Zhao, Qingyuan; Zheng, Charles; Hastie, Trevor; Tibshirani, Robert]
Causal inference by using invariant prediction: identification and confidence intervals. With discussion and authors’ reply. (English)
Zbl 1414.62297
J. R. Stat. Soc., Ser. B, Stat. Methodol. 78, No. 5, 947-1012 (2016).

Summary: What is the difference between a prediction that is made with a causal model and that with a non-causal model? Suppose that we intervene on the predictor variables or change the whole environment. The predictions from a causal model will in general work as well under interventions as for observational data. In contrast, predictions from a non-causal model can potentially be very wrong if we actively intervene on variables. Here, we propose to exploit this invariance of a prediction under a causal model for causal inference: given different experimental settings (e.g. various interventions) we collect all models that do show invariance in their predictive accuracy across settings and interventions. The causal model will be a member of this set of models with high probability. This approach yields valid confidence intervals for the causal relationships in quite general scenarios.
We examine the example of structural equation models in more detail and provide sufficient assumptions under which the set of causal predictors becomes identifiable. We further investigate robustness properties of our approach under model misspecification and discuss possible extensions. The empirical properties are studied for various data sets, including large-scale gene perturbation experiments.

Cited in 1 Review. Cited in 46 Documents.

MSC:
62Hxx Multivariate analysis
62-02 Research exposition (monographs, survey articles) pertaining to statistics
62J02 General nonlinear regression
62J05 Linear regression; mixed models
62G10 Nonparametric hypothesis testing

Keywords: causal discovery; causal inference; confidence intervals; invariant prediction
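
The invariance idea described in the summary (collect all models whose predictive behaviour is the same across experimental settings, then use what they have in common) can be written compactly. The block below is only a sketch for the linear case. The symbols are not part of this record and are introduced here as assumptions: $\mathcal{E}$ is the set of experimental settings, $(X^e, Y^e)$ the predictors and response observed in setting $e$, $S^*$ the set of true causal predictors, $F_\varepsilon$ the common noise distribution, and $\alpha$ the test level. The intercept is omitted for brevity.

```latex
% Requires amsmath (\text) and amssymb (\mathbb).
% Invariance null hypothesis for a candidate predictor set S \subseteq \{1,\dots,p\}:
% one coefficient vector and one noise distribution fit every setting e.
\[
H_{0,S}(\mathcal{E}):\quad
\exists\, \gamma \in \mathbb{R}^p \text{ with } \gamma_k = 0 \text{ for all } k \notin S
\text{ such that, for all } e \in \mathcal{E},\quad
Y^e = X^e \gamma + \varepsilon^e,\quad
\varepsilon^e \sim F_\varepsilon,\quad
\varepsilon^e \perp\!\!\!\perp X^e_S .
\]
% Estimator: intersect all candidate sets whose hypothesis is not rejected at level \alpha.
\[
\hat{S}(\mathcal{E}) \;=\; \bigcap_{S \,:\, H_{0,S}(\mathcal{E}) \text{ not rejected}} S .
\]
% Coverage statement, assuming H_{0,S^*}(\mathcal{E}) is true and each test has level \alpha.
\[
\mathbb{P}\bigl(\hat{S}(\mathcal{E}) \subseteq S^*\bigr) \;\ge\; 1 - \alpha .
\]
```

The coverage bound follows from a one-line argument: whenever the test does not reject $H_{0,S^*}(\mathcal{E})$, the set $S^*$ is among those being intersected, so $\hat{S}(\mathcal{E}) \subseteq S^*$; under the stated assumptions this happens with probability at least $1 - \alpha$.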