**Split-door criterion: identification of causal effects through auxiliary outcomes.**
*(English)*
Zbl 1412.62207

Summary: We present a method for estimating causal effects in time series data when fine-grained information about the outcome of interest is available. Specifically, we examine what we call the split-door setting, where the outcome variable can be split into two parts: one that is potentially affected by the cause being studied and another that is independent of it, with both parts sharing the same (unobserved) confounders. We show that under these conditions, the problem of identification reduces to that of testing for independence among observed variables, and propose a method that uses this approach to automatically find subsets of the data that are causally identified. We demonstrate the method by estimating the causal impact of Amazon’s recommender system on traffic to product pages, finding thousands of examples within the dataset that satisfy the split-door criterion. Unlike past studies based on natural experiments that were limited to a single product category, our method applies to a large and representative sample of products viewed on the site. In line with previous work, we find that the widely used click-through rate (CTR) metric overestimates the causal impact of recommender systems; depending on the product category, we estimate that 50–80% of the traffic attributed to recommender systems would have happened even without any recommendations. We conclude with guidelines for using the split-door criterion as well as a discussion of other contexts where the method can be applied.
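The filtering idea in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `x` (the cause), `y_r` (the part of the outcome that the cause may affect), `y_d` (the auxiliary part), the `windows` data layout, and the threshold `alpha` are all assumptions made for illustration. The independence test is a simple permutation test on the Pearson correlation, swapped in for brevity: it detects only linear dependence and treats observations as exchangeable, which autocorrelated time series may violate. Failing to reject independence is also weaker than establishing it.

```python
import numpy as np


def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Permutation p-value for the absolute Pearson correlation of a and b."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = abs(np.corrcoef(a, b)[0, 1])
    exceed = 0
    for _ in range(n_perm):
        # Shuffling b breaks any pairing with a, simulating independence.
        if abs(np.corrcoef(a, rng.permutation(b))[0, 1]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


def split_door_filter(windows, alpha=0.05, n_perm=2000):
    """Keep windows in which independence of x and y_d is not rejected.

    Each window is a dict with equal-length sequences under the
    (hypothetical) keys "x", "y_r" and "y_d".
    """
    return [
        w for w in windows
        if permutation_pvalue(w["x"], w["y_d"], n_perm=n_perm) > alpha
    ]


def naive_effect(window):
    """Least-squares slope of y_r on x.

    Reading this slope causally is only defensible for windows that pass
    the filter, and only under the assumptions stated in the summary.
    """
    x = np.asarray(window["x"], dtype=float)
    y_r = np.asarray(window["y_r"], dtype=float)
    return np.polyfit(x, y_r, 1)[0]
```

A typical use would call `split_door_filter` on many short time windows and then apply `naive_effect` only to the survivors. Because many windows are tested at once, a real analysis would also need to account for multiple testing, which this sketch omits.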

### MSC:

62P20 | Applications of statistics to economics |

62M10 | Time series, auto-correlation, regression, etc. in statistics (GARCH) |

### Keywords:

causal inference; data mining; causal graphical model; natural experiment; recommendation systems; time series

*A. Sharma* et al., Ann. Appl. Stat. 12, No. 4, 2699–2733 (2018; Zbl 1412.62207)
