On the definition of a confounder. (English) Zbl 1347.62017

Summary: The causal inference literature has provided a clear formal definition of confounding expressed in terms of counterfactual independence. The literature has not, however, come to any consensus on a formal definition of a confounder, as it has given priority to the concept of confounding over that of a confounder. We consider a number of candidate definitions arising from various more informal statements made in the literature. We consider the properties satisfied by each candidate definition, principally focusing on (i) whether under the candidate definition control for all “confounders” suffices to control for “confounding” and (ii) whether each confounder in some context helps eliminate or reduce confounding bias. Several of the candidate definitions do not have these two properties. Only one candidate definition of those considered satisfies both properties. We propose that a “confounder” be defined as a pre-exposure covariate \(C\) for which there exists a set of other covariates \(X\) such that the effect of the exposure on the outcome is unconfounded conditional on \((X,C)\) but such that for no proper subset of \((X,C)\) is the effect of the exposure on the outcome unconfounded given the subset. We also provide a conditional analogue of the above definition, and we propose that a variable that helps reduce bias but not eliminate bias be referred to as a “surrogate confounder.” These definitions are closely related to those given by J. M. Robins and H. Morgenstern [Comput. Math. Appl. 14, 869–916 (1987; Zbl 0647.62093)]. The implications that hold among the various candidate definitions are discussed.
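
The proposed definition can be read as a membership test: \(C\) is a confounder exactly when it belongs to some adjustment set that is sufficient to remove confounding and has no sufficient proper subset. The following Python sketch illustrates this reading by brute-force enumeration. It is not taken from the paper. The function names, the covariate labels `C1`, `C2`, `C3`, and the oracle `toy_unconfounded` are all hypothetical; in particular, the oracle simply lists which sets are assumed to give unconfoundedness, whereas in practice that judgment would have to come from subject-matter knowledge or a causal model.

```python
from itertools import combinations


def proper_subsets(s):
    """Yield every proper subset of the frozenset s, including the empty set."""
    items = sorted(s)
    for r in range(len(items)):  # sizes 0 .. len(s) - 1, so s itself is excluded
        for combo in combinations(items, r):
            yield frozenset(combo)


def minimally_sufficient_sets(covariates, unconfounded):
    """Return the sets S with unconfounded(S) true and no proper subset sufficient.

    `unconfounded` is a caller-supplied oracle (an assumption of this sketch):
    it takes a frozenset of covariate names and returns True if the effect of
    the exposure on the outcome is unconfounded conditional on that set.
    """
    items = sorted(covariates)
    result = []
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            s = frozenset(combo)
            if unconfounded(s) and not any(unconfounded(t) for t in proper_subsets(s)):
                result.append(s)
    return result


def confounders(covariates, unconfounded):
    """Return every covariate that lies in at least one minimally sufficient set."""
    found = set()
    for s in minimally_sufficient_sets(covariates, unconfounded):
        found |= s
    return found


# Hypothetical toy setting: adjusting for {C1, C2} suffices, and so does
# adding C3, but no smaller set does. Sufficiency is listed explicitly
# because it need not be preserved when covariates are added or removed.
COVARIATES = {"C1", "C2", "C3"}
SUFFICIENT = {frozenset({"C1", "C2"}), frozenset({"C1", "C2", "C3"})}


def toy_unconfounded(s):
    return s in SUFFICIENT


print(sorted(confounders(COVARIATES, toy_unconfounded)))  # ['C1', 'C2']
```

In this toy setting `C3` is not a confounder under the proposed definition, since every sufficient set containing it has a sufficient proper subset. If the empty set were already sufficient, no covariate would qualify. The enumeration is exponential in the number of covariates and is meant only to make the logical structure of the definition concrete.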


62A01 Foundations and philosophical topics in statistics
62J99 Linear inference, regression
62P10 Applications of statistics to biology and medical sciences; meta analysis






[1] Barnow, B. S., Cain, G. G. and Goldberger, A. S. (1980). Issues in the analysis of selectivity bias. In Evaluation Studies (E. Stromsdorfer and G. Farkas, eds.) 5. Sage, San Francisco.
[2] Breslow, N. E. and Day, N. E. (1980). Statistical Methods in Cancer Research, Vol. 1: The Analysis of Case-Control Studies. International Agency for Research on Cancer, Lyon, France.
[3] Cox, D. R. (1958). Planning of Experiments. Wiley, New York. · Zbl 0084.15802
[4] Dawid, A. P. (2002). Influence diagrams for causal modeling and inference. Int. Statist. Rev. 70 161-189. · Zbl 1215.62002
[5] Geng, Z., Guo, J. and Fung, W.-K. (2002). Criteria for confounders in epidemiological studies. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 3-15. · Zbl 1015.62113
[6] Geng, Z. and Li, G. (2002). Conditions for non-confounding and collapsibility without knowledge of completely constructed causal diagrams. Scand. J. Stat. 29 169-181. · Zbl 1017.62112
[7] Geng, Z., Guo, J., Lau, T. S. and Fung, W.-K. (2001). Confounding, homogeneity and collapsibility for causal effects in epidemiologic studies. Statist. Sinica 11 63-75. · Zbl 1057.62536
[8] Glymour, M. M. and Greenland, S. (2008). Causal diagrams. In Modern Epidemiology, 3rd ed. (K. J. Rothman, S. Greenland and T. L. Lash, eds.) 12. Lippincott Williams and Wilkins, Philadelphia, PA.
[9] Greenland, S. (2003). Quantifying biases in causal models: Classical confounding versus collider-stratification bias. Epidemiology 14 300-306.
[10] Greenland, S. and Morgenstern, H. (2001). Confounding in health research. Annual Rev. Public Health 22 189-212.
[11] Greenland, S., Pearl, J. and Robins, J. M. (1999). Causal diagrams for epidemiologic research. Epidemiology 10 37-48. · Zbl 1059.62506
[12] Greenland, S. and Pearl, J. (2007). Causal diagrams. In Encyclopedia of Epidemiology (S. Boslaugh, ed.) 149-156. Sage, Thousand Oaks, CA.
[13] Greenland, S. and Pearl, J. (2011). Adjustments and their consequences-collapsibility analysis using graphical models. Int. Statist. Rev. 79 401-426. · Zbl 1238.62070
[14] Greenland, S. and Robins, J. M. (1986). Identifiability, exchangeability, and epidemiological confounding. Int. J. Epidemiol. 15 413-419.
[15] Greenland, S., Robins, J. M. and Pearl, J. (1999). Confounding and collapsibility in causal inference. Statist. Sci. 14 29-46. · Zbl 1059.62506
[16] Greenland, S. and Robins, J. M. (2009). Identifiability, exchangeability and confounding revisited. Epidemiol. Perspect. Innov. 6 4.
[17] Hernán, M. A. (2008). Confounding. In Encyclopedia of Quantitative Risk Assessment and Analysis (B. Everitt and E. Melnick, eds.) 353-362. Wiley, Chichester, UK.
[18] Hernán, M. A., Hernández-Díaz, S., Werler, M. M. and Mitchell, A. A. (2002). Causal knowledge as a prerequisite for confounding evaluation: An application to birth defects epidemiology. Am. J. Epidemiol. 155 176-184.
[19] Imbens, G. W. (2004). Nonparametric estimation of average treatment effects under exogeneity: A review. Rev. Econom. Statist. 86 4-29.
[20] Kleinbaum, D. G., Kupper, L. L. and Morgenstern, H. (1982). Epidemiologic Research: Principles and Quantitative Methods. Lifetime Learning Publications [Wadsworth], Belmont, CA.
[21] Lauritzen, S. L. (1996). Graphical Models. Oxford Univ. Press, New York. · Zbl 0907.62001
[22] Miettinen, O. S. (1974). Confounding and effect modification. Am. J. Epidemiol. 100 350-353.
[23] Miettinen, O. S. (1976). Stratification by a multivariate confounder score. Am. J. Epidemiol. 104 609-620.
[24] Miettinen, O. S. and Cook, E. F. (1981). Confounding: Essence and detection. Am. J. Epidemiol. 114 593-603.
[25] Morabia, A. (2011). History of the modern epidemiological concept of confounding. J. Epidemiol. Community Health 65 297-300.
[26] Neyman, J. (1923). Sur les applications de la théorie des probabilités aux expériences agricoles: Essai des principes. Excerpts reprinted (1990) in English (D. Dabrowska and T. Speed, trans.). Statist. Sci. 5 463-472.
[27] Ogburn, E. L. and VanderWeele, T. J. (2012). On the nondifferential misclassification of a binary confounder. Epidemiology 23 433-439.
[28] Pearl, J. (1995). Causal diagrams for empirical research. Biometrika 82 669-710. · Zbl 0860.62045
[29] Pearl, J. (2009). Causality: Models, Reasoning, and Inference, 2nd ed. Cambridge Univ. Press, Cambridge. · Zbl 1188.68291
[30] Robins, J. (1992). Estimation of the time-dependent accelerated failure time model in the presence of confounding factors. Biometrika 79 321-334. · Zbl 0753.62076
[31] Robins, J. M. (1997). Causal inference from complex longitudinal data. In Latent Variable Modeling and Applications to Causality (Los Angeles, CA, 1994) (M. Berkane, ed.). Lecture Notes in Statistics 120 69-117. Springer, New York. · Zbl 0969.62072
[32] Robins, J. M. and Greenland, S. (1986). The role of model selection in causal inference from nonexperimental data. Am. J. Epidemiol. 123 392-402.
[33] Robins, J. M. and Morgenstern, H. (1987). The foundations of confounding in epidemiology. Comput. Math. Appl. 14 869-916. · Zbl 0647.62093
[34] Robins, J. M. and Richardson, T. S. (2010). Alternative graphical causal models and the identification of direct effects. In Causality and Psychopathology: Finding the Determinants of Disorders and Their Cures (P. E. Shrout, K. M. Keyes and K. Ornstein, eds.) 103-158. Oxford Univ. Press, New York.
[35] Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 41-55. · Zbl 0522.62091
[36] Rubin, D. B. (1978). Bayesian inference for causal effects: The role of randomization. Ann. Statist. 6 34-58. · Zbl 0383.62021
[37] Rubin, D. B. (1990). Formal modes of statistical inference for causal effects. J. Statist. Plann. Inference 25 279-292.
[38] Shpitser, I., VanderWeele, T. J. and Robins, J. M. (2010). On the validity of covariate adjustment for estimating causal effects. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence 527-536. AUAI Press, Corvallis, OR.
[39] Spirtes, P., Glymour, C. and Scheines, R. (1993). Causation, Prediction, and Search. Lecture Notes in Statistics 81. Springer, New York. · Zbl 0806.62001
[40] VanderWeele, T. J. (2012). Confounding and effect modification: Distribution and measure. Epidemiologic Methods 1 55-82. · Zbl 1277.62266
[41] VanderWeele, T. J. and Shpitser, I. (2011). A new criterion for confounder selection. Biometrics 67 1406-1413. · Zbl 1274.62890