For objective causal inference, design trumps analysis. (English) Zbl 1149.62089

Summary: For obtaining causal inferences that are objective, and therefore have the best chance of revealing scientific truths, carefully designed and executed randomized experiments are generally considered to be the gold standard. Observational studies, in contrast, are generally fraught with problems that compromise any claim for objectivity of the resulting causal inferences. The thesis here is that observational studies have to be carefully designed to approximate randomized experiments, in particular, without examining any final outcome data. Often a candidate data set will have to be rejected as inadequate because of lack of data on key covariates, or because of lack of overlap in the distributions of key covariates between treatment and control groups, often revealed by careful propensity score analyses. Sometimes the template for the approximating randomized experiment will have to be altered, and the use of principal stratification can be helpful in doing this. These issues are discussed and illustrated using the framework of potential outcomes to define causal effects, which greatly clarifies critical issues.
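The overlap check mentioned in the summary can be illustrated with a minimal sketch. It is not the paper's procedure: the simulated covariate, the one-covariate logistic model, the function names (`fit_propensity`, `common_support`, `standardized_difference`) and the min/max trimming rule are all assumptions made for illustration. The point it shows is that the diagnostic uses only the covariate `x` and the treatment indicator `z`; no outcome variable appears anywhere.

```python
import math
import random
import statistics


def fit_propensity(x, z, steps=1000, lr=0.5):
    """Fit P(z=1 | x) = sigmoid(a + b*x) by gradient ascent on the mean log-likelihood."""
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for xi, zi in zip(x, z):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            grad_a += zi - p
            grad_b += (zi - p) * xi
        a += lr * grad_a / n
        b += lr * grad_b / n
    return [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]


def common_support(scores, z):
    """Range of estimated propensity scores that both groups cover."""
    treated = [s for s, zi in zip(scores, z) if zi == 1]
    control = [s for s, zi in zip(scores, z) if zi == 0]
    return max(min(treated), min(control)), min(max(treated), max(control))


def standardized_difference(values, z):
    """Difference in group means divided by the pooled standard deviation."""
    treated = [v for v, zi in zip(values, z) if zi == 1]
    control = [v for v, zi in zip(values, z) if zi == 0]
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2.0
    )
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


# Simulated design data (hypothetical): one covariate, no outcome variable.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(1.5, 1.0) for _ in range(200)]
z = [0] * 200 + [1] * 200

scores = fit_propensity(x, z)
low, high = common_support(scores, z)

# Units outside the common support have no comparable unit in the other group.
outside = [i for i, s in enumerate(scores) if not (low <= s <= high)]
frac_outside = len(outside) / len(scores)

print(f"common support: [{low:.3f}, {high:.3f}]")
print(f"fraction of units outside common support: {frac_outside:.3f}")
print(f"standardized difference in x: {standardized_difference(x, z):.3f}")
```

A large fraction of units outside the common support, or a large standardized difference, would suggest that the candidate data set cannot approximate a randomized experiment for the full population. Under the thesis of the summary, that judgment is made before any outcome data are examined.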


62P10 Applications of statistics to biology and medical sciences; meta analysis
62P99 Applications of statistics

