The information in covariate imbalance in studies of hormone replacement therapy. (English) Zbl 1498.62272

Summary: A widely noted failure of causal inference occurred when several observational studies claimed that hormone replacement therapy (HRT) reduced risk of cardiovascular disease; yet, subsequent randomized trials found an increased, not a decreased, cardiovascular risk. We take a close look at covariate imbalances in one of the observational data sets. We use some old, some recent, and some new methods, plus we update an important, simple but largely forgotten suggestion of William Cochran about screening covariates and other variables. In particular, a tapered match shows the impact on all covariates of gradually matching for additional covariates. An exterior match examines the change in the control group as additional covariates are included, and the consequences for outcomes. Because covariates are sometimes continuous, sometimes binary, sometimes ordinal, sometimes missing, we suggest keeping track of magnitudes of aggregate bias in observed covariates using a new estimate of the Kullback-Leibler information between covariate distributions in treated and matched control groups, a flexible measure with several attractive properties. The initial studies ignored some enormous imbalances in socioeconomic covariates that predict the outcomes under study. Our more comprehensive analyses mimic some post-game reanalyses done subsequent to the randomized trials; however, even these omit a large imbalance in a consequential covariate discovered by Cochran’s quick but expansive screening suggestion. Our sense is that a closer examination of covariate imbalance would not have led to a correct conclusion about the effects of HRT, but it would have heightened concerns about the magnitude of the problems in the observational studies, and it would have raised doubts about the ability of a few regression coefficients to eliminate all biases, observed and unobserved, in the comparison. 
Medical journals need to recognize that certain sources of uncertainty cannot be eliminated from certain necessary types of empirical investigation; moreover, these journals need to learn new ways to describe these sources of uncertainty with objectivity and candor.
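To make the idea of a single aggregate imbalance measure concrete, the following is a minimal sketch of a plug-in Kullback-Leibler calculation for one categorical covariate, comparing a treated group with a control group. It is not the new estimate proposed in the paper, whose details the summary does not give. The function name `kl_information`, the additive `smoothing` constant, the treatment of a missing value (`None`) as its own category, and the toy education data are all assumptions made for illustration.

```python
from collections import Counter
from math import log


def kl_information(treated, control, smoothing=0.5):
    """Plug-in estimate of KL(P_treated || P_control) for one categorical covariate.

    Illustrative assumptions (not taken from the paper):
    - a missing value (None) is counted as its own category;
    - `smoothing` is added to every cell so no estimated probability is zero.
    """
    # Every level seen in either group, in a fixed order (key=repr copes with None).
    levels = sorted(set(treated) | set(control), key=repr)
    t_counts = Counter(treated)
    c_counts = Counter(control)
    k = len(levels)
    t_total = len(treated) + smoothing * k
    c_total = len(control) + smoothing * k

    kl = 0.0
    for level in levels:
        p = (t_counts[level] + smoothing) / t_total  # smoothed treated proportion
        q = (c_counts[level] + smoothing) / c_total  # smoothed control proportion
        kl += p * log(p / q)
    return kl


# Toy data: an education covariate with some missing values (hypothetical).
treated = ["college"] * 60 + ["high school"] * 30 + [None] * 10
controls_unmatched = ["college"] * 30 + ["high school"] * 60 + [None] * 10
controls_matched = ["college"] * 58 + ["high school"] * 32 + [None] * 10

kl_unmatched = kl_information(treated, controls_unmatched)
kl_matched = kl_information(treated, controls_matched)
print(f"KL before matching: {kl_unmatched:.4f}")
print(f"KL after matching:  {kl_matched:.4f}")
```

In this toy comparison the value falls toward zero as the control group comes to resemble the treated group. Because the calculation only needs category proportions, binary, ordinal, and missing values can be handled in the same way, while a continuous covariate would first have to be cut into intervals under this sketch.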


62P10 Applications of statistics to biology and medical sciences; meta analysis


[1] Cochran, W. G. (1965). The planning of observational studies of human populations (with discussion). J. R. Stat. Soc., Ser. A 128 234-266.
[2] Cochran, W. G. and Rubin, D. B. (1973). Controlling bias in observational studies: A review. Sankhya, Ser. A 35 417-446. · Zbl 0291.62012
[3] Daniel, S. R., Armstrong, K., Silber, J. H. and Rosenbaum, P. R. (2008). An algorithm for optimal tapered matching, with application to disparities in survival. J. Comput. Graph. Statist. 17 914-924. · doi:10.1198/106186008X385806
[4] Johannes, C. B., Crawford, S. L., Posner, J. G. and McKinlay, S. M. (1994). Longitudinal patterns and correlates of hormone replacement therapy use in middle-aged women. Am. J. Epidemiol. 140 439-452.
[5] Kelz, R. R., Sellers, M. M., Niknam, B. A., Sharpe, J. E., Rosenbaum, P. R., Hill, A. S., Zhou, H., Hochman, L. L., Bilimoria, K. Y. et al. (2021). A national comparison of operative outcomes of new and experienced surgeons. Ann. Surg. 273 280-288.
[6] Kullback, S. (1959). Information Theory and Statistics. Wiley, New York. · Zbl 0088.10406
[7] Kullback, S. and Leibler, R. A. (1951). On information and sufficiency. Ann. Math. Stat. 22 79-86. · Zbl 0042.38403 · doi:10.1214/aoms/1177729694
[8] Lehmann, E. L. and Romano, J. P. (2005). Testing Statistical Hypotheses, 3rd ed. Springer Texts in Statistics. Springer, New York. · Zbl 1076.62018
[9] Macpherson, H., Pipingas, A. and Pase, M. P. (2013). Multivitamin-multimineral supplementation and mortality: A meta-analysis of randomized controlled trials. Am. J. Clin. Nutr. 97 437-444.
[10] Matthews, K. A., Kuller, L. H., Wing, R. R., Meilahn, E. N. and Plantinga, P. (1996). Prior to use of estrogen replacement therapy, are users healthier than nonusers? Am. J. Epidemiol. 143 971-978.
[11] Niknam, B. A., Arriaga, A. F., Rosenbaum, P. R., Hill, A. S., Ross, R. N., Even-Shoshan, O., Romano, P. S. and Silber, J. H. (2018). Adjustment for atherosclerosis diagnosis distorts the effects of percutaneous coronary intervention and the ranking of hospital performance. J. Amer. Heart Assoc. 7. · doi:10.1161/JAHA.117.008366
[12] O’Brien, P. C. and Fleming, T. R. (1987). A paired Prentice-Wilcoxon test for censored paired data. Biometrics 43 169-180.
[13] Petitti, D. B. and Freedman, D. A. (2005). Invited commentary: How far can epidemiologists get with statistical adjustment? Am. J. Epidemiol. 162 415-418.
[14] Petitti, D. B., Perlman, J. A. and Sidney, S. (1986). Letter about ‘Postmenopausal estrogen use and heart disease’. N. Engl. J. Med. 315 131-132.
[15] Pimentel, S. D., Small, D. S. and Rosenbaum, P. R. (2016). Constructed second control groups and attenuation of unmeasured biases. J. Amer. Statist. Assoc. 111 1157-1167. · doi:10.1080/01621459.2015.1076342
[16] Prentice, R. L., Langer, R., Stefanick, M. L., Howard, B. V., Pettinger, M., Anderson, G., Barad, D., Curb, J. D., Kotchen, J. et al. (2005). Combined postmenopausal hormone therapy and cardiovascular disease: Toward resolving the discrepancy between non-experimental studies and the Women’s Health Initiative clinical trial. Am. J. Epidemiol. 162 404-420. · doi:10.1093/aje/kwi223
[17] Rosenbaum, P. R. (1987). Sensitivity analysis for certain permutation inferences in matched observational studies. Biometrika 74 13-26. · Zbl 0605.62130 · doi:10.1093/biomet/74.1.13
[18] Rosenbaum, P. R. (1989). The role of known effects in observational studies. Biometrics 45 557-569. · Zbl 0715.62185 · doi:10.2307/2531497
[19] Rosenbaum, P. R. (1991). Discussing hidden bias in observational studies. Ann. Intern. Med. 115 901-905.
[20] Rosenbaum, P. R. (2002). Observational Studies, 2nd ed. Springer Series in Statistics. Springer, New York. · Zbl 0985.62091 · doi:10.1007/978-1-4757-3692-2
[21] Rosenbaum, P. R. (2017). Observation and Experiment: An Introduction to Causal Inference. Harvard Univ. Press, Cambridge, MA. · Zbl 1372.00054
[22] Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 41-55. · Zbl 0522.62091 · doi:10.1093/biomet/70.1.41
[23] Rosenbaum, P. R. and Rubin, D. B. (1985). The bias due to incomplete matching. Biometrics 41 103-116. · Zbl 0607.62137 · doi:10.2307/2530647
[24] Rosenbaum, P. R. and Silber, J. H. (2009). Amplification of sensitivity analysis in matched observational studies. J. Amer. Statist. Assoc. 104 1398-1405. · Zbl 1205.62180 · doi:10.1198/jasa.2009.tm08470
[25] Rosenbaum, P. R. and Silber, J. H. (2013). Using the exterior match to compare two entwined matched control groups. Amer. Statist. 67 67-75. · Zbl 07649184 · doi:10.1080/00031305.2013.769914
[26] Rutter, M., ed. (2007). Identifying the Environmental Causes of Disease. Academy of Medical Sciences, London.
[27] Silber, J. H., Rosenbaum, P. R., Clark, A. S., Giantonio, B. J., Ross, R. N., Teng, Y., Wang, M., Niknam, B. A., Ludwig, J. M. et al. (2013). Characteristics associated with differences in survival among black and white women with breast cancer. J. Am. Med. Assoc. 310 389-397.
[28] Silber, J. H., Rosenbaum, P. R., Ross, R. N., Niknam, B. A., Ludwig, J. M., Wang, W., Clark, A. S., Fox, K. R., Wang, M. et al. (2014). Racial disparities in colon cancer survival: A matched cohort study. Ann. Intern. Med. 161 845-854.
[29] Silber, J. H., Rosenbaum, P. R., Ross, R. N., Reiter, J. G., Niknam, B. A., Hill, A. S., Bongiorno, D. M., Shah, S. A., Hochman, L. L. et al. (2018). Disparities in breast cancer survival by socioeconomic status despite Medicare and Medicaid insurance. Milbank Q. 96 706-754. · doi:10.1111/1468-0009.12355
[30] Stampfer, M. J., Willett, W. C., Colditz, G. A., Rosner, B., Speizer, F. E. and Hennekens, C. H. (1985). A prospective study of postmenopausal estrogen therapy and coronary heart disease. N. Engl. J. Med. 313 1044-1049.
[31] Women’s Health Initiative Study Writing Group (1998). Design of the Women’s Health Initiative clinical trial and observational study. Control. Clin. Trials 19 61-109.
[32] Yu, R. (2020). Evaluating and improving a matched comparison of antidepressants and bone density. Biometrics. · Zbl 1520.62393 · doi:10.1111/biom.13374
[33] Yu, R., Silber, J. H. and Rosenbaum, P. R. (2020). Matching methods for observational studies derived from large administrative databases. Statist. Sci. 35 338-355. · Zbl 07292518 · doi:10.1214/19-STS699
[34] Yu, R., Small, D. S. and Rosenbaum, P. R. (2021). Supplement to “The information in covariate imbalance in studies of hormone replacement therapy.” https://doi.org/10 · Zbl 1498.62272