Bounds on the conditional and average treatment effect with unobserved confounding factors. (English) Zbl 07628833

Summary: For observational studies, we study the sensitivity of causal inference when treatment assignments may depend on unobserved confounders. We develop a loss minimization approach for estimating bounds on the conditional average treatment effect (CATE) when unobserved confounders have a bounded effect on the odds ratio of treatment selection. Our approach is scalable and allows flexible use of model classes in estimation, including nonparametric and black-box machine learning methods. Based on these bounds for the CATE, we propose a sensitivity analysis for the average treatment effect (ATE). Our semiparametric estimator extends the augmented inverse propensity weighted (AIPW) estimator of the ATE to bounds under bounded unobserved confounding. Because it is built on a Neyman orthogonal score, our estimator of the bound for the ATE is a regular root-\(n\) estimator so long as the nuisance parameters are estimated at the \(o_p(n^{-1/4})\) rate. We complement our methodology with optimality results showing that our proposed bounds are tight in certain cases. We demonstrate our method on simulated and real data examples, and show accurate coverage of our confidence intervals in practical finite-sample regimes with rich covariate information.
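To make the phrase "loss minimization approach" concrete, the following is a minimal sketch of one loss of this kind, not the authors' implementation. It assumes a sensitivity parameter `gamma >= 1` bounding the odds ratio of treatment selection, ignores covariates entirely (a single scalar bound instead of a function of \(X\)), and uses an asymmetric least squares loss \(\tfrac12[(y-\theta)_+^2 + \gamma\,(y-\theta)_-^2]\); the function name, the choice of loss, and the use of `1/gamma` for the upper bound are all illustrative assumptions.

```python
import numpy as np


def asymmetric_ls_bound(y, gamma, tol=1e-10, max_iter=1000):
    """Minimize mean(0.5 * [(y - t)_+^2 + gamma * (y - t)_-^2]) over a scalar t.

    Hypothetical helper for illustration only. With gamma = 1 the minimizer
    is the sample mean; with gamma > 1 residuals below t are penalized more
    heavily, which pulls the minimizer below the mean.
    """
    y = np.asarray(y, dtype=float)
    theta = y.mean()
    for _ in range(max_iter):
        # Weight 1 on observations above theta, gamma on those at or below it.
        w = np.where(y > theta, 1.0, gamma)
        # Newton step for the piecewise-quadratic objective: a weighted mean.
        new_theta = np.sum(w * y) / np.sum(w)
        converged = abs(new_theta - theta) < tol
        theta = new_theta
        if converged:
            break
    return theta


def sensitivity_interval(y, gamma):
    """Return (lower, upper) from the asymmetric loss with gamma and 1/gamma."""
    return asymmetric_ls_bound(y, gamma), asymmetric_ls_bound(y, 1.0 / gamma)
```

Calling `sensitivity_interval(y_treated, 2.0)` on a hypothetical array of observed treated outcomes returns an interval that contains the sample mean and widens as `gamma` grows. In a covariate-dependent version, the scalar `t` would be replaced by a fitted function from whatever model class is preferred, which is where the flexibility mentioned in the summary would enter.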


MSC:
62F03 Parametric hypothesis testing
62F30 Parametric inference under constraints
62H12 Estimation in multivariate analysis
62H15 Hypothesis testing in multivariate analysis


Software: grf; XGBoost

