Zhao, Anqi; Lee, Youjin; Small, Dylan S.; Karmakar, Bikram
Evidence factors from multiple, possibly invalid, instrumental variables. (English) Zbl 1539.62335
Ann. Stat. 50, No. 3, 1266-1296 (2022).

Summary: Valid instrumental variables enable treatment effect inference even when selection into treatment is biased by unobserved confounders. When multiple candidate instruments are available, but some of them are possibly invalid, the previously proposed reinforced design enables one or more nearly independent valid analyses that depend on very different assumptions. That is, we can perform evidence factor analysis. However, the validity of the reinforced design depends crucially on the order in which multiple instrumental variable analyses are conducted. Motivated by the orthogonality of balanced factorial designs, we propose a balanced block design to offset the possible violation of the exclusion restriction by balancing the instruments against each other in the design, and demonstrate its utility for constructing approximate evidence factors under multiple analysis strategies free of the order imposition. We also propose a novel stratification method using multiple, nested candidate instruments, in which case the balanced block design is not applicable. We apply our proposed methods to evaluate (a) the effect of education on future earnings using instrumental variables arising from the disruption of education during World War II via the balanced block design, and (b) the causal effect of malaria on stunting among children in Western Kenya using three nested instruments.
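As an illustration of what "nearly independent" analyses make possible, the sketch below combines p-values from separate analyses into one overall p-value. It uses Fisher's combination method, which is an assumption made here for illustration; the summary does not say which combination rule the authors use. The function name `fisher_combine` and the input p-values are hypothetical. Since evidence factors are only approximately independent, the combined value should be read as approximate as well.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Under the joint null, -2 * sum(log p_i) follows a chi-square
    distribution with 2k degrees of freedom. For even degrees of
    freedom the upper tail has the closed form used below, so no
    external statistics library is needed.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # Half of Fisher's statistic: -sum(log p_i).
    half_stat = -sum(math.log(p) for p in p_values)
    # Chi-square(2k) survival function evaluated at 2 * half_stat.
    tail = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return min(1.0, math.exp(-half_stat) * tail)


# Hypothetical p-values, one per instrumental variable analysis.
combined = fisher_combine([0.04, 0.03])
print(combined)
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the closed form.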
MSC:
62P10 Applications of statistics to biology and medical sciences; meta analysis
62D20 Causal inference from observational studies
62G10 Nonparametric hypothesis testing

Keywords: bias in observational studies; causal inference; exclusion restriction; nonparametric tests; replication; sensitivity analysis

Software: sensitivitymw; sensitivitymv; evidenceFactors