Chen, Yen-Chi
Pattern graphs: a graphical approach to nonmonotone missing data. (English) Zbl 1486.62177
Ann. Stat. 50, No. 1, 129-146 (2022).

Summary: We introduce the concept of pattern graphs – directed acyclic graphs representing how response patterns are associated. A pattern graph represents an identifying restriction that is nonparametrically identified/saturated and is often a missing not at random restriction. We introduce selection model and pattern mixture model formulations using the pattern graphs and show that they are equivalent. A pattern graph leads to an inverse probability weighting estimator as well as an imputation-based estimator. We also study the semiparametric efficiency theory and derive a multiply-robust estimator using pattern graphs.

Cited in 1 Document

MSC:
62H22 Probabilistic graphical models
62D10 Missing data

Keywords: missing data; nonignorable missingness; nonmonotone missing; inverse probability weighting; pattern graphs; selection models

References:
[1] Bhattacharya, R., Malinsky, D. and Shpitser, I. (2020). Causal inference under interference and network uncertainty. In Uncertainty in Artificial Intelligence 1028-1038. PMLR.
[2] Chen, Y.-C. (2022). Supplement to “Pattern graphs: A graphical approach to nonmonotone missing data.” https://doi.org/10.1214/21-AOS2094SUPP
[3] Chen, Y.-C. and Sadinle, M. (2019). Nonparametric pattern-mixture models for inference with missing data. arXiv preprint. Available at arXiv:1904.11085.
[4] Daniels, M. J. and Hogan, J. W. (2008). Missing Data in Longitudinal Studies: Strategies for Bayesian modeling and sensitivity analysis. Monographs on Statistics and Applied Probability 109. CRC Press/CRC, Boca Raton, FL. · Zbl 1165.62023 · doi:10.1201/9781420011180
[5] Diggle, P. J., Heagerty, P. J., Liang, K.-Y. and Zeger, S. L. (2002). Analysis of Longitudinal Data, 2nd ed. Oxford Statistical Science Series 25. Oxford Univ. Press, Oxford.
[6] Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Ann. Statist. 7 1-26. · Zbl 0406.62024
[7] Efron, B. and Tibshirani, R. J. (1994). An Introduction to the Bootstrap. CRC Press, Boca Raton.
[8] Friedman, J., Hastie, T. and Tibshirani, R. (2001). The Elements of Statistical Learning 1. Springer series in statistics, New York. · Zbl 0973.62007
[9] Gill, R. D., van der Laan, M. J. and Robins, J. M. (1997). Coarsening at random: Characterizations, conjectures, counter-examples. In Proceedings of the First Seattle Symposium in Biostatistics: Survival Analysis 255-294. · Zbl 0918.62003
[10] Hall, P. (2013). The Bootstrap and Edgeworth Expansion. Springer, Berlin.
[11] Hoeting, J. A., Madigan, D., Raftery, A. E. and Volinsky, C. T. (1999). Bayesian model averaging: A tutorial. Statist. Sci. 14 382-417. · Zbl 1059.62525 · doi:10.1214/ss/1009212519
[12] Hoonhout, P. and Ridder, G. (2019). Nonignorable attrition in multi-period panels with refreshment samples. J. Bus. Econom. Statist. 37 377-390. · Zbl 1548.62573 · doi:10.1080/07350015.2017.1345744
[13] Horowitz, J. L. and Manski, C. F. (2000). Nonparametric analysis of randomized experiments with missing covariate and outcome data. J. Amer. Statist. Assoc. 95 77-88. · Zbl 0996.62054 · doi:10.2307/2669526
[14] Linero, A. R. (2017). Bayesian nonparametric analysis of longitudinal studies in the presence of informative missingness. Biometrika 104 327-341. · Zbl 1506.62443 · doi:10.1093/biomet/asx015
[15] Little, R. J. (1993a). Pattern-mixture models for multivariate incomplete data. J. Amer. Statist. Assoc. 88 125-134. · Zbl 0775.62134
[16] Little, R. J. A. (1993b). Pattern-mixture models for multivariate incomplete data. J. Amer. Statist. Assoc. 88 125-134. · Zbl 0775.62134
[17] Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data, 2nd ed. Wiley Series in Probability and Statistics. Wiley Interscience, Hoboken, NJ. · Zbl 1011.62004 · doi:10.1002/9781119013563
[18] Malinsky, D., Shpitser, I. and Tchetgen, E. J. T. (2019). Semiparametric inference for non-monotone missing-not-at-random data: The no self-censoring model. arXiv preprint. Available at arXiv:1909.01848.
[19] Manski, C. F. (1990). Nonparametric bounds on treatment effects. Am. Econ. Rev. 80 319-323.
[20] Mohan, K. and Pearl, J. (2014). Graphical models for recovering probabilistic and causal queries from missing data. In Advances in Neural Information Processing Systems 1520-1528.
[21] Mohan, K. and Pearl, J. (2021). Graphical models for processing missing data. J. Amer. Statist. Assoc. 116 1023-1037. · Zbl 1464.62314 · doi:10.1080/01621459.2021.1874961
[22] Mohan, K., Pearl, J. and Tian, J. (2013). Graphical models for inference with missing data. In Advances in Neural Information Processing Systems 1277-1285.
[23] Molenberghs, G., Michiels, B., Kenward, M. G. and Diggle, P. J. (1998). Monotone missing data and pattern-mixture models. Stat. Neerl. 52 153-161. · Zbl 0946.62034 · doi:10.1111/1467-9574.00075
[24] Molenberghs, G., Fitzmaurice, G., Kenward, M. G., Tsiatis, A. and Verbeke, G. (2014). Handbook of Missing Data Methodology. CRC Press/CRC, Boca Raton. · Zbl 1369.62007
[25] Nabi, R., Bhattacharya, R. and Shpitser, I. (2020). Full law identification in graphical models of missing data: Completeness results. arXiv preprint. Available at arXiv:2004.04872.
[26] Robins, J. M. (1997). Non-response models for the analysis of non-monotone non-ignorable missing data. Stat. Med. 16 21-37.
[27] Robins, J. M. and Gill, R. D. (1997). Non-response models for the analysis of non-monotone ignorable missing data. Stat. Med. 16 39-56.
[28] Robins, J. M., Rotnitzky, A. and Scharfstein, D. O. (2000). Sensitivity analysis for selection bias and unmeasured confounding in missing data and causal inference models. In Statistical Models in Epidemiology, the Environment, and Clinical Trials (Minneapolis, MN, 1997). IMA Vol. Math. Appl. 116 1-94. Springer, New York. · Zbl 0998.62091 · doi:10.1007/978-1-4612-1284-3_1
[29] Rubin, D. B. (2004). Multiple Imputation for Nonresponse in Surveys. Wiley Classics Library. Wiley Interscience, Hoboken, NJ. · Zbl 1070.62007
[30] Sadinle, M. and Reiter, J. P. (2017). Itemwise conditionally independent nonresponse modelling for incomplete multivariate data. Biometrika 104 207-220. · Zbl 1506.62241 · doi:10.1093/biomet/asw063
[31] Seaman, S. R. and Vansteelandt, S. (2018). Introduction to double robust methods for incomplete data. Statist. Sci. 33 184-197. · Zbl 1397.62176 · doi:10.1214/18-STS647
[32] Shpitser, I. (2016). Consistent estimation of functions of data missing non-monotonically and not at random. In Advances in Neural Information Processing Systems 3144-3152.
[33] Shpitser, I., Mohan, K. and Pearl, J. (2015). Missing data as a causal and probabilistic problem. Technical report, California Univ. Los Angeles Dept. of Computer Science.
[34] Sun, B. and Tchetgen Tchetgen, E. J. (2018). On inverse probability weighting for nonmonotone missing at random data. J. Amer. Statist. Assoc. 113 369-379. · Zbl 1398.62264 · doi:10.1080/01621459.2016.1256814
[35] Tchetgen Tchetgen, E. J., Wang, L. and Sun, B. (2018). Discrete choice models for nonmonotone nonignorable missing data: Identification and inference. Statist. Sinica 28 2069-2088. · Zbl 1406.62050
[36] Thijs, H., Molenberghs, G., Michiels, B., Verbeke, G. and Curran (2002). Strategies to fit pattern-mixture models. Biostatistics 3 245-265. · Zbl 1133.62371
[37] Tian, J. (2015). Missing at random in graphical models. In Artificial Intelligence and Statistics 977-985.
[38] Tsiatis, A. (2007). Semiparametric Theory and Missing Data. Springer, Berlin.
[39] Vansteelandt, S., Goetghebeur, E., Kenward, M. G. and Molenberghs, G. (2006). Ignorance and uncertainty regions as inferential tools in a sensitivity analysis. Statist. Sinica 16 953-979. · Zbl 1108.62005

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented/enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible without claiming completeness or a perfect matching.