On the identifiability and estimation of generalized linear models with parametric nonignorable missing data mechanism. (English) Zbl 1466.62050

Summary: We address the problem of identifying and estimating generalized linear models when the response variable is nonignorably missing. Three types of monotone missing data mechanisms are assumed: the logit model, the probit model, and the complementary log-log model. In this situation, the likelihood based on the observed data may not be identifiable. In this article, we prove that the model parameters are identifiable under very mild conditions and then construct estimators using a likelihood-based approach. The proposed estimators are shown to be consistent and asymptotically normal. Simulation studies demonstrate that the proposed inference procedure performs well in many settings. We apply the proposed method to a data set from the Chinese Household Income Project study.
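
The setting in the summary can be illustrated with a minimal sketch. It is not the authors' estimation procedure: it only simulates a response that is missing with a probability depending on the response itself, under each of the three mechanisms named above. The Gaussian outcome with identity link, the function names, and the parameters beta0, beta1, alpha0, alpha1 are all assumptions made for illustration.

```python
import math
import random

# Response (observation) probabilities for the three mechanisms named in the
# summary. Each is a monotone increasing function of its argument eta.
def logit_prob(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def probit_prob(eta):
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def cloglog_prob(eta):
    return 1.0 - math.exp(-math.exp(eta))

MECHANISMS = {"logit": logit_prob, "probit": probit_prob, "cloglog": cloglog_prob}

def simulate(n, beta0, beta1, alpha0, alpha1, mechanism="logit", seed=0):
    """Draw (x, y, r) triples; r = 1 means y is observed.

    Assumed outcome model (hypothetical): y = beta0 + beta1 * x + N(0, 1).
    Assumed missingness model: P(r = 1 | y) = pi(alpha0 + alpha1 * y),
    so missingness depends on y itself (nonignorable) when alpha1 != 0.
    """
    rng = random.Random(seed)
    pi = MECHANISMS[mechanism]
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = beta0 + beta1 * x + rng.gauss(0.0, 1.0)
        r = 1 if rng.random() < pi(alpha0 + alpha1 * y) else 0
        data.append((x, y, r))
    return data

def mean(values):
    return sum(values) / len(values)

if __name__ == "__main__":
    for name in MECHANISMS:
        data = simulate(1000, 1.0, 1.0, 0.0, 1.0, mechanism=name)
        full = mean([y for _, y, _ in data])
        observed = mean([y for _, y, r in data if r == 1])
        print(name, "full-data mean:", round(full, 3),
              "observed-only mean:", round(observed, 3))
```

With alpha1 > 0, larger responses are more likely to be observed, so the mean of the observed responses overstates the full-data mean. This is the kind of bias that motivates modelling the missing data mechanism explicitly.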


62-08 Computational methods for problems pertaining to statistics
62J12 Generalized linear models (logistic models)

