Estimating linear functionals in nonlinear regression with responses missing at random. (English) Zbl 1173.62052

Summary: We consider regression models with parametric (linear or nonlinear) regression functions and allow responses to be “missing at random”. We assume that the errors have mean zero and are independent of the covariates. In order to estimate expectations of functions of covariates and responses we use a fully imputed estimator, namely an empirical estimator based on estimators of conditional expectations given the covariate. We exploit the independence of covariates and errors by writing the conditional expectations as unconditional expectations, which can now be estimated by empirical plug-in estimators. The mean zero constraint on the error distribution is exploited by adding suitable residual-based weights. We prove that the estimator is efficient (in the sense of Hájek and Le Cam) if an efficient estimator of the parameter is used. Our results give rise to new efficient estimators of smooth transformations of expectations. Estimation of the mean response is discussed as a special (degenerate) case.
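
The construction described in the summary can be illustrated with a small numerical sketch. It is not the paper's implementation, and everything named in it is an assumption made for illustration: the regression function r(x, theta) = theta * x, the complete-case least squares estimator of theta (a simple stand-in, not an efficient estimator), the simulated data, and the helper names el_weights and fully_imputed. The residual-based weights are assumed to take the empirical likelihood form w_j = 1 / (m (1 + lambda e_j)), with lambda chosen so that the weighted residuals have mean zero.

```python
import numpy as np

def el_weights(resid, iters=200):
    """Weights w_j = 1 / (m * (1 + lam * e_j)) with sum_j w_j * e_j = 0 (assumed form)."""
    emax, emin = resid.max(), resid.min()
    if not (emin < 0.0 < emax):
        raise ValueError("residuals must take both signs")
    # lam must keep every 1 + lam * e_j positive; the constraint is monotone there.
    lo, hi = -1.0 / emax, -1.0 / emin
    pad = 1e-10 * (hi - lo)
    lo, hi = lo + pad, hi - pad
    for _ in range(iters):  # plain bisection on the decreasing constraint function
        mid = 0.5 * (lo + hi)
        if np.sum(resid / (1.0 + mid * resid)) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 1.0 / (resid.size * (1.0 + lam * resid))

def fully_imputed(h, x, y, delta, r, theta):
    """Estimate E[h(X, Y)] by averaging chi_hat(X_i) over ALL covariates X_i,
    where chi_hat(x) = sum_j w_j * h(x, r(x, theta) + e_j) and the e_j are the
    residuals of the observed responses."""
    obs = delta == 1
    resid = y[obs] - r(x[obs], theta)
    w = el_weights(resid)
    fitted = r(x, theta)
    vals = h(x[:, None], fitted[:, None] + resid[None, :])
    vals = np.broadcast_to(vals, (x.size, resid.size))  # n x m matrix
    return float(np.mean(vals @ w)), w, resid

# Hypothetical simulated data: responses missing at random given the covariate.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.5, 2.0, n)
y_full = 1.5 * x + rng.normal(0.0, 0.5, n)           # errors independent of x
delta = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-x))).astype(int)
y = np.where(delta == 1, y_full, np.nan)             # unobserved responses

r = lambda x, theta: theta * x                       # assumed regression function
obs = delta == 1
theta_hat = np.sum(x[obs] * y[obs]) / np.sum(x[obs] ** 2)  # complete-case least squares

est_mean, w, resid = fully_imputed(lambda x, y: y, x, y, delta, r, theta_hat)
est_xy, _, _ = fully_imputed(lambda x, y: x * y, x, y, delta, r, theta_hat)
print(est_mean, est_xy)
```

With h(x, y) = y the weighted residual term vanishes by construction, so the estimate reduces to the average of the fitted values r(X_i, theta_hat) over all covariates. This is one way to see why the mean response behaves as a degenerate special case.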


62J02 General nonlinear regression
62G20 Asymptotic properties of nonparametric inference
62G08 Nonparametric regression and quantile regression
65C60 Computational problems in statistics (MSC2010)
62N01 Censored data models
62F12 Asymptotic properties of parametric estimators



