## Estimation of regression coefficients when some regressors are not always observed. (English) Zbl 0815.62043

Summary: In applied problems it is common to specify a model for the conditional mean of a response given a set of regressors. A subset of the regressors may be missing for some study subjects either by design or happenstance. We propose a new class of semiparametric estimators, based on inverse probability weighted estimating equations, that are consistent for the parameter vector $$\alpha_0$$ of the conditional mean model when the data are missing at random in the sense of D. B. Rubin [Biometrika 63, 581-592 (1976; Zbl 0344.62034)] and the missingness probabilities are either known or can be parametrically modeled.
We show that the asymptotic variance of the optimal estimator in our class attains the semiparametric variance bound for the model. To do so, we first show that our estimation problem is a special case of the general problem of parameter estimation in an arbitrary semiparametric model in which the data are missing at random and the probability of observing complete data is bounded away from 0. We then derive a representation for the efficient score, the semiparametric variance bound, and the influence function of any regular, asymptotically linear estimator in this more general estimation problem. Because the optimal estimator depends on the unknown probability law generating the data, we propose locally and globally adaptive semiparametric efficient estimators.
We compare estimators in our class with previously proposed estimators. We show that each previous estimator is asymptotically equivalent to some, usually inefficient, estimator in our class. This equivalence is a consequence of a proposition stating that every regular, asymptotically linear estimator of $$\alpha_0$$ is asymptotically equivalent to some estimator in our class. We compare various estimators in a small simulation study and offer some practical recommendations.
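
To make the idea of an inverse probability weighted estimating equation concrete, the following is a minimal sketch and not the optimal or adaptive estimator of the paper. It assumes a linear conditional mean model, missingness probabilities that are known (as when regressors are missing by design) and depend only on always-observed variables, and the simplest choice of weight function. All names (`ipw_linear_fit`, `pi`, `observed`, the simulated data) are hypothetical and chosen for illustration only.

```python
import numpy as np


def ipw_linear_fit(y, design, observed, pi):
    """Solve sum_i (R_i / pi_i) * d_i * (y_i - d_i' alpha) = 0 for alpha.

    y        : responses, always observed
    design   : regressor matrix; rows with observed == 0 may contain NaN
    observed : indicator R_i (1 if all regressors of subject i are observed)
    pi       : probability that subject i is a complete case (assumed known)
    """
    w = observed / pi
    # Incomplete rows get weight 0; zero them out so NaN does not propagate.
    d = np.where(observed[:, None] == 1, design, 0.0)
    lhs = d.T @ (w[:, None] * d)
    rhs = d.T @ (w * y)
    return np.linalg.solve(lhs, rhs)


# Simulated example (illustrative assumption, not data from the paper).
rng = np.random.default_rng(0)
n = 50_000
v = rng.normal(size=n)                 # regressor that is always observed
x = 0.5 * v + rng.normal(size=n)       # regressor that is sometimes missing
alpha_true = np.array([1.0, 2.0, -1.0])
y = alpha_true[0] + alpha_true[1] * x + alpha_true[2] * v + rng.normal(size=n)

# Missing at random: the probability depends only on the always-observed y,
# and is bounded away from 0 (it lies between 0.2 and 0.9).
pi = 0.2 + 0.7 / (1.0 + np.exp(-(y - 1.0)))
observed = (rng.uniform(size=n) < pi).astype(float)

# The analyst only sees x for complete cases.
x_seen = np.where(observed == 1, x, np.nan)
design = np.column_stack([np.ones(n), x_seen, v])

alpha_ipw = ipw_linear_fit(y, design, observed, pi)
# Unweighted complete-case fit for comparison (all weights equal to 1).
alpha_cc = ipw_linear_fit(y, design, observed, np.ones(n))

print("true         :", alpha_true)
print("IPW          :", alpha_ipw)
print("complete-case:", alpha_cc)
```

Because the missingness here depends on the response, the unweighted complete-case fit is in general not consistent for the coefficients, whereas weighting each complete case by the inverse of its observation probability restores an unbiased estimating equation. The efficient estimators discussed in the summary go further than this simple weighting.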

### MSC:

- 62J05 Linear regression; mixed models
- 62J02 General nonlinear regression
- 62G07 Density estimation

Zbl 0344.62034