Likelihood based frequentist inference when data are missing at random. (English) Zbl 1099.62503

Summary: One of the most frequently quoted results from the original work of Rubin and Little on the classification of missing value processes is the validity of likelihood-based inferences under missing at random (MAR) mechanisms. Although the sense in which this result holds was precisely defined by Rubin, and explored by him in later work, it now appears to be used by some authors in a general and rather imprecise way, particularly with respect to frequentist modes of inference. In this paper an exposition is given of likelihood-based frequentist inference under an MAR mechanism that shows, in particular, which aspects of such inference cannot be separated from consideration of the missing value mechanism. The development is illustrated with three simple setups (a bivariate binary outcome, a bivariate Gaussian outcome, and a two-stage sequential procedure with Gaussian outcome) and with real longitudinal examples involving both categorical and continuous outcomes. In particular, it is shown that the classical expected information matrix is biased, and the use of the observed information matrix is recommended.
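
The bivariate Gaussian setup mentioned in the summary can be illustrated with a small simulation. The sketch below is not taken from the paper: the logistic dropout model, the parameter values, and the function names (simulate, complete_case_mean, ml_mean_y2) are assumptions chosen for illustration. It draws pairs (Y1, Y2), deletes Y2 with a probability that depends only on the always-observed Y1 (an MAR mechanism), and compares the complete-case mean of Y2 with the maximum likelihood estimate obtained from the factorisation f(y1) f(y2 | y1), which ignores the dropout model.

```python
import math
import random


def simulate(n, rho=0.6, slope=-1.5, seed=0):
    """Draw n pairs from a standard bivariate Gaussian with correlation rho.

    Y2 is deleted under an assumed MAR mechanism: the probability that Y2
    is observed is logistic in Y1, which is always observed.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        y2 = rho * y1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        p_observed = 1.0 / (1.0 + math.exp(-slope * y1))
        data.append((y1, y2 if rng.random() < p_observed else None))
    return data


def complete_case_mean(data):
    """Mean of Y2 over the units where Y2 is observed (ignores Y1)."""
    observed = [y2 for _, y2 in data if y2 is not None]
    return sum(observed) / len(observed)


def ml_mean_y2(data):
    """ML estimate of E[Y2] for monotone bivariate Gaussian data.

    Uses all Y1 values for the mean of Y1 and the completers for the
    regression of Y2 on Y1; the dropout model is not used at all.
    """
    mu1_all = sum(y1 for y1, _ in data) / len(data)
    completers = [(y1, y2) for y1, y2 in data if y2 is not None]
    m1 = sum(y1 for y1, _ in completers) / len(completers)
    m2 = sum(y2 for _, y2 in completers) / len(completers)
    sxy = sum((y1 - m1) * (y2 - m2) for y1, y2 in completers)
    sxx = sum((y1 - m1) ** 2 for y1, _ in completers)
    beta = sxy / sxx
    # Shift the completers' mean of Y2 to the full-sample mean of Y1.
    return m2 + beta * (mu1_all - m1)


if __name__ == "__main__":
    sample = simulate(20000, seed=1)
    print("complete-case mean of Y2:", complete_case_mean(sample))
    print("ML estimate of E[Y2]:    ", ml_mean_y2(sample))
```

With the true mean of Y2 equal to zero, the complete-case mean should be visibly biased while the ML estimate should lie close to zero. The sketch covers point estimation only; it does not address the choice between observed and expected information for standard errors, which is the summary's main concern.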


62A01 Foundations and philosophical topics in statistics


[1] Armitage, P. (1975). Sequential Medical Trials. Blackwell, Oxford.
[2] Baker, S. G. (1992). A simple method for computing the observed information matrix when using the EM algorithm with categorical data. J. Comput. Graph. Statist. 1 63-76.
[3] Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. Chapman and Hall, London. · Zbl 0334.62003
[4] Crépeau, H., Koziol, J., Reid, N. and Yuh, Y. S. (1985). Analysis of incomplete multivariate data from repeated measurements experiments. Biometrics 41 505-514. · Zbl 0614.62086
[5] Diggle, P. J. (1992). On informative and random dropouts in longitudinal studies. Letter to the Editor. Biometrics 48 947.
[6] Diggle, P. J. (1993). Estimation with missing data. Reply to a Letter to the Editor. Biometrics 49 580.
[7] Edwards, A. W. F. (1972). Likelihood. Cambridge Univ. Press. · Zbl 0231.62005
[8] Efron, B. and Hinkley, D. V. (1978). Assessing the accuracy of the maximum likelihood estimator: observed versus expected Fisher information. Biometrika 65 457-487. · Zbl 0401.62002
[9] Fitzmaurice, G. M., Laird, N. M. and Lipsitz, S. R. (1994). Analysing incomplete longitudinal binary responses: A likelihood-based approach. Biometrics 50 601-612. · Zbl 0825.62776
[10] Heitjan, D. F. (1993). Estimation with missing data. Letter to the Editor. Biometrics 49 580.
[11] Heitjan, D. F. (1994). Ignorability in general incomplete-data models. Biometrika 81 701-708. · Zbl 0810.62008
[12] Jennrich, R. I. and Schluchter, M. D. (1986). Unbalanced repeated-measures models with structured covariance matrices. Biometrics 42 805-820. · Zbl 0625.62052
[13] Kenward, M. G., Lesaffre, E. and Molenberghs, G. (1994). An application of maximum likelihood and estimating equations to the analysis of ordinal data from a longitudinal study with cases missing at random. Biometrics 50 945-953. · Zbl 0825.62797
[14] Laird, N. M. (1988). Missing data in longitudinal studies. Statistics in Medicine 7 305-315.
[15] Little, R. J. A. (1976). Inference about means for incomplete multivariate data. Biometrika 63 593-604. · Zbl 0344.62049
[16] Little, R. J. A. and Rubin, D. B. (1987). Statistical Analysis with Missing Data. Wiley, New York. · Zbl 0665.62004
[17] Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. J. Roy. Statist. Soc. Ser. B 44 226-233. · Zbl 0488.62018
[18] Meilijson, I. (1989). A fast improvement to the EM algorithm on its own terms. J. Roy. Statist. Soc. Ser. B 51 127-138. · Zbl 0674.65118
[19] Meng, X.-L. and Rubin, D. B. (1991). Using EM to obtain asymptotic variance-covariance matrices: the SEM algorithm. J. Amer. Statist. Assoc. 86 899-909.
[20] Molenberghs, G., Kenward, M. G. and Lesaffre, E. (1997). The analysis of longitudinal ordinal data with informative dropout. Biometrika 84 33-44. · Zbl 0883.62120
[21] Molenberghs, G. and Lesaffre, E. (1994). Marginal modeling of correlated ordinal data using an n-way Plackett distribution. J. Amer. Statist. Assoc. 89 633-644. · Zbl 0802.62063
[22] Murray, G. D. and Findlay, J. G. (1988). Correcting for the bias caused by drop-outs in hypertension trials. Statistics in Medicine 7 941-946.
[23] Patel, H. I. (1991). Analysis of incomplete data from clinical trials with repeated measurements. Biometrika 78 609-619. · Zbl 0737.62065
[24] Rubin, D. B. (1976). Inference and missing data. Biometrika 63 581-592. · Zbl 0344.62034
[25] Welsh, A. H. (1996). Aspects of Statistical Inference. Wiley, New York. · Zbl 0939.62003
[26] Woolson, R. F. and Clarke, W. R. (1984). Analysis of categorical incomplete longitudinal data. J. Roy. Statist. Soc. Ser. A 147 87-99.