Monte Carlo likelihood inference for missing data models. (English) Zbl 1124.62009

Summary: We describe a Monte Carlo method for approximating the maximum likelihood estimate (MLE) when there are missing data and the observed-data likelihood is not available in closed form. The method uses simulated missing data that are independent and identically distributed and independent of the observed data. Our Monte Carlo approximation to the MLE is a consistent and asymptotically normal estimate of the minimizer \(\theta^*\) of the Kullback-Leibler information as the Monte Carlo and observed-data sample sizes go to infinity simultaneously. Plug-in estimates of the asymptotic variance are provided for constructing confidence regions for \(\theta^*\). We give logit-normal generalized linear mixed model examples, calculated using an R package.
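
To make the idea in the summary concrete, the following is a minimal sketch and not the authors' implementation or the R package mentioned above. It assumes a random-intercept logistic model (a simple logit-normal mixed model) and ordinary importance sampling: each cluster's observed-data likelihood is approximated by averaging \(f_\theta(y, b_k)/h(b_k)\) over draws \(b_k\) from a fixed density \(h\) that does not depend on the observed data. All function names, the normal choice of \(h\), the parameter values, and the grid search are illustrative assumptions.

```python
import math
import random

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def log_normal_pdf(x, sd):
    """Log density of N(0, sd^2) evaluated at x."""
    return -0.5 * (x / sd) ** 2 - math.log(sd) - LOG_SQRT_2PI

def log_joint(k, m, b, beta, sigma):
    """Complete-data log density for one cluster (hypothetical model):
    a binary sequence with k successes in m trials, logit(p) = beta + b,
    and random intercept b ~ N(0, sigma^2)."""
    eta = beta + b
    return k * eta - m * softplus(eta) + log_normal_pdf(b, sigma)

def mc_loglik(clusters, sims, h_sd, beta, sigma):
    """Monte Carlo log likelihood: for each cluster, the log of the average
    importance weight f_theta(y, b) / h(b) over the simulated b values,
    where h is N(0, h_sd^2). The same simulations serve every cluster
    and every parameter value."""
    total = 0.0
    for k, m in clusters:
        logw = [log_joint(k, m, b, beta, sigma) - log_normal_pdf(b, h_sd)
                for b in sims]
        top = max(logw)  # log-sum-exp trick for numerical stability
        total += top + math.log(sum(math.exp(v - top) for v in logw) / len(logw))
    return total

def simulate_clusters(rng, n, m, beta, sigma):
    """Generate n clusters of m binary responses; keep (successes, trials)."""
    clusters = []
    for _ in range(n):
        p = 1.0 / (1.0 + math.exp(-(beta + rng.gauss(0.0, sigma))))
        k = sum(rng.random() < p for _ in range(m))
        clusters.append((k, m))
    return clusters

def demo():
    """Maximize the Monte Carlo log likelihood over a coarse (beta, sigma) grid."""
    clusters = simulate_clusters(random.Random(1), n=40, m=4, beta=0.5, sigma=1.0)
    h_sd = 2.0                       # assumed importance standard deviation
    sim_rng = random.Random(2)       # separate stream: independent of the data
    sims = [sim_rng.gauss(0.0, h_sd) for _ in range(300)]
    grid = [(b / 4.0, s / 4.0) for b in range(-2, 7) for s in range(2, 9)]
    return max(grid, key=lambda th: mc_loglik(clusters, sims, h_sd, *th))

if __name__ == "__main__":
    print(demo())  # grid point (beta, sigma) with the largest approximate log likelihood
```

The grid search only keeps the sketch dependency-free; in practice a numerical optimizer would be applied to `mc_loglik`. The importance density should have tails at least as heavy as the random-effect density over the parameter range of interest, otherwise the importance weights can have very large variance.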


62F12 Asymptotic properties of parametric estimators
65C05 Monte Carlo methods
62J12 Generalized linear models (logistic models)


bernor; R

