Fully conditional specification in multivariate imputation. (English) Zbl 1144.62332

Summary: The use of the Gibbs sampler with fully conditionally specified models, where the distribution of each variable given the other variables is the starting point, has become a popular method to create imputations in incomplete multivariate data. The theoretical weakness of this approach is that the specified conditional densities can be incompatible, and therefore the stationary distribution to which the Gibbs sampler attempts to converge may not exist.
This study investigates practical consequences of this problem by means of simulations. Missing data are created under four different missing data mechanisms. Attention is given to the statistical behavior under compatible and incompatible models. The results indicate that multiple imputation produces essentially unbiased estimates with appropriate coverage in the simple cases investigated, even for the incompatible models. Of particular interest is that these results were produced using only five Gibbs iterations starting from a simple draw from observed marginal distributions. It thus appears that, despite the theoretical weaknesses, the actual performance of conditional model specification for multivariate imputation can be quite good, and therefore deserves further study.
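The procedure described in the summary can be illustrated with a minimal sketch. It is not the authors' simulation code and not the API of MICE, IVEware, or Stata. The function name `fcs_impute`, the normal linear regression used for every conditional model, and the use of point estimates rather than posterior draws of the regression parameters are all simplifying assumptions made for this illustration. The sketch fills each missing value with a random draw from the observed values of its variable, then runs a small number of Gibbs-style passes in which each variable is re-imputed from its regression on all the others.

```python
import numpy as np


def fcs_impute(data, n_iter=5, rng=None):
    """Return one imputed copy of a 2-D array in which NaN marks missing values.

    Hypothetical helper for illustration only. Assumes every column has at
    least a few observed values and that a linear model is acceptable for
    each conditional distribution.
    """
    rng = np.random.default_rng(rng)
    x = np.array(data, dtype=float)
    missing = np.isnan(x)
    n, p = x.shape

    # Starting values: random draws from the observed marginal of each column.
    for j in range(p):
        observed = x[~missing[:, j], j]
        x[missing[:, j], j] = rng.choice(observed, size=missing[:, j].sum())

    # Gibbs-style passes over the conditionally specified models.
    for _ in range(n_iter):
        for j in range(p):
            mis = missing[:, j]
            if not mis.any():
                continue
            # Regress column j on all other columns, using rows where j is observed.
            design = np.column_stack([np.ones(n), np.delete(x, j, axis=1)])
            beta, *_ = np.linalg.lstsq(design[~mis], x[~mis, j], rcond=None)
            resid = x[~mis, j] - design[~mis] @ beta
            dof = max(int((~mis).sum()) - design.shape[1], 1)
            sigma = np.sqrt(resid @ resid / dof)
            # Simplification: parameters are fixed at their estimates; a proper
            # imputation would also draw beta and sigma from their posterior.
            x[mis, j] = design[mis] @ beta + rng.normal(0.0, sigma, size=mis.sum())
    return x
```

Calling `fcs_impute` several times with different seeds would give several completed data sets, which could then be analysed separately and pooled in the usual multiple imputation fashion. Whether the separately specified conditionals correspond to any joint distribution is exactly the compatibility question the summary raises; the sketch makes no attempt to check it.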

MSC:

62H99 Multivariate analysis
65C60 Computational problems in statistics (MSC2010)

Software:

MICE; IVEware; Stata
