×

Improving the convergence properties of the data augmentation algorithm with an application to Bayesian mixture modeling. (English) Zbl 1246.60095

Summary: The reversible Markov chains that drive the data augmentation (DA) and sandwich algorithms define self-adjoint operators whose spectra encode the convergence properties of the algorithms. When the target distribution has uncountable support, as is nearly always the case in practice, it is generally quite difficult to get a handle on these spectra. We show that, if the augmentation space is finite, then (under regularity conditions) the operators defined by the DA and sandwich chains are compact, and the spectra are finite subsets of \([0, 1)\). Moreover, we prove that the spectrum of the sandwich operator dominates the spectrum of the DA operator in the sense that the ordered elements of the former are all less than or equal to the corresponding elements of the latter. As a concrete example, we study a widely used DA algorithm for the exploration of posterior densities associated with Bayesian mixture models [J. Diebolt and C. P. Robert [J. R. Stat. Soc., Ser. B, Stat. Methodol. 56, No. 2, 363–375 (1994; Zbl 0796.62028)]. In particular, we compare this mixture DA algorithm with an alternative algorithm proposed by S. Frühwirth-Schnatter [J. Am. Stat. Assoc. 96, No. 453, 194–209 (2001; Zbl 1015.62022)] that is based on random label switching.

MSC:

60J22 Computational methods in Markov chains
62F15 Bayesian inference
65C05 Monte Carlo methods

References:

[1] Asmussen, S. and Glynn, P. (2011). A new proof of convergence of MCMC via the ergodic theorem. Statist. Probab. Lett. 81 1482-1485. · Zbl 1232.65015 · doi:10.1016/j.spl.2011.05.004
[2] Buja, A. (1990). Remarks on functional canonical variates, alternating least squares methods and ACE. Ann. Statist. 18 1032-1069. · Zbl 0721.62068 · doi:10.1214/aos/1176347739
[3] Celeux, G., Hurn, M. and Robert, C. P. (2000). Computational and inferential difficulties with mixture posterior distributions. J. Amer. Statist. Assoc. 95 957-970. · Zbl 0999.62020 · doi:10.2307/2669477
[4] Diaconis, P., Khare, K. and Saloff-Coste, L. (2008). Gibbs sampling, exponential families and orthogonal polynomials. Statist. Sci. 23 151-178. With comments and a rejoinder by the authors. · Zbl 1327.62058 · doi:10.1214/07-STS252
[5] Diebolt, J. and Robert, C. P. (1994). Estimation of finite mixture distributions through Bayesian sampling. J. Roy. Statist. Soc. Ser. B 56 363-375. · Zbl 0796.62028
[6] Flegal, J. M., Haran, M. and Jones, G. L. (2008). Markov chain Monte Carlo: Can we trust the third significant figure? Statist. Sci. 23 250-260. · Zbl 1327.62017 · doi:10.1214/08-STS257
[7] Frühwirth-Schnatter, S. (2001). Markov chain Monte Carlo estimation of classical and dynamic switching and mixture models. J. Amer. Statist. Assoc. 96 194-209. · Zbl 1015.62022 · doi:10.1198/016214501750333063
[8] Hobert, J. P. (2011). The data augmentation algorithm: Theory and methodology. In Handbook of Markov Chain Monte Carlo (S. Brooks, A. Gelman, G. Jones and X.-L. Meng, eds.). Chapman & Hall/CRC Press, Boca Raton, FL. · Zbl 1229.65017
[9] Hobert, J. P. and Marchev, D. (2008). A theoretical comparison of the data augmentation, marginal augmentation and PX-DA algorithms. Ann. Statist. 36 532-554. · Zbl 1155.60031 · doi:10.1214/009053607000000569
[10] Hobert, J. P. and Román, J. C. (2011). Discussion of “To center or not to center: That is not the question-An ancillarity-sufficiency interweaving strategy (ASIS) for boosting MCMC efficiency,” by Y. Yu and X.-L. Meng. J. Comput. Graph. Statist. 20 571-580.
[11] Jasra, A., Holmes, C. C. and Stephens, D. A. (2005). Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Statist. Sci. 20 50-67. · Zbl 1100.62032 · doi:10.1214/088342305000000016
[12] Jones, G. L. and Hobert, J. P. (2001). Honest exploration of intractable probability distributions via Markov chain Monte Carlo. Statist. Sci. 16 312-334. · Zbl 1127.60309 · doi:10.1214/ss/1015346317
[13] Jones, G. L., Haran, M., Caffo, B. S. and Neath, R. (2006). Fixed-width output analysis for Markov chain Monte Carlo. J. Amer. Statist. Assoc. 101 1537-1547. · Zbl 1171.62316 · doi:10.1198/016214506000000492
[14] Lee, K., Marin, J. M., Mengersen, K. L. and Robert, C. (2008). Bayesian inference on mixtures of distributions. In Platinum Jubilee of the Indian Statistical Institute (N. N. Sastry, ed.). Indian Statistical Institute, Bangalore.
[15] Liu, J. S. and Sabatti, C. (2000). Generalised Gibbs sampler and multigrid Monte Carlo for Bayesian computation. Biometrika 87 353-369. · Zbl 0960.65015 · doi:10.1093/biomet/87.2.353
[16] Liu, J. S., Wong, W. H. and Kong, A. (1994). Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika 81 27-40. · Zbl 0811.62080 · doi:10.1093/biomet/81.1.27
[17] Liu, J. S., Wong, W. H. and Kong, A. (1995). Covariance structure and convergence rate of the Gibbs sampler with various scans. J. Roy. Statist. Soc. Ser. B 57 157-169. · Zbl 0811.60056
[18] Liu, J. S. and Wu, Y. N. (1999). Parameter expansion for data augmentation. J. Amer. Statist. Assoc. 94 1264-1274. · Zbl 1069.62514 · doi:10.2307/2669940
[19] Meng, X.-L. and van Dyk, D. A. (1999). Seeking efficient data augmentation schemes via conditional and marginal augmentation. Biometrika 86 301-320. · Zbl 1054.62505 · doi:10.1093/biomet/86.2.301
[20] Meyn, S. P. and Tweedie, R. L. (1993). Markov Chains and Stochastic Stability . Springer, London. · Zbl 0925.60001
[21] Mira, A. and Geyer, C. J. (1999). Ordering Monte Carlo Markov chains. Technical Report 632, School of Statistics, Univ. Minnesota.
[22] Papaspiliopoulos, O., Roberts, G. O. and Sköld, M. (2007). A general framework for the parametrization of hierarchical models. Statist. Sci. 22 59-73. · Zbl 1246.62195 · doi:10.1214/088342307000000014
[23] Retherford, J. R. (1993). Hilbert Space: Compact Operators and the Trace Theorem. London Mathematical Society Student Texts 27 . Cambridge Univ. Press, Cambridge. · Zbl 0783.47031
[24] Robert, C. P. and Casella, G. (2004). Monte Carlo Statistical Methods , 2nd ed. Springer, New York. · Zbl 1096.62003
[25] Roberts, G. O. and Rosenthal, J. S. (1997). Geometric ergodicity and hybrid Markov chains. Electron. Comm. Probab. 2 13-25 (electronic). · Zbl 0890.60061
[26] Roberts, G. O. and Rosenthal, J. S. (2001). Markov chains and de-initializing processes. Scand. J. Stat. 28 489-504. · Zbl 0985.60067 · doi:10.1111/1467-9469.00250
[27] Rosenthal, J. S. (2003). Asymptotic variance and convergence rates of nearly-periodic Markov chain Monte Carlo algorithms. J. Amer. Statist. Assoc. 98 169-177. · Zbl 1048.60057 · doi:10.1198/016214503388619193
[28] Roy, V. and Hobert, J. P. (2007). Convergence rates and asymptotic standard errors for Markov chain Monte Carlo algorithms for Bayesian probit regression. J. R. Stat. Soc. Ser. B Stat. Methodol. 69 607-623. · doi:10.1111/j.1467-9868.2007.00602.x
[29] Rudin, W. (1991). Functional Analysis , 2nd ed. McGraw-Hill, New York. · Zbl 0867.46001
[30] Tanner, M. A. and Wong, W. H. (1987). The calculation of posterior distributions by data augmentation (with discussion). J. Amer. Statist. Assoc. 82 528-550. · Zbl 0619.62029 · doi:10.2307/2289457
[31] Tierney, L. (1994). Markov chains for exploring posterior distributions (with discussion). Ann. Statist. 22 1701-1762. · Zbl 0829.62080 · doi:10.1214/aos/1176325750
[32] van Dyk, D. A. and Meng, X.-L. (2001). The art of data augmentation (with discussion). J. Comput. Graph. Statist. 10 1-50. · doi:10.1198/10618600152418584
[33] Voss, H. (2003). Variational characterizations of eigenvalues of nonlinear eigenproblems. In Proceedings of the International Conference on Mathematical and Computer Modelling in Science and Engineering (M. Kocandrlova and V. Kelar, eds.) 379-383. Czech Technical Univ., Prague.
[34] Yu, Y. and Meng, X. L. (2011). To center or not to center: That is not the question-An ancillarity-sufficiency interweaving strategy (ASIS) for boosting MCMC efficiency (with discussion). J. Comput. Graph. Statist. 20 531-570.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.