×

Adaptive MCMC with online relabeling. (English) Zbl 1386.65011

Summary: When targeting a distribution that is artificially invariant under some permutations, Markov chain Monte Carlo (MCMC) algorithms face the label-switching problem, rendering marginal inference particularly cumbersome. Such a situation arises, for example, in the Bayesian analysis of finite mixture models. Adaptive MCMC algorithms such as adaptive Metropolis (AM), which self-calibrates its proposal distribution using an online estimate of the covariance matrix of the target, are no exception. To address the label-switching issue, relabeling algorithms associate a permutation to each MCMC sample, trying to obtain reasonable marginals. In the case of adaptive Metropolis [H. Haario et al., Bernoulli 7, No. 2, 223–242 (2001; Zbl 0989.65004)], an online relabeling strategy is required. This paper is devoted to the AMOR algorithm, a provably consistent variant of AM that can cope with the label-switching problem. The idea is to nest relabeling steps within the MCMC algorithm based on the estimation of a single covariance matrix that is used both for adapting the covariance of the proposal distribution in the Metropolis algorithm step and for online relabeling. We compare the behavior of AMOR to similar relabeling methods. In the case of compactly supported target distributions, we prove a strong law of large numbers for AMOR and its ergodicity. These are the first results on the consistency of an online relabeling algorithm to our knowledge. The proof underlines latent relations between relabeling and vector quantization.

MSC:

65C05 Monte Carlo methods
60J22 Computational methods in Markov chains

Citations:

Zbl 0989.65004

References:

[1] Andrieu, C., Moulines, É. and Priouret, P. (2005). Stability of stochastic approximation under verifiable conditions. SIAM J. Control Optim. 44 283-312. · Zbl 1083.62073 · doi:10.1137/S0363012902417267
[2] Andrieu, C. and Robert, C.P. (2011). Controlled Markov chain Monte Carlo methods for optimal sampling. Technical Report 125, Cahiers du Ceremade, Université Paris Dauphine.
[3] Andrieu, C. and Thoms, J. (2008). A tutorial on adaptive MCMC. Stat. Comput. 18 343-373. · doi:10.1007/s11222-008-9110-y
[4] Atchadé, Y., Fort, G., Moulines, E. and Priouret, P. (2011). Adaptive Markov chain Monte Carlo: Theory and methods. In Bayesian Time Series Models 32-51. Cambridge: Cambridge Univ. Press. · Zbl 1246.65003 · doi:10.1017/CBO9780511984679.003
[5] Bardenet, R., Cappé, O., Fort, G. and Kégl, B. (2012). Adaptive Metropolis with online relabeling. In International Conference on Artificial Intelligence and Statistics ( AISTATS ). JMLR Workshop and Conference Proceedings 22 91-99. Microtome Publishing.
[6] Bardenet, R., Cappé, O., Fort, G. and Kégl, B. (2014). Supplement to “Adaptive MCMC with online relabeling.” . · Zbl 1386.65011
[7] Bardenet, R. and Kégl, B. (2012). An adaptive Monte-Carlo Markov chain algorithm for inference from mixture signals. J. Phys. Conf. Ser. 368 012044.
[8] Boyd, S. and Vandenberghe, L. (2004). Convex Optimization . Cambridge: Cambridge Univ. Press. · Zbl 1058.90049
[9] Celeux, G. (1998). Bayesian inference for mixtures: The label-switching problem. In Computational Statistics Symposium ( COMPSTAT ) 227-232. Berlin: Springer. · Zbl 0951.62018
[10] Celeux, G., Hurn, M. and Robert, C.P. (2000). Computational and inferential difficulties with mixture posterior distributions. J. Amer. Statist. Assoc. 95 957-970. · Zbl 0999.62020 · doi:10.2307/2669477
[11] Chen, H.-F. (2002). Stochastic Approximation and Its Applications. Nonconvex Optimization and Its Applications 64 . Dordrecht: Kluwer Academic.
[12] Cron, A.J. and West, M. (2011). Efficient classification-based relabeling in mixture models. Amer. Statist. 65 16-20. · doi:10.1198/tast.2011.10170
[13] Fort, G., Moulines, E. and Priouret, P. (2011). Convergence of adaptive and interacting Markov chain Monte Carlo algorithms. Ann. Statist. 39 3262-3289. · Zbl 1246.65003 · doi:10.1214/11-AOS938
[14] Graf, S. and Luschgy, H. (2000). Foundations of Quantization for Probability Distributions. Lecture Notes in Math. 1730 . Berlin: Springer. · Zbl 0951.60003 · doi:10.1007/BFb0103945
[15] Green, P.J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82 711-732. · Zbl 0861.62023 · doi:10.1093/biomet/82.4.711
[16] Haario, H., Saksman, E. and Tamminen, J. (2001). An adaptive Metropolis algorithm. Bernoulli 7 223-242. · Zbl 0989.65004 · doi:10.2307/3318737
[17] Jasra, A. (2005). Bayesian inference for mixture models via Monte Carlo. Ph.D. thesis, Imperial College, London, UK.
[18] Jasra, A., Holmes, C.C. and Stephens, D.A. (2005). Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Statist. Sci. 20 50-67. · Zbl 1100.62032 · doi:10.1214/088342305000000016
[19] Marin, J.-M., Mengersen, K. and Robert, C.P. (2005). Bayesian modelling and inference on mixtures of distributions. In Bayesian Thinking : Modeling and Computation. Handbook of Statist. 25 459-507. Amsterdam: Elsevier. · doi:10.1016/S0169-7161(05)25016-2
[20] Meyn, S.P. and Tweedie, R.L. (1993). Markov Chains and Stochastic Stability. Communications and Control Engineering Series . London: Springer. · Zbl 0925.60001
[21] Pagès, G. (1998). A space quantization method for numerical integration. J. Comput. Appl. Math. 89 1-38. · Zbl 0908.65012 · doi:10.1016/S0377-0427(97)00190-8
[22] Papastamoulis, P. and Iliopoulos, G. (2010). An artificial allocations based solution to the label switching problem in Bayesian analysis of mixtures of distributions. J. Comput. Graph. Statist. 19 313-331. · doi:10.1198/jcgs.2010.09008
[23] Richardson, S. and Green, P.J. (1997). On Bayesian analysis of mixtures with an unknown number of components. J. Roy. Statist. Soc. Ser. B 59 731-792. · Zbl 0891.62020 · doi:10.1111/1467-9868.00095
[24] Robert, C.P. and Casella, G. (2004). Monte Carlo Statistical Methods , 2nd ed. Springer Texts in Statistics . New York: Springer. · Zbl 1096.62003
[25] Roberts, G.O., Gelman, A. and Gilks, W.R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Probab. 7 110-120. · Zbl 0876.60015 · doi:10.1214/aoap/1034625254
[26] Roberts, G.O. and Rosenthal, J.S. (2001). Optimal scaling for various Metropolis-Hastings algorithms. Statist. Sci. 16 351-367. · Zbl 1127.65305 · doi:10.1214/ss/1015346320
[27] Roberts, G.O. and Rosenthal, J.S. (2007). Coupling and ergodicity of adaptive Markov chain Monte Carlo algorithms. J. Appl. Probab. 44 458-475. · Zbl 1137.62015 · doi:10.1239/jap/1183667414
[28] Roberts, G.O. and Rosenthal, J.S. (2009). Examples of adaptive MCMC. J. Comput. Graph. Statist. 18 349-367. · doi:10.1198/jcgs.2009.06134
[29] Roodaki, A. (2012). Signal decompositions using trans-dimensional Bayesian methods. Ph.D. thesis, Supélec, Gif-sur-Yvette, France.
[30] Roodaki, A., Bect, J. and Fleury, G. (2012). Summarizing posterior distributions in signal decomposition problems when the number of components is unknown. In IEEE International Conference on Acoustics , Speech , Signal Processing ( ICASSP ) 3873-3876. Berlin: Springer. · Zbl 1394.94489
[31] Sperrin, M., Jaki, T. and Wit, E. (2010). Probabilistic relabelling strategies for the label switching problem in Bayesian mixture models. Stat. Comput. 20 357-366. · doi:10.1007/s11222-009-9129-8
[32] Stephens, M. (2000). Dealing with label switching in mixture models. J. R. Stat. Soc. Ser. B Stat. Methodol. 62 795-809. · Zbl 0957.62020 · doi:10.1111/1467-9868.00265
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.