×

Efficiency of delayed-acceptance random walk metropolis algorithms. (English) Zbl 1478.60093

Summary: Delayed-acceptance Metropolis-Hastings and delayed-acceptance pseudo-marginal Metropolis-Hastings algorithms can be applied when it is computationally expensive to calculate the true posterior or an unbiased stochastic approximation thereof, but a computationally cheap deterministic approximation is available. An initial accept-reject stage uses the cheap approximation for computing the Metropolis-Hastings ratio; proposals which are accepted at this stage are subjected to a further accept-reject step, which corrects for the error in the approximation. Since the expensive posterior, or the approximation thereof, is only evaluated for proposals which are accepted at the first stage, the cost of the algorithm is reduced and larger scalings may be used.
We focus on the random walk Metropolis (RWM) and consider the delayed-acceptance RWM and the delayed-acceptance pseudo-marginal RWM. We provide a framework for incorporating relatively general deterministic approximations into the theoretical analysis of high-dimensional targets. Justified by diffusion-approximation arguments, we derive expressions for the limiting efficiency and acceptance rates in high-dimensional settings. Finally, these theoretical insights are leveraged to formulate practical guidelines for the efficient tuning of the algorithms. The robustness of these guidelines and predicted properties are verified against simulation studies, all of which are strictly outside of the domain of validity of our limit results.

MSC:

60F05 Central limit and other weak theorems
65C05 Monte Carlo methods
65C40 Numerical analysis or methods applied to Markov chains
PDFBibTeX XMLCite
Full Text: DOI arXiv Link

References:

[1] Andrieu, C., Doucet, A. and Holenstein, R. (2010). Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B. Stat. Methodol. 72 269-342. · Zbl 1411.65020 · doi:10.1111/j.1467-9868.2009.00736.x
[2] Andrieu, C. and Roberts, G. O. (2009). The pseudo-marginal approach for efficient Monte Carlo computations. Ann. Statist. 37 697-725. · Zbl 1185.60083 · doi:10.1214/07-AOS574
[3] Au, S.-K. and Beck, J. L. (2001). Estimation of small failure probabilities in high dimensions by subset simulation. Probab. Eng. Mech. 16 263-277. · doi:10.1016/S0266-8920(01)00019-4
[4] Banterle, M., Grazian, C., Lee, A. and Robert, C. P. (2019). Accelerating Metropolis-Hastings algorithms by delayed acceptance. Foundations of Data Science 1 103-128.
[5] Beaumont, M. A. (2003). Estimation of population growth or decline in genetically monitored populations. Genetics 164 1139-1160.
[6] Bédard, M. (2007). Weak convergence of Metropolis algorithms for non-i.i.d. target distributions. Ann. Appl. Probab. 17 1222-1244. · Zbl 1144.60016 · doi:10.1214/105051607000000096
[7] Bédard, M., Douc, R. and Moulines, E. (2012). Scaling analysis of multiple-try MCMC methods. Stochastic Process. Appl. 122 758-786. · Zbl 1239.60075 · doi:10.1016/j.spa.2011.11.004
[8] Bédard, M. and Rosenthal, J. S. (2008). Optimal scaling of Metropolis algorithms: Heading toward general target distributions. Canad. J. Statist. 36 483-503. · Zbl 1167.60344 · doi:10.1002/cjs.5550360401
[9] Bérard, J., Del Moral, P. and Doucet, A. (2014). A lognormal central limit theorem for particle approximations of normalizing constants. Electron. J. Probab. 19 no. 94, 28. · Zbl 1308.65014 · doi:10.1214/EJP.v19-3428
[10] Beskos, A., Roberts, G. and Stuart, A. (2009). Optimal scalings for local Metropolis-Hastings chains on nonproduct targets in high dimensions. Ann. Appl. Probab. 19 863-898. · Zbl 1172.60328 · doi:10.1214/08-AAP563
[11] Boys, R. J., Wilkinson, D. J. and Kirkwood, T. B. L. (2008). Bayesian inference for a discretely observed stochastic kinetic model. Stat. Comput. 18 125-135. · doi:10.1007/s11222-007-9043-x
[12] Breyer, L. A., Piccioni, M. and Scarlatti, S. (2004). Optimal scaling of MaLa for nonlinear regression. Ann. Appl. Probab. 14 1479-1505. · Zbl 1048.62062 · doi:10.1214/105051604000000369
[13] Brooks, S., Gelman, A., Jones, G. L. and Meng, X.-L., eds. (2011) Handbook of Markov chain Monte Carlo. Chapman & Hall/CRC Handbooks of Modern Statistical Methods. CRC Press, Boca Raton, FL. · Zbl 1218.65001
[14] Catanach, T. A. and Beck, J. L. (2018). Bayesian updating and uncertainty quantification using sequential tempered MCMC with the rank-one modified Metropolis algorithm. ArXiv E-prints. Available at arXiv:1804.08738.
[15] Christen, J. A. and Fox, C. (2005). Markov chain Monte Carlo using an approximation. J. Comput. Graph. Statist. 14 795-810. · doi:10.1198/106186005X76983
[16] Cui, T., Fox, C. and O’Sullivan, M. J. (2011). Bayesian calibration of a large-scale geothermal reservoir model by a new adaptive delayed acceptance Metropolis Hastings algorithm. Water Resour. Res. 47.
[17] Dahlin, J. and Schön, T. B. (2019). Getting started with particle Metropolis-Hastings for inference in nonlinear dynamical models. ArXiv E-prints. Available at arXiv:1511.01707.
[18] Del Moral, P. (2004). Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications. Probability and Its Applications (New York). Springer, New York. · Zbl 1130.60003 · doi:10.1007/978-1-4684-9393-1
[19] Doucet, A., Pitt, M. K., Deligiannidis, G. and Kohn, R. (2015). Efficient implementation of Markov chain Monte Carlo when using an unbiased likelihood estimator. Biometrika 102 295-313. · Zbl 1452.62055 · doi:10.1093/biomet/asu075
[20] Efendiev, Y., Datta-Gupta, A., Ginting, V., Ma, X. and Mallick, B. (2005). An efficient two-stage Markov chain Monte Carlo method for dynamic data integration. Water Resour. Res. 41.
[21] Efendiev, Y., Hou, T. and Luo, W. (2006). Preconditioning Markov chain Monte Carlo simulations using coarse-scale models. SIAM J. Sci. Comput. 28 776-803. · Zbl 1111.65003 · doi:10.1137/050628568
[22] Everitt, R. G. and Rowińska, P. A. (2017). Delayed acceptance ABC-SMC. ArXiv E-prints. Available at arXiv:1708.02230.
[23] Fearnhead, P. and Sherlock, C. (2006). An exact Gibbs sampler for the Markov-modulated Poisson process. J. R. Stat. Soc. Ser. B. Stat. Methodol. 68 767-784. · Zbl 1110.62131 · doi:10.1111/j.1467-9868.2006.00566.x
[24] Filippone, M. and Girolami, M. (2014). Pseudo-marginal Bayesian inference for Gaussian processes. IEEE Trans. Pattern Anal. Mach. Intell. 36 2214-2226.
[25] Flury, T. and Shephard, N. (2011). Bayesian inference based only on simulated likelihood: Particle filter analysis of dynamic economic models. Econometric Theory 27 933-956. · Zbl 1226.62021 · doi:10.1017/S0266466610000599
[26] Franks, J. and Vihola, M. (2020). Importance sampling correction versus standard averages of reversible MCMCs in terms of the asymptotic variance. Stochastic Process. Appl. 130 6157-6183. · Zbl 1455.60099 · doi:10.1016/j.spa.2020.05.006
[27] Gilks, W. R., Richardson, S. and Spiegelhalter, D. J., eds. (1996) Markov Chain Monte Carlo in Practice. Interdisciplinary Statistics. CRC Press, London. · Zbl 0832.00018 · doi:10.1007/978-1-4899-4485-6
[28] Golightly, A., Henderson, D. A. and Sherlock, C. (2015). Delayed acceptance particle MCMC for exact inference in stochastic kinetic models. Stat. Comput. 25 1039-1055. · Zbl 1332.60117 · doi:10.1007/s11222-014-9469-x
[29] Golightly, A. and Wilkinson, D. J. (2011). Bayesian parameter inference for stochastic biochemical network models using particle Markov chain Monte Carlo. Interface focus 1 807-820.
[30] Higdon, D. C., Reese, S. J., Moulton, D., Vrugt, J. A. and Fox, C. (2011). Posterior exploration for computationally intensive forward models. In Handbook of Markov chain Monte Carlo, chapter 16 (S. Brooks, A. Gelman, G. L. Jones and X.-L. Meng, eds.) 401-418. CRC Press, Boca Raton, FL. · Zbl 1229.65016
[31] Kaipio, J. and Somersalo, E. (2005). Statistical and Computational Inverse Problems. Applied Mathematical Sciences 160. Springer, New York. · Zbl 1068.65022
[32] Knape, J. and de Valpine, P. (2012). Fitting complex population models by combining particle filters with Markov chain Monte Carlo. Ecology 93 256-263.
[33] Liu, J. S. (2001). Monte Carlo Strategies in Scientific Computing. Springer Series in Statistics. Springer, New York. · Zbl 0991.65001
[34] Mattingly, J. C., Pillai, N. S. and Stuart, A. M. (2012). Diffusion limits of the random walk Metropolis algorithm in high dimensions. Ann. Appl. Probab. 22 881-930. · Zbl 1254.60081 · doi:10.1214/10-AAP754
[35] Moulton, J. D., Fox, C. and Svyatskiy, D. (2008). Multilevel approximations in sample-based inversion from the Dirichlet-to-Neumann map. J. Phys., Conf. Ser. 124.
[36] Pasarica, C. and Gelman, A. (2010). Adaptively scaling the Metropolis algorithm using expected squared jumped distance. Statist. Sinica 20 343-364. · Zbl 1186.62038
[37] Pitt, M. K., dos Santos Silva, R., Giordani, P. and Kohn, R. (2012). On some properties of Markov chain Monte Carlo simulation methods based on the particle filter. J. Econometrics 171 134-151. · Zbl 1443.62499 · doi:10.1016/j.jeconom.2012.06.004
[38] Quiroz, M., Tran, M.-N., Villani, M. and Kohn, R. (2018). Speeding up MCMC by delayed acceptance and data subsampling. J. Comput. Graph. Statist. 27 12-22. · Zbl 07498963 · doi:10.1080/10618600.2017.1307117
[39] Roberts, G. O., Gelman, A. and Gilks, W. R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Probab. 7 110-120. · Zbl 0876.60015 · doi:10.1214/aoap/1034625254
[40] Roberts, G. O. and Rosenthal, J. S. (1998). Optimal scaling of discrete approximations to Langevin diffusions. J. R. Stat. Soc. Ser. B. Stat. Methodol. 60 255-268. · Zbl 0913.60060 · doi:10.1111/1467-9868.00123
[41] Roberts, G. O. and Rosenthal, J. S. (2001). Optimal scaling for various Metropolis-Hastings algorithms. Statist. Sci. 16 351-367. · Zbl 1127.65305 · doi:10.1214/ss/1015346320
[42] Roberts, G. O. and Rosenthal, J. S. (2014). Complexity bounds for Markov chain Monte Carlo algorithms via diffusion limits. J. Appl. Probab. 53 410-420. · Zbl 1345.60082
[43] Roberts, G. O. and Rosenthal, J. S. (2014). Minimising MCMC variance via diffusion limits, with an application to simulated tempering. Ann. Appl. Probab. 24 131-149. · Zbl 1298.60078 · doi:10.1214/12-AAP918
[44] Schmon, S. M., Deligiannidis, G., Doucet, A. and Pitt, M. K. (2021). Large-sample asymptotics of the pseudo-marginal method. Biometrika 108 37-51. · Zbl 1462.62072
[45] Sherlock, C. (2013). Optimal scaling of the random walk Metropolis: General criteria for the 0.234 acceptance rule. J. Appl. Probab. 50 1-15. · Zbl 1266.60062 · doi:10.1239/jap/1363784420
[46] Sherlock, C. (2016). Optimal scaling for the pseudo-marginal random walk Metropolis: Insensitivity to the noise generating mechanism. Methodol. Comput. Appl. Probab. 18 869-884. · Zbl 1361.65003 · doi:10.1007/s11009-015-9471-6
[47] Sherlock, C., Fearnhead, P. and Roberts, G. O. (2010). The random walk Metropolis: Linking theory and practice through a case study. Statist. Sci. 25 172-190. · Zbl 1328.60177 · doi:10.1214/10-STS327
[48] Sherlock, C., Golightly, A. and Henderson, D. A. (2017). Adaptive, delayed-acceptance MCMC for targets with expensive likelihoods. J. Comput. Graph. Statist. 26 434-444. · doi:10.1080/10618600.2016.1231064
[49] Sherlock, C. and Lee, A. (2017). Variance bounding of delayed-acceptance kernels. ArXiv E-prints. Available at arXiv:1706.02142.
[50] Sherlock, C. and Roberts, G. (2009). Optimal scaling of the random walk Metropolis on elliptically symmetric unimodal targets. Bernoulli 15 774-798. · Zbl 1215.60047 · doi:10.3150/08-BEJ176
[51] Sherlock, C., Thiery, A. H and Golightly, A. (2021). Supplement to “Efficiency of delayed-acceptance random walk Metropolis algorithms.” https://doi.org/10.1214/21-AOS2068SUPP
[52] Sherlock, C., Thiery, A. H. and Lee, A. (2017). Pseudo-marginal Metropolis-Hastings sampling using averages of unbiased estimators. Biometrika 104 727-734. · Zbl 07072238 · doi:10.1093/biomet/asx031
[53] Sherlock, C., Thiery, A. H., Roberts, G. O. and Rosenthal, J. S. (2015). On the efficiency of pseudo-marginal random walk Metropolis algorithms. Ann. Statist. 43 238-275. · Zbl 1326.65015 · doi:10.1214/14-AOS1278
[54] Smith, M. E. (2011). Estimating nonlinear economic models using surrogate transitions. Available from https://files.nyu.edu/mes473/public/Smith_Surrogate.pdf.
[55] Stuart, A. M. (2010). Inverse problems: A Bayesian perspective. Acta Numer. 19 451-559. · Zbl 1242.65142 · doi:10.1017/S0962492910000061
[56] Vihola, M., Helske, J. and Franks, J. (2016). Importance sampling type estimators based on approximate marginal MCMC. Availabe at https://doi.org/10.1111/sjos.12492. · Zbl 1467.62032
[57] Yang, J., Roberts, G. O. and Rosenthal, J. S. (2020). Optimal scaling of random-walk Metropolis algorithms on general target distributions. Stochastic Process. Appl. 130 6094-6132. · Zbl 1455.60100 · doi:10.1016/j.spa.2020.05.004
[58] Zanella, G., Bédard, M. and Kendall, W. S. (2017). A Dirichlet form approach to MCMC optimal scaling. Stochastic Process. Appl. 127 4053-4082 · Zbl 1374.60030 · doi:10.1016/j.spa.2017.03.021
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.