×

Diffusion limits of the random walk Metropolis algorithm in high dimensions. (English) Zbl 1254.60081

The authors establish a diffusion limit for a random walk Metropolis-Hastings (RWM) algorithm approximating samples from a measure on an infinite-dimensional space, more precisely, for the RWM algorithm applied to distribution families obtained as finite-dimensional approximations to a measure on an separable Hilbert space. In general, diffusion limits are used as a tool to measure the complexity of MCMC methods which have been applied to high-dimensional target measures with product structure. In that case, the individual components of the Markov chain satisfy an invariance principle with respect to a scalar stochastic differential equation. The present paper extends this approach to more general high-dimensional target measures naturally arising in applications. The established invariance principle is such that the entire Markov chain converges to an infinite-dimensional continuous-time stochastic process given by a Hilbert-valued stochastic differential equation, i.e., a stochastic partial differential equations (SPDE).
In more detail, the authors consider target measures \(\pi\) on a real separable Hilbert space \(H\) which possess a Radon-Nikodym derivative with respect to a Gaussian measure \(\pi_0\) on \(H\) of the form \[ \frac{d\pi}{d\pi_0}=M_\Psi\exp(-\Psi(x)) \] for some real, measurable functional \(\Psi\). Then, a realisable implementation (necessarily in finite dimensions) of the RWM algorithm for \(\pi\) is obtained by applying an RWM algorithm for a projection \(\pi^N\) of \(\pi\) on an \(N\)-dimensional subspace of \(H\). The main result of the paper is that started in stationarity the piecewise constant linear interpolants of the Markov chains in the \(N\)-dimensional subspaces converge for \(N\to\infty\) weakly to the solution of an SPDE in a suitable subspace of \(H\) started at \(\pi\) which is the invariant measure for this SPDE. This is proved under a list of assumptions on the function \(\Psi\) and the trace class covariance operator of the Gaussian measure \(\pi_0\) which according to the authors are satisfied in many applications. The practical implications of this result are that at stationarity the work to explore that state space scales as \(\mathcal{O}(N)\) and the speed at which the invariant measure is explored can be maximized by tuning the acceptance rate probability to \(0.234\) analogous to the known case for measures of product structure.

MSC:

60J22 Computational methods in Markov chains
60J05 Discrete-time Markov processes on general state spaces
60F05 Central limit and other weak theorems
65C05 Monte Carlo methods
65C40 Numerical analysis or methods applied to Markov chains
60H15 Stochastic partial differential equations (aspects of stochastic analysis)
PDFBibTeX XMLCite
Full Text: DOI arXiv Euclid

References:

[1] Bédard, M. (2007). Weak convergence of Metropolis algorithms for non-i.i.d. target distributions. Ann. Appl. Probab. 17 1222-1244. · Zbl 1144.60016 · doi:10.1214/105051607000000096
[2] Bédard, M. (2009). On the optimal scaling problem of Metropolis algorithms for hierarchical target distributions.
[3] Berger, E. (1986). Asymptotic behaviour of a class of stochastic approximation procedures. Probab. Theory Related Fields 71 517-552. · Zbl 0571.62073 · doi:10.1007/BF00699040
[4] Beskos, A., Roberts, G. and Stuart, A. (2009). Optimal scalings for local Metropolis-Hastings chains on nonproduct targets in high dimensions. Ann. Appl. Probab. 19 863-898. · Zbl 1172.60328 · doi:10.1214/08-AAP563
[5] Beskos, A., Roberts, G., Stuart, A. and Voss, J. (2008). MCMC methods for diffusion bridges. Stoch. Dyn. 8 319-350. · Zbl 1159.65007 · doi:10.1142/S0219493708002378
[6] Beskos, A. and Stuart, A. M. (2008). MCMC methods for sampling function space. In ICIAM Invited Lecture 2007 (R. Jeltsch and G. Wanner, eds.). European Mathematical Society, Zürich. · Zbl 1179.65003
[7] Bou-Rabee, N. and Vanden-Eijnden, E. (2010). Pathwise accuracy and ergodicity of Metropolized integrators for SDEs. Comm. Pure Appl. Math. 63 655-696. · Zbl 1214.60031 · doi:10.1002/cpa.20306
[8] Breyer, L. A., Piccioni, M. and Scarlatti, S. (2004). Optimal scaling of MALA for nonlinear regression. Ann. Appl. Probab. 14 1479-1505. · Zbl 1048.62062 · doi:10.1214/105051604000000369
[9] Breyer, L. A. and Roberts, G. O. (2000). From Metropolis to diffusions: Gibbs states and optimal scaling. Stochastic Process. Appl. 90 181-206. · Zbl 1047.60065 · doi:10.1016/S0304-4149(00)00041-7
[10] Chatterjee, S. (2007). Stein’s method. Lecture notes. Available at . · Zbl 1116.60056
[11] Chen, X. and White, H. (1998). Central limit and functional central limit theorems for Hilbert-valued dependent heterogeneous arrays with applications. Econometric Theory 14 260-284.
[12] Cotter, S. L., Dashti, M. and Stuart, A. M. (2010). Approximation of Bayesian inverse problems. SIAM Journal of Numerical Analysis 48 322-345. · Zbl 1210.35284 · doi:10.1137/090770734
[13] Da Prato, G. and Zabczyk, J. (1992). Stochastic Equations in Infinite Dimensions. Encyclopedia of Mathematics and Its Applications 44 . Cambridge Univ. Press, Cambridge. · Zbl 0761.60052 · doi:10.1017/CBO9780511666223
[14] Ethier, S. N. and Kurtz, T. G. (1986). Markov Processes : Characterization and Convergence . Wiley, New York. · Zbl 0592.60049
[15] Hairer, M., Stuart, A. M. and Voss, J. (2007). Analysis of SPDEs arising in path sampling. II. The nonlinear case. Ann. Appl. Probab. 17 1657-1706. · Zbl 1140.60329 · doi:10.1214/07-AAP441
[16] Hairer, M., Stuart, A. M. and Voss, J. (2011). Signal processing problems on function space: Bayesian formulation, stochastic PDEs and effective MCMC methods. In The Oxford Handbook of Nonlinear Filtering (D. Crisan and B. Rozovsky, eds.). Oxford Univ. Press, Oxford. · Zbl 1262.94012
[17] Hairer, M., Stuart, A. M., Voss, J. and Wiberg, P. (2005). Analysis of SPDEs arising in path sampling. I. The Gaussian case. Commun. Math. Sci. 3 587-603. · Zbl 1138.60326 · doi:10.4310/CMS.2005.v3.n4.a8
[18] Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57 97-109. · Zbl 0219.65008 · doi:10.1093/biomet/57.1.97
[19] Liu, J. S. (2008). Monte Carlo Strategies in Scientific Computing . Springer, New York. · Zbl 1132.65003
[20] Ma, Z. M. and Röckner, M. (1992). Introduction to the Theory of ( nonsymmetric ) Dirichlet Forms . Springer, Berlin.
[21] Metropolis, N., Rosenbluth, A. W., Teller, M. N. and Teller, E. (1953). Equations of state calculations by fast computing machines. J. Chem. Phys. 21 1087-1092.
[22] Robert, C. P. and Casella, G. (2004). Monte Carlo Statistical Methods , 2nd ed. Springer, New York. · Zbl 1096.62003
[23] Roberts, G. O., Gelman, A. and Gilks, W. R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Probab. 7 110-120. · Zbl 0876.60015 · doi:10.1214/aoap/1034625254
[24] Roberts, G. O. and Rosenthal, J. S. (1998). Optimal scaling of discrete approximations to Langevin diffusions. J. R. Stat. Soc. Ser. B Stat. Methodol. 60 255-268. · Zbl 0913.60060 · doi:10.1111/1467-9868.00123
[25] Roberts, G. O. and Rosenthal, J. S. (2001). Optimal scaling for various Metropolis-Hastings algorithms. Statist. Sci. 16 351-367. · Zbl 1127.65305 · doi:10.1214/ss/1015346320
[26] Stroock, D. W. (1993). Probability Theory , an Analytic View . Cambridge Univ. Press, Cambridge. · Zbl 0925.60004
[27] Stuart, A. M. (2010). Inverse problems: A Bayesian perspective. Acta Numer. 19 451-559. · Zbl 1242.65142 · doi:10.1017/S0962492910000061
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.