Pillai, Natesh S.; Stuart, Andrew M.; Thiéry, Alexandre H. Noisy gradient flow from a random walk in Hilbert space. (English) Zbl 1308.60005 Stoch. Partial Differ. Equ., Anal. Comput. 2, No. 2, 196-232 (2014). Summary: Consider a probability measure on a Hilbert space defined via its density with respect to a Gaussian. The purpose of this paper is to demonstrate that an appropriately defined Markov chain, which is reversible with respect to the measure in question, exhibits a diffusion limit to a noisy gradient flow, also reversible with respect to the same measure. The Markov chain is defined by applying a Metropolis-Hastings accept-reject mechanism [L. Tierney, Ann. Appl. Probab. 8, No. 1, 1–9 (1998; Zbl 0935.60053)] to an Ornstein-Uhlenbeck (OU) proposal which is itself reversible with respect to the underlying Gaussian measure. The resulting noisy gradient flow is a stochastic partial differential equation driven by a Wiener process with spatial correlation given by the underlying Gaussian structure. There are two primary motivations for this work. The first concerns insight into Monte Carlo Markov Chain (MCMC) methods for sampling of measures on a Hilbert space defined via a density with respect to a Gaussian measure. These measures must be approximated on finite dimensional spaces of dimension \(N\) in order to be sampled. A conclusion of the work herein is that MCMC methods based on prior-reversible OU proposals will explore the target measure in \(\mathcal O(1)\) steps with respect to dimension \(N\). This is to be contrasted with standard MCMC methods based on the random walk or Langevin proposals which require \(\mathcal O(N)\) and \(\mathcal O(N^{1/3})\) steps respectively [J. C. Mattingly et al., Ann. Appl. Probab. 22, No. 3, 881–930 (2012; Zbl 1254.60081); N. S. Pillai et al., ibid. 22, No. 6, 2320–2356 (2012; Zbl 1272.60053)]. The second motivation relates to optimization. There are many applications where it is of interest to find global or local minima of a functional defined on an infinite dimensional Hilbert space. Gradient flow or steepest descent is a natural approach to this problem, but in its basic form requires computation of a gradient which, in some applications, may be an expensive or complex task. This paper shows that a stochastic gradient descent described by a stochastic partial differential equation can emerge from certain carefully specified Markov chains. This idea is well-known in the finite state [S. Kirkpatrick et al., Science 220, No. 4598, 671–680 (1983; Zbl 1225.90162); V. Černý, J. Optimization Theory Appl. 45, 41–51 (1985; Zbl 0534.90091)] or finite dimensional context [S. Geman, “Bayesian image analysis by adaptive annealing”, in: 1985 international geoscience and remote sensing symposium, IGARSS ’85. New York: IEEE. 269–276 (1985); S. Geman and C.-Ruey Hwang, SIAM J. Control Optimization 24, 1031–1043 (1986; Zbl 0602.60071); T.-S. Chiang et al., SIAM J. Control Optimization 25, 737–753 (1987; Zbl 0622.60093)]; R. A. Holley et al., J. Funct. Anal. 83, No. 2, 333–347 (1989; Zbl 0706.58075)].The novelty of the work in this paper is that the emergence of the noisy gradient flow is developed on an infinite dimensional Hilbert space. In the context of global optimization, when the noise level is also adjusted as part of the algorithm, methods of the type studied here go by the name of simulated-annealing; see [D. Bertsimas and J. Tsitsiklis, in: Probability and algorithms, 17–29 (1992; Zbl 0764.60073)] for further references. Although we do not consider adjusting the noise-level as part of the algorithm, the noise strength is a tuneable parameter in our construction and the methods developed here could potentially be used to study simulated annealing in a Hilbert space setting. The transferable idea behind this work is that conceiving of algorithms directly in the infinite dimensional setting leads to methods which are robust to finite dimensional approximation. We emphasize that discretizing, and then applying standard finite dimensional techniques in \(\mathbb R^N\), to either sample or optimize, can lead to algorithms which degenerate as the dimension \(N\) increases. Cited in 8 Documents MSC: 60-08 Computational methods for problems pertaining to probability theory 60H15 Stochastic partial differential equations (aspects of stochastic analysis) 60J25 Continuous-time Markov processes on general state spaces 49J27 Existence theories for problems in abstract spaces 65C40 Numerical analysis or methods applied to Markov chains Keywords:optimisation; simulated annealing; Markov chain Monte Carlo; diffusion approximation Citations:Zbl 0935.60053; Zbl 1254.60081; Zbl 1272.60053; Zbl 1225.90162; Zbl 0534.90091; Zbl 0602.60071; Zbl 0622.60093; Zbl 0706.58075; Zbl 0764.60073 × Cite Format Result Cite Review PDF Full Text: DOI References: [1] Tierney, L.: A note on metropolis-hastings kernels for general state spaces. Ann. Appl. Probab. 8(1), 1–9 (1998) · Zbl 0935.60053 · doi:10.1214/aoap/1027961031 [2] Mattingly, J., Pillai, N., Stuart, A.: SPDE limits of the random walk metropolis algorithm in high dimensions. Ann. Appl. Prob (2011) · Zbl 1254.60081 [3] Pillai, N.S., Stuart, A.M., Thiery, A.H.: Optimal scaling and diffusion limits for the langevin algorithm in high dimensions. Ann. Appl. Prob. 22(6), 2320–2356 (2012). doi: 10.1214/11-AAP828 · Zbl 1272.60053 · doi:10.1214/11-AAP828 [4] Kirkpatrick, S., Jr., D., Vecchi, M.: Optimization by simulated annealing. Science 220(4598), 671–680 (1983) · Zbl 1225.90162 [5] Černỳ, V.: Thermodynamical approach to the traveling salesman problem: an efficient simulation algorithm. J. Optim. Theory Appl. 45(1), 41–51 (1985) · Zbl 0534.90091 · doi:10.1007/BF00940812 [6] Geman, D.: Bayesian image analysis by adaptive annealing. IEEE Trans. Geosci. Remote Sens. 1, 269–276 (1985) [7] Geman, S., Hwang, C.: Diffusions for global optimization. SIAM J. Control Optim. 24, 1031 (1986) · Zbl 0602.60071 · doi:10.1137/0324060 [8] Chiang, T., Hwang, C., Sheu, S.: Diffusion for global optimization in $$\(\backslash\)text{ r }{\(\backslash\)hat{\(\backslash\),}}\(\backslash\)text{ n }$$ r \^ n . SIAM J. Control Optim. 25(3), 737–753 (1987) · Zbl 0622.60093 · doi:10.1137/0325042 [9] Holley, R., Kusuoka, S., Stroock, D.: Asymptotics of the spectral gap with applications to the theory of simulated annealing. J. Funct. Anal. 83(2), 333–347 (1989) · Zbl 0706.58075 · doi:10.1016/0022-1236(89)90023-2 [10] Bertsimas, D., Tsitsiklis, J.: Simulated annealing. Stat. Sci. 8(1), 10–15 (1993) · Zbl 0764.60073 · doi:10.1214/ss/1177011077 [11] Hinze, M., Pinnau, R., Ulbrich, M., Ulbrich, S.: Optimization PDE Constraints. Springer, New York (2008) [12] Da Prato, G., Zabczyk, J.: Ergodicity for Infinite Dimensional Systems. Cambridge Univ Press, Cambridge (1996) · Zbl 0849.60052 [13] Beskos, A., Roberts, G., Stuart, A., Voss, J.: An mcmc method for diffusion bridges. Stoch. Dyn. 8(3), 319–350 (2008) · Zbl 1159.65007 · doi:10.1142/S0219493708002378 [14] Cotter, S., Roberts, G.O., Stuart, A., White, D.: MCMC methods for functions: modifying old algorithms to make them faster. Statistical Science · Zbl 1331.62132 [15] Hairer, M., Stuart, A.M., Voss, J.: Analysis of spdes arising in path sampling. partii: the nonlinear case. Ann. Appl. Probab. 17(5–6), 1657–1706 (2007) · Zbl 1140.60329 · doi:10.1214/07-AAP441 [16] Dashti, M., Law, K., Stuart, A., Voss, J.: MAP estimators and their consistency in bayesian nonparametric inverse problems. Inverse Problems · Zbl 1281.62089 [17] Da Prato, G., Zabczyk, J.: Stochastic Equations in Infinite Dimensions, Encyclopedia of Mathematics and Its Applications, vol. 44. Cambridge University Press, Cambridge (1992) · Zbl 0761.60052 [18] Stuart, A.: Inverse problems: a bayesian perspective. Acta Numer. 19(–1), 451–559 (2010) · Zbl 1242.65142 · doi:10.1017/S0962492910000061 [19] Hairer, M., Stuart, A.M., Voss, J.: Signal Processing Problems on Function Space: Bayesian Formulation, Stochastic Pdes and Effective Mcmc Methods. The Oxford Handbook of Nonlinear Filtering. In: Crisan, D., Rozovsky, B. (2010). To Appear [20] Cotter, S., Dashti, M., Stuart, A.: Variational data assimilation using targetted random walks. Int. J. Numer. Method Fluids (2011). doi: 10.1002/fld.2510 · Zbl 1426.76629 [21] Stroock, D.W., Varadhan, S.S.: Multidimensional Diffussion Processes, vol. 233. Springer, New York (1979) · Zbl 0426.60069 [22] Ethier, S.N., Kurtz, T.G.: Markov Processes. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. Wiley, New York (1986). Characterization and convergence · Zbl 0592.60049 [23] Kupferman, R., Stuart, A., Terry, J., Tupper, P.: Long-term behaviour of large mechanical systems with random initial data. Stoch. Dyn. 2(04), 533–562 (2002) · Zbl 1032.70009 · doi:10.1142/S0219493702000571 [24] Berger, E.: Asymptotic behaviour of a class of stochastic approximation procedures. Probab. Theory Relat. Fields 71(4), 517–552 (1986) · Zbl 0571.62073 · doi:10.1007/BF00699040 [25] Henry, D.: Geometric Theory of Semilinear Parabolic Equations, vol. 61. Springer, New York (1981) · Zbl 0456.35001 [26] Hairer, M., Stuart, A.M., Voss, J., Wiberg, P.: Analysis of spdes arising in path sampling. part 1: the gaussian case. Comm. Math. Sci. 3, 587–603 (2005) · Zbl 1138.60326 · doi:10.4310/CMS.2005.v3.n4.a8 [27] Chorin, A., Hald, O.: Stochastic Tools in Mathematics and Science. Springer, New York (2006) · Zbl 1086.60001 This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.