×

Convergence acceleration of ensemble Kalman inversion in nonlinear settings. (English) Zbl 1486.49049

Summary: Many data-science problems can be formulated as an inverse problem, where the parameters are estimated by minimizing a proper loss function. When complicated black-box models are involved, derivative-free optimization tools are often needed. The ensemble Kalman filter (EnKF) is a particle-based derivative-free Bayesian algorithm originally designed for data assimilation. Recently, it has been applied to inverse problems for computational efficiency. The resulting algorithm, known as ensemble Kalman inversion (EKI), involves running an ensemble of particles with EnKF update rules so they can converge to a minimizer. In this article, we investigate EKI convergence in general nonlinear settings. To improve convergence speed and stability, we consider applying EKI with non-constant step-sizes and covariance inflation. We prove that EKI can hit critical points with finite steps in non-convex settings. We further prove that EKI converges to the global minimizer polynomially fast if the loss function is strongly convex. We verify the analysis presented with numerical experiments on two inverse problems.

MSC:

49N45 Inverse problems in optimal control
65K10 Numerical optimization and variational techniques
90C56 Derivative-free methods and methods using generalized derivatives
90C25 Convex programming
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] JLA07 J. L. Anderson, An adaptive covariance inflation error correction algorithms for ensemble filters, Tellus A 59 (2007), 210-224.
[2] JLA09 J. L. Anderson, Spatially and temporally varying adaptive covariance inflation for ensemble filters, Tellus A 61 (2009), no. 1, 72-83.
[3] Bauer, Frank; Hohage, Thorsten; Munk, Axel, Iteratively regularized Gauss-Newton method for nonlinear inverse problems with random noise, SIAM J. Numer. Anal., 47, 3, 1827-1846 (2009) · Zbl 1197.65060 · doi:10.1137/080721789
[4] Bell, Bradley M.; Cathey, Frederick W., The iterated Kalman filter update as a Gauss-Newton method, IEEE Trans. Automat. Control, 38, 2, 294-297 (1993) · Zbl 0775.93237 · doi:10.1109/9.250476
[5] Bernstein, Dennis S., Matrix mathematics, xxxviii+726 pp. (2005), Princeton University Press, Princeton, NJ · Zbl 1075.15001
[6] Bertsekas, Dimitri P., Incremental least squares methods and the extended Kalman filter, SIAM J. Optim., 6, 3, 807-822 (1996) · Zbl 0945.93026 · doi:10.1137/S1052623494268522
[7] Benning, Martin; Burger, Martin, Modern regularization methods for inverse problems, Acta Numer., 27, 1-111 (2018) · Zbl 1431.65080 · doi:10.1017/s0962492918000016
[8] Bertsimas, Dimitris; Tsitsiklis, John, Simulated annealing. Probability and algorithms, 17-29 (1992), Nat. Acad. Press, Washington, DC · Zbl 0764.60073
[9] Beskos, Alexandros; Crisan, Dan; Jasra, Ajay, On the stability of sequential Monte Carlo methods in high dimensions, Ann. Appl. Probab., 24, 4, 1396-1445 (2014) · Zbl 1304.82070 · doi:10.1214/13-AAP951
[10] BEM01 C. Bishop, B. Etherton, and S. Majumdar, Adaptive sampling with the ensemble transform kalman filter part i: the theoretical aspects, Monthly Weather Rev. 129 (2001), 420-436.
[11] Bl\"{o}mker, Dirk; Schillings, Claudia; Wacker, Philipp; Weissmann, Simon, Well posedness and convergence analysis of the ensemble Kalman inversion, Inverse Problems, 35, 8, 085007, 32 pp. (2019) · Zbl 1432.62325 · doi:10.1088/1361-6420/ab149c
[12] Bui-Thanh, T.; Girolami, M., Solving large-scale PDE-constrained Bayesian inverse problems with Riemann manifold Hamiltonian Monte Carlo, Inverse Problems, 30, 11, 114014, 23 pp. (2014) · Zbl 1306.65269 · doi:10.1088/0266-5611/30/11/114014
[13] Burger, Martin; Osher, Stanley, Convergence rates of convex variational regularization, Inverse Problems, 20, 5, 1411-1421 (2004) · Zbl 1068.65085 · doi:10.1088/0266-5611/20/5/005
[14] NKC17 N. K. Chada, Analysis of hierarchical ensemble Kalman inversion, preprint 1801.00847, 2018.
[15] Chada, Neil K.; Iglesias, Marco A.; Roininen, Lassi; Stuart, Andrew M., Parameterizations for ensemble Kalman inversion, Inverse Problems, 34, 5, 055009, 31 pp. (2018) · Zbl 1515.62041 · doi:10.1088/1361-6420/aab6d9
[16] Chada, Neil K.; Stuart, Andrew M.; Tong, Xin T., Tikhonov regularization within ensemble Kalman inversion, SIAM J. Numer. Anal., 58, 2, 1263-1294 (2020) · Zbl 1447.35337 · doi:10.1137/19M1242331
[17] Chen, Xi; Lee, Jason D.; Tong, Xin T.; Zhang, Yichen, Statistical inference for model parameters in stochastic gradient descent, Ann. Statist., 48, 1, 251-273 (2020) · Zbl 1440.62287 · doi:10.1214/18-AOS1801
[18] Reich, Sebastian; Cotter, Colin J., Ensemble filter techniques for intermittent data assimilation. Large scale inverse problems, Radon Ser. Comput. Appl. Math. 13, 91-134 (2013), De Gruyter, Berlin · Zbl 1291.65032
[19] Dashti, M.; Law, K. J. H.; Stuart, A. M.; Voss, J., MAP estimators and their consistency in Bayesian nonparametric inverse problems, Inverse Problems, 29, 9, 095017, 27 pp. (2013) · Zbl 1281.62089 · doi:10.1088/0266-5611/29/9/095017
[20] Del Moral, Pierre; Doucet, Arnaud; Jasra, Ajay, Sequential Monte Carlo samplers, J. R. Stat. Soc. Ser. B Stat. Methodol., 68, 3, 411-436 (2006) · Zbl 1105.62034 · doi:10.1111/j.1467-9868.2006.00553.x
[21] MMD19 M. M. Dunlop, Multiplicative noise in Bayesian inverse problems: Well-posedness and consistency of MAP estimators, preprint 1910.14632, 2019.
[22] Duchi, John; Hazan, Elad; Singer, Yoram, Adaptive subgradient methods for online learning and stochastic optimization, J. Mach. Learn. Res., 12, 2121-2159 (2011) · Zbl 1280.68164
[23] Engl, Heinz W.; Hanke, Martin; Neubauer, Andreas, Regularization of inverse problems, Mathematics and its Applications 375, viii+321 pp. (1996), Kluwer Academic Publishers Group, Dordrecht · Zbl 0859.65054
[24] Evensen, Geir, Data assimilation, xxiv+307 pp. (2009), Springer-Verlag, Berlin · Zbl 1395.93534 · doi:10.1007/978-3-642-03711-5
[25] GE03 G. Evensen, The ensemble Kalman filter: Theoretical formulation and practical implementation, Ocean Dyn. 53 (2003), no. 4, 343-367.
[26] Garbuno-Inigo, Alfredo; Hoffmann, Franca; Li, Wuchen; Stuart, Andrew M., Interacting Langevin diffusions: gradient structure and ensemble Kalman sampler, SIAM J. Appl. Dyn. Syst., 19, 1, 412-441 (2020) · Zbl 1447.65119 · doi:10.1137/19M1251655
[27] GSS93 N. J. Gordon, D.J. Salmond, and A. F. M. Smith, Novel approach to nonlinear/non-Gaussian Bayesian state estimation, IEE Proc.F - Radar Signal Process. 140 (1993), no. 2.
[28] Gratton, S.; Lawless, A. S.; Nichols, N. K., Approximate Gauss-Newton methods for nonlinear least squares problems, SIAM J. Optim., 18, 1, 106-132 (2007) · Zbl 1138.65046 · doi:10.1137/050624935
[29] HLR18 E. Haber, F. Lucka, and L. Ruthotto, Never look back - A modified EnKF method and its application to the training of neural networks without back propagation, preprint 1805.08034, 2018.
[30] Hanke, Martin, A regularizing Levenberg-Marquardt scheme, with applications to inverse groundwater filtration problems, Inverse Problems, 13, 1, 79-95 (1997) · Zbl 0873.65057 · doi:10.1088/0266-5611/13/1/007
[31] HW05 T. M. Hamill and J. S. Whitaker, Accounting for the error due to unresolved scales in ensemble data assimilation: a comparison of different approaches, Monthly Weather Rev. 133 (2005), no. 11, 3132-3147.
[32] HM01 P. L. Houtekamer and H. L. Mitchell, A sequential ensemble kalman filter for atmospheric data assimilation, Monthly Weather Rev. 129 (2001), no. 1, 123-137.
[33] Iglesias, Marco A., A regularizing iterative ensemble Kalman method for PDE-constrained inverse problems, Inverse Problems, 32, 2, 025002, 45 pp. (2016) · Zbl 1334.65110 · doi:10.1088/0266-5611/32/2/025002
[34] Iglesias, Marco A.; Law, Kody J. H.; Stuart, Andrew M., Ensemble Kalman methods for inverse problems, Inverse Problems, 29, 4, 045001, 20 pp. (2013) · Zbl 1311.65064 · doi:10.1088/0266-5611/29/4/045001
[35] Iglesias, Marco; Park, Minho; Tretyakov, M. V., Bayesian inversion in resin transfer molding, Inverse Problems, 34, 10, 105002, 46 pp. (2018) · Zbl 1446.65097 · doi:10.1088/1361-6420/aad1cc
[36] KU97 S. J. Julier and J. K. Uhlmann, New extension of the Kalman filter to nonlinear systems, Signal processing, sensor fusion, and target recognition VI, International Society for Optics and Photonics vol. 3068, 1997.
[37] Kalman, R. E., A new approach to linear filtering and prediction problems, Trans. ASME Ser. D. J. Basic Engrg., 82, 1, 35-45 (1960)
[38] Kantas, Nikolas; Beskos, Alexandros; Jasra, Ajay, Sequential Monte Carlo methods for high-dimensional inverse problems: a case study for the Navier-Stokes equations, SIAM/ASA J. Uncertain. Quantif., 2, 1, 464-489 (2014) · Zbl 1308.65010 · doi:10.1137/130930364
[39] Kelley, C. T., Iterative methods for optimization, Frontiers in Applied Mathematics 18, xvi+180 pp. (1999), Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA · Zbl 0934.90082 · doi:10.1137/1.9781611970920
[40] Kelly, D. T. B.; Law, K. J. H.; Stuart, A. M., Well-posedness and accuracy of the ensemble Kalman filter in discrete and continuous time, Nonlinearity, 27, 10, 2579-2604 (2014) · Zbl 1305.62323 · doi:10.1088/0951-7715/27/10/2579
[41] KMT16 D. T. Kelly, A. J. Majda, and X. T. Tong, Concrete ensemble Kalman filters with rigorous catastrophic filter divergence, Proc. Natl. Acad. Sci., 112 (2016), no. 34, 10589-10594, 2016.
[42] Kirkpatrick, S.; Gelatt, C. D., Jr.; Vecchi, M. P., Optimization by simulated annealing, Science, 220, 4598, 671-680 (1983) · Zbl 1225.90162 · doi:10.1126/science.220.4598.671
[43] Kovachki, Nikola B.; Stuart, Andrew M., Ensemble Kalman inversion: a derivative-free technique for machine learning tasks, Inverse Problems, 35, 9, 095005, 35 pp. (2019) · Zbl 1430.68266 · doi:10.1088/1361-6420/ab1c3a
[44] Kwiatkowski, Evan; Mandel, Jan, Convergence of the square root ensemble Kalman filter in the large ensemble limit, SIAM/ASA J. Uncertain. Quantif., 3, 1, 1-17 (2015) · Zbl 1329.60083 · doi:10.1137/140965363
[45] Lange, Theresa; Stannat, Wilhelm, On the continuous time limit of the ensemble Kalman filter, Math. Comp., 90, 327, 233-265 (2021) · Zbl 1458.60082 · doi:10.1090/mcom/3588
[46] Lange, Theresa; Stannat, Wilhelm, On the continuous time limit of ensemble square root filters, Commun. Math. Sci., 19, 7, 1855-1880 (2021) · Zbl 1477.93163
[47] Law, Kody; Stuart, Andrew; Zygalakis, Konstantinos, Data assimilation, Texts in Applied Mathematics 62, xviii+242 pp. (2015), Springer, Cham · Zbl 1353.60002 · doi:10.1007/978-3-319-20325-6
[48] LR09 G. Li and A. C. Reynolds, Iterative ensemble Kalman filters for data assimilation, SPE J. 14 (2009), 496-505.
[49] Le Gland, F.; Monbet, V.; Tran, V.-D., Large sample asymptotics for the ensemble Kalman filter. The Oxford handbook of nonlinear filtering, 598-631 (2011), Oxford Univ. Press, Oxford · Zbl 1225.93108
[50] Liu, Qiang; Tong, Xin T., Accelerating Metropolis-within-Gibbs sampler with localized computations of differential equations, Stat. Comput., 30, 4, 1037-1056 (2020) · Zbl 1447.62135 · doi:10.1007/s11222-020-09934-w
[51] Livings, David M.; Dance, Sarah L.; Nichols, Nancy K., Unbiased ensemble square root filters, Phys. D, 237, 8, 1021-1028 (2008) · Zbl 1147.93413 · doi:10.1016/j.physd.2008.01.005
[52] Lord, Gabriel J.; Powell, Catherine E.; Shardlow, Tony, An introduction to computational stochastic PDEs, Cambridge Texts in Applied Mathematics, xii+503 pp. (2014), Cambridge University Press, New York · Zbl 1327.60011 · doi:10.1017/CBO9781139017329
[53] ENL96 E. N. Lorenz, Predictability: A problem partly solved, Proc. ECMWF Semin. predictability, 1 (1986), 1-18.
[54] Majda, Andrew J.; Harlim, John, Filtering complex turbulent systems, x+357 pp. (2012), Cambridge University Press, Cambridge · Zbl 1250.93002 · doi:10.1017/CBO9781139061308
[55] Majda, Andrew J.; Tong, Xin T., Performance of ensemble Kalman filters in large dimensions, Comm. Pure Appl. Math., 71, 5, 892-937 (2018) · Zbl 1390.93799 · doi:10.1002/cpa.21722
[56] Mandel, Jan; Cobb, Loren; Beezley, Jonathan D., On the convergence of the ensemble Kalman filter, Appl. Math., 56, 6, 533-541 (2011) · Zbl 1248.62164 · doi:10.1007/s10492-011-0031-2
[57] Moriyama, Hiroyuki; Yamashita, Nobuo; Fukushima, Masao, The incremental Gauss-Newton algorithm with adaptive stepsize rule, Comput. Optim. Appl., 26, 2, 107-141 (2003) · Zbl 1081.90050 · doi:10.1023/A:1025703629626
[58] Morzfeld, M.; Tong, X. T.; Marzouk, Y. M., Localization for MCMC: sampling high-dimensional posterior distributions with local structure, J. Comput. Phys., 380, 1-28 (2019) · Zbl 1451.65011 · doi:10.1016/j.jcp.2018.12.008
[59] Nesterov, Yurii, Introductory lectures on convex optimization, Applied Optimization 87, xviii+236 pp. (2004), Kluwer Academic Publishers, Boston, MA · Zbl 1086.90045 · doi:10.1007/978-1-4419-8853-9
[60] Nocedal, Jorge; Wright, Stephen J., Numerical optimization, Springer Series in Operations Research and Financial Engineering, xxii+664 pp. (2006), Springer, New York · Zbl 1104.65059
[61] ORL08 D. Oliver, A. C. Reynolds, and N. Liu, Inverse theory for petroleum reservoir characterization and history matching, Cambridge University Press, 2008.
[62] Patel, Vivak, Kalman-based stochastic gradient method with stop condition and insensitivity to conditioning, SIAM J. Optim., 26, 4, 2620-2648 (2016) · Zbl 1388.93091 · doi:10.1137/15M1048239
[63] Plagianakos, V. P.; Magoulas, G. D.; Vrahatis, M. N., Learning rate adaptation in stochastic gradient descent. Advances in convex analysis and global optimization, Pythagorion, 2000, Nonconvex Optim. Appl. 54, 433-444 (2001), Kluwer Acad. Publ., Dordrecht · Zbl 1049.90139 · doi:10.1007/978-1-4613-0279-7\_27
[64] Polyak, B. T.; Juditsky, A. B., Acceleration of stochastic approximation by averaging, SIAM J. Control Optim., 30, 4, 838-855 (1992) · Zbl 0762.62022 · doi:10.1137/0330046
[65] RCB15 P. N. Raanes, A. Carrassi, and L. Bertino, Extending the square root method to account for additive forecast noise in ensemble methods, Monthly Weather Rev. 143 (2015), no. 10, 3857-3873.
[66] Reich, Sebastian, Data assimilation: the Schr\"{o}dinger perspective, Acta Numer., 28, 635-711 (2019) · Zbl 1437.62350 · doi:10.1017/s0962492919000011
[67] Reich, Sebastian; Cotter, Colin, Probabilistic forecasting and Bayesian data assimilation, x+297 pp. (2015), Cambridge University Press, New York · Zbl 1314.62005 · doi:10.1017/CBO9781107706804
[68] Reif, Konrad; G\"{u}nther, Stefan; Yaz, Engin; Unbehauen, Rolf, Stochastic stability of the discrete-time extended Kalman filter, IEEE Trans. Automat. Control, 44, 4, 714-728 (1999) · Zbl 0967.93090 · doi:10.1109/9.754809
[69] Rios, Luis Miguel; Sahinidis, Nikolaos V., Derivative-free optimization: a review of algorithms and comparison of software implementations, J. Global Optim., 56, 3, 1247-1293 (2013) · Zbl 1272.90116 · doi:10.1007/s10898-012-9951-y
[70] Robbins, Herbert; Monro, Sutton, A stochastic approximation method, Ann. Math. Statistics, 22, 400-407 (1951) · Zbl 0054.05901 · doi:10.1214/aoms/1177729586
[71] Schillings, Claudia; Stuart, Andrew M., Analysis of the ensemble Kalman filter for inverse problems, SIAM J. Numer. Anal., 55, 3, 1264-1290 (2017) · Zbl 1366.65101 · doi:10.1137/16M105959X
[72] SAN05 M. Schweiger, S. R. Arridge, and I. Nissila, Gauss-Newton method for image reconstruction in diffuse optical tomography, Phys. Med. Biol. 50 (2005), 2365-2386.
[73] Stuart, A. M., Inverse problems: a Bayesian perspective, Acta Numer., 19, 451-559 (2010) · Zbl 1242.65142 · doi:10.1017/S0962492910000061
[74] Sullivan, T. J., Introduction to uncertainty quantification, Texts in Applied Mathematics 63, xii+342 pp. (2015), Springer, Cham · Zbl 1336.60002 · doi:10.1007/978-3-319-23395-6
[75] Tarantola, Albert, Inverse problem theory, xvi+613 pp. (1987), Elsevier Science Publishers, B.V., Amsterdam · Zbl 0875.65001
[76] TAB03 M. K. Tippett, J. L. Anderson, C. H. Bishop, T. M. Hamill, and C. Whitaker, Ensemble square root filters, Mon. Wea. Rev. 131 (2003), 1485-1490.
[77] Tong, Xin T., Performance analysis of local ensemble Kalman filter, J. Nonlinear Sci., 28, 4, 1397-1442 (2018) · Zbl 1394.60038 · doi:10.1007/s00332-018-9453-2
[78] Tong, Xin T.; Majda, Andrew J.; Kelly, David, Nonlinear stability of the ensemble Kalman filter with adaptive covariance inflation, Commun. Math. Sci., 14, 5, 1283-1313 (2016) · Zbl 1370.37009 · doi:10.4310/CMS.2016.v14.n5.a5
[79] WM00 E. A. Wan and R. Van Der Merwe, The unscented Kalman filter for nonlinear estimation, Proceedings of the IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symposium, IEEE, 2000.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.