×

Generalization of partitioned Runge-Kutta methods for adjoint systems. (English) Zbl 1459.65093

Summary: This study computes the gradient of a function of numerical solutions of ordinary differential equations (ODEs) with respect to the initial condition. The adjoint method computes the gradient approximately by solving the corresponding adjoint system numerically. In this context J. M. Sanz-Serna [SIAM Rev. 58, No. 1, 3–33 (2016; Zbl 1339.65243)] showed that when the initial value problem is solved by a Runge-Kutta (RK) method, the gradient can be exactly computed by applying an appropriate RK method to the adjoint system. Focusing on the case where the initial value problem is solved by a partitioned RK (PRK) method, this paper presents a numerical method, which can be seen as a generalization of PRK methods, for the adjoint system that gives the exact gradient.

MSC:

65K10 Numerical optimization and variational techniques
65L06 Multistep, Runge-Kutta and extrapolation methods for ordinary differential equations
65L09 Numerical solution of inverse problems involving ordinary differential equations
65P10 Numerical methods for Hamiltonian systems including symplectic integrators

Citations:

Zbl 1339.65243

Software:

torchdiffeq
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Dimet, F. X.L.; Talagrand, O., Variational algorithms for analysis and assimilation of meteorological observations: theoretical aspects, Tellus A, 38A, 97-110 (1986)
[2] Fichtner, A.; Bunge, H. P.; Igel, H., The adjoint method in seismology: I. theory, Phys. Earth Planet. Inter., 157, 86-104 (2006) · Zbl 1175.86007
[3] Giles, N. A.; Michael, B.; Pierce, H., An introduction to the adjoint approach to design, Flow Turbul. Combust., 65, 393-415 (2000) · Zbl 0996.76023
[4] Chen, R. T.Q.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D. K., Neural ordinary differential equations, (Advances in Neural Information Processing Systems, Vol. 31 (2018)), 6571-6583
[5] Sanz-Serna, J. M., Symplectic Runge-Kutta schemes for adjoint equations, automatic differentiation, optimal control, and more, SIAM Rev., 58, 3-33 (2016) · Zbl 1339.65243
[6] Haber, E.; Ruthotto, L., Stable architectures for deep neural networks, Inverse Problems, 34, Article 014004 pp. (2018) · Zbl 1426.68236
[7] Goodfellow, I.; Bengio, Y.; Courville, A., Deep Learning (2016), MIT Press, http://www.deeplearningbook.org · Zbl 1373.68009
[8] Wang, Z.; Droegemeier, K.; White, L., The adjoint Newton algorithm for large-scale unconstrained optimization in meteorology applications, Comput. Optim. Appl., 10, 283-320 (1998) · Zbl 0912.90265
[9] Wang, Z.; Navon, I. M.; Dimet, F. X.L.; Zou, X., The second order adjoint analysis: Theory and applications, Meteorol. Atmos. Phys., 50, 3-20 (1992)
[10] S. Ito, T. Matsuda, Y. Miyatake, Adjoint-based exact Hessian computation, BIT, in press, https://doi.org/10.1007/s10543-020-00833-0. · Zbl 1470.65126
[11] Ito, S.; Nagao, H.; Yamanaka, A.; Tsukada, Y.; Koyama, T.; Kano, M.; Inoue, J., Data assimilation for massive autonomous systems based on a second-order adjoint method, Phys. Rev. E, 94, Article 043307 pp. (2016)
[12] Thacker, W. C., The role of the hessian matrix in fitting models to measurements, J. Geophys. Res. Ocean., 94, 6177-6196 (1989)
[13] Matsuda, T.; Miyatake, Y., Estimation of ordinary differential equation models with discretization error quantification (2019), arXiv:1907.10565
[14] Hairer, E.; Lubich, C.; Wanner, G., Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Quations (2006), Springer-Verlag: Springer-Verlag Berlin · Zbl 1094.65125
[15] Abia, L.; Sanz-Serna, J. M., Partitioned Runge-Kutta methods for separable Hamiltonian problems, Math. Comp., 60, 617-634 (1993) · Zbl 0777.65042
[16] Sun, G., Symplectic partitioned Runge-Kutta methods, J. Comput. Math., 11, 365-372 (1993) · Zbl 0789.65049
[17] Ketcheson, D. I.; MacDonald, C. B.; Ruuth, S. J., Spatially partitioned embedded Runge-Kutta methods, SIAM J. Numer. Anal., 51, 2887-2910 (2013) · Zbl 1285.65061
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.