Newton step methods for AD of an objective defined using implicit functions. (English) Zbl 1453.65105

Summary: We consider the problem of computing derivatives of an objective that is defined using implicit functions; i.e., implicit variables are computed by solving equations that are often nonlinear and solved by an iterative process. If one were to apply Algorithmic Differentiation (AD) directly, one would differentiate the iterative process. In this paper we present the Newton step methods for computing derivatives of the objective. These methods make it easy to take advantage of sparsity, forward mode, reverse mode, and other AD techniques. We prove that the partial Newton step method works if the number of steps is equal to the order of the derivatives. The full Newton step method obtains two derivatives order for each step except for the first step. There are alternative methods that avoid differentiating the iterative process; e.g., the method implemented in ADOL-C. An optimal control example demonstrates the advantage of the Newton step methods when computing both gradients and Hessians. We also discuss the Laplace approximation method for nonlinear mixed effects models as an example application.


65H10 Numerical computation of solutions to systems of equations
26B10 Implicit function theorems, Jacobians, transformations with several variables
49M15 Newton-type methods
65K05 Numerical mathematical programming methods
65Y99 Computer aspects of numerical algorithms
Full Text: DOI


[1] Bartholomew-Biggs, M.; Brown, S.; Christianson, B., automatic differentiation of algorithms, J. Comput. Appl. Math., 124, 117-190, (2000) · Zbl 0994.65020
[2] algorithmic differentiation of implicit functions and optimal valuesAdvances in Automatic DifferentiationSpringer VerlagBerlin Heidelberg20086777
[3] Bock, H. G.; Kostina, E.; Kostyukova, O., covariance matrices for parameter estimates of constrained parameter estimation problems, SIAM J. Matrix Anal. Appl., 29, 626-642, (2007) · Zbl 1138.65008
[4] Boik, R. J., an implicit function approach to constrained optimization with applications to asymptotic expansions, J. Multivar. Anal., 99, 465-489, (2008) · Zbl 1132.41320
[5] Brent, R.; Kung, H., fast algorithms for manipulating formal power series, J. Assoc. Comput. Mach., 25, 581-595, (1978) · Zbl 0388.68052
[6] Christianson, B., reverse accumulation and implicit functions, Optim. Methods Softw., 9, 307-322, (1998) · Zbl 0922.65013
[7] Efron, B.; Hinkley, D. V., assessing the accuracy of the maximum likelihood estimator: observed versus expected Fisher information, Biometrika, 65, 457-482, (1978) · Zbl 0401.62002
[8] Gibbons, R. D.; Hedeker, D., applications of mixed-effects models in biostatistics, Ind. J. Stat. Ser. B (1960-2002), 62, 70-103, (2000) · Zbl 0973.62099
[9] Gilbert, J. C., automatic differentiation and iterative processes, Optim. Methods Softw., 1, 13-21, (1992)
[10] collected matrix derivative results for forward and reverse mode algorithmic differentiationAdvances in Automatic DifferentiationSpringer Verlag20083544
[11] Griewank, A.; Walther, A., algorithm 799: revolve: an implementation of checkpointing for the reverse or adjoint mode of computational differentiation, ACM Trans. Math. Softw., 26, 19-45, (2000) · Zbl 1137.65330
[12] Griewank, A.; Walther, A., Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, (2008), SIAM, Philadelphia, PA · Zbl 1159.65026
[13] Hestenes, M. R.; Stiefel, E., methods of conjugate gradients for solving linear systems, J. Res. Natl. Bureau Standards, 49, 409-436, (1952) · Zbl 0048.09901
[14] Junkins, J. L.; Turner, J. D.; Majji, M., generalizations and applications of the Lagrange implicit function theorem, J. Astronaut. Sci., 57, 313-345, (2009)
[15] Kalashnikov, V. V.; Dempe, S.; Pérez-Valdés, G. A.; Kalashnykova, N. I.; Camacho-Vallejo, J. F., bilevel programming and applications, Math. Probl. Eng., 2015, 465-489, (2015)
[16] Kalmikov, A. G.; Heimbach, P., A hessian-based method for uncertainty quantification in global Ocean state estimation, SIAM J. Sci. Comput., 36, S267-S295, (2014) · Zbl 1311.35321
[17] Kedem, G., automatic differentiation of computer programs, ACM Trans. Math. Softw., 6, 150-165, (1980) · Zbl 0441.68041
[18] Kristensen, K.; Nielsen, A.; Berg, C. W.; Skaug, H.; Bell, B. M., TMB: automatic differentiation and Laplace approximation, J. Stat. Softw., 70, 1-21, (2016)
[19] Park, C.; Scheeres, D. J., determination of optimal feedback terminal controllers for general boundary conditions using generating functions, Automatica, 42, 869-875, (2006) · Zbl 1137.49020
[20] Pinheiro, J. C.; Bates, D. M., approximations to the log-likelihood function in the nonlinear mixed-effects model, J. Comput. Graph. Stat., 4, 12-35, (1995)
[21] Rue, H.; Martino, S., approximate Bayesian inference for hierarchical Gaussian Markov random field models, J. Stat. Plann. Inf., 137, 3177-3192, (2007) · Zbl 1114.62025
[22] Savard, G.; Gauvin, J., steepest descent direction for the nonlinear bilevel programming problem, Oper. Res. Lett., 15, 265-273, (1994) · Zbl 0816.90122
[23] Skaug, H. J.; Fournier, D. A., automatic approximation of the marginal likelihood in non-Gaussian hierarchical models, Comput. Stat. Data Anal., 51, 699-709, (2006) · Zbl 1157.65317
[24] Topputo, F.; Bernelli-Zazzera, F., Approximate Solutions to Nonlinear Optimal Control Problems in Astrodynamics, (2013), ISRN Aerospace Engineering
[25] Wagner, M.; Walther, A.; Schaefer, B. J., on the efficient computation of high-order derivatives for implicitly defined functions, Comput. Phys. Commun., 181, 756-764, (2010) · Zbl 1205.65131
[26] Structured higher-order algorithmic differentiation in the forward and reverse mode with application in optimum experimental design, Ph.D. diss., Humboldt-Universität zu Berlin2011
[27] Walter, S. F.; Lehmann, L., algorithmic differentiation in python with algopy, J. Comput. Sci., 4, 334-344, (2013)
[28] ADOL-C: A Package for the Automatic Differentiation of Algorithms Written in C/C++, Computational infrastructure for operations research: coin-or (2014), Available at
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.