×

Sequential quadratic programming with indefinite Hessian approximations for nonlinear optimum experimental design for parameter estimation in differential-algebraic equations. (English) Zbl 1326.65076

Heidelberg: Univ. Heidelberg, Naturwissenschaftlich-Mathematische Gesamtfakultät (Diss.). 222 p. (2015).
Summary: In this thesis we develop algorithms for the numerical solution of problems from nonlinear optimum experimental design (OED) for parameter estimation in differential-algebraic equations. These OED problems can be formulated as special types of path- and control-constrained optimal control (OC) problems. The objective is to minimize a functional on the covariance matrix of the model parameters that is given by first-order sensitivities of the model equations. Additionally, the objective is nonlinearly coupled in time, which make OED problems a challenging class of OC problems. For their numerical solution, we propose a direct multiple shooting parameterization to obtain a structured nonlinear programming problem (NLP). An augmented system of nominal and variational states for the model sensitivities is parameterized on multiple shooting intervals and the objective is decoupled by means of additional variables and constraints. In the resulting NLP, we identify several structures that allow to evaluate derivatives at greatly reduced costs compared to a standard OC formulation.
For the solution of the block-structured NLPs, we develop a new sequential quadratic programming (SQP) method. Therein, partitioned quasi-Newton updates are used to approximate the block-diagonal Hessian of the Lagrangian. We analyze a model problem with indefinite, block-diagonal Hessian and prove that positive definite approximations of the individual blocks prevent superlinear convergence. For an OED model problem, we show that more and more negative eigenvalues appear in the Hessian as the multiple shooting grid is refined and confirm the detrimental impact of positive definite Hessian approximations. Hence, we propose indefinite SR1 updates to guarantee fast local convergence. We develop a filter line search globalization strategy that accepts indefinite Hessians based on a new criterion derived from the proof of global convergence. BFGS updates with a scaling strategy to prevent large eigenvalues are used as fallback if the SR1 update does not promote convergence. For the solution of the arising sparse and nonconvex quadratic subproblems, a parametric active set method with inertia control within a Schur complement approach is developed. It employs a symmetric, indefinite LBL\(^{\mathrm T}\)-factorization for the large, sparse KKT matrix and maintains and updates QR-factors of a small and dense Schur complement.
The new methods are complemented by two C++ implementations: muse transforms an OED or OC problem instance to a structured NLP by means of direct multiple shooting. A special feature is that fully independent grids for controls, states, path constraints, and measurements are maintained. This provides higher flexibility to adapt the NLP formulation to the characteristics of the problem at hand and facilitates comparison of different formulations in the light of the lifted Newton method. The software package blockSQP is an implementation of the new SQP method that uses a newly developed variant of the quadratic programming solver qpOASES. Numerical results are presented for a benchmark collection of OED and OC problems that show how SR1 approximations improve local convergence over BFGS. The new method is then applied to two challenging OED applications from chemical engineering. Its performance compares favorably to an available existing implementation.

MSC:

65K10 Numerical optimization and variational techniques
49J15 Existence theories for optimal control problems involving ordinary differential equations
49M37 Numerical methods based on nonlinear programming
65L80 Numerical methods for differential-algebraic equations
62K05 Optimal statistical designs
62F10 Point estimation
90C30 Nonlinear programming
90C35 Programming involving graphs or networks
PDFBibTeX XMLCite
Full Text: Link