
Implicit dual control based on particle filtering and forward dynamic programming. (English) Zbl 1298.93353

Summary: This paper develops a sampling-based approach to implicit dual control. Implicit dual control methods synthesize stochastic control policies by systematically approximating Bellman's stochastic dynamic programming equations. Explicit dual control methods, by contrast, artificially induce probing in the control law by adding a term to the cost function that rewards learning. The proposed implicit dual control approach is novel in that it combines a particle filter with a policy-iteration method for forward dynamic programming. Integrating the two methods yields a complete sampling-based approach to the problem. Implementation is simplified by a specific architecture denoted as an H-block. Practical suggestions are given for reducing the computational load within the H-block for real-time applications. As an example, the method is applied to the control of a stochastic pendulum model with unknown mass, length, initial position and velocity, and unknown sign of its DC gain. Simulation results indicate that active controllers based on the described method can systematically improve closed-loop performance relative to other, more common stochastic control approaches.
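
The role of the particle filter under an unknown gain sign can be illustrated with a minimal sketch. It is not the paper's H-block or its pendulum model. It assumes a hypothetical scalar system x[k+1] = a*x[k] + b*u[k] + noise with measurement y = x + noise, where the sign of b is unknown, and it uses a generic bootstrap particle filter. All names (`bootstrap_pf_step`, `run`) and parameter values (`a`, `q`, `r`) are illustrative assumptions.

```python
import numpy as np

def bootstrap_pf_step(x, b, w, u, y, rng, a=0.9, q=0.1, r=0.1):
    """One predict/update/resample step of a bootstrap particle filter.

    x, b : particle arrays for the state and the unknown gain (assumed model)
    w    : normalized particle weights
    """
    n = x.size
    # Predict: push each particle through its own hypothesized dynamics.
    x = a * x + b * u + q * rng.standard_normal(n)
    # Update: reweight by the Gaussian likelihood of the measurement y = x + noise.
    logw = np.log(w) - 0.5 * ((y - x) / r) ** 2
    w = np.exp(logw - logw.max())  # subtract the max for numerical stability
    w /= w.sum()
    # Resample (multinomial) to limit weight degeneracy.
    idx = rng.choice(n, size=n, p=w)
    return x[idx], b[idx], np.full(n, 1.0 / n)

def run(u_seq, true_b=-1.0, n=2000, seed=0):
    """Simulate the assumed system and return the posterior mass on the true sign."""
    rng = np.random.default_rng(seed)
    x_true = 0.0
    x = rng.standard_normal(n)            # prior on the initial state
    b = rng.choice([-1.0, 1.0], size=n)   # prior: either sign equally likely
    w = np.full(n, 1.0 / n)
    for u in u_seq:
        x_true = 0.9 * x_true + true_b * u + 0.1 * rng.standard_normal()
        y = x_true + 0.1 * rng.standard_normal()
        x, b, w = bootstrap_pf_step(x, b, w, u, y, rng)
    return float(np.mean(b == true_b))

if __name__ == "__main__":
    print("probing input:", run([1.0] * 5))
    print("zero input:   ", run([0.0] * 5))
```

In this toy setting a nonzero input lets the filter resolve the gain sign, while a zero input leaves the two hypotheses indistinguishable. That is the trade-off between control and learning that dual control addresses; the policy-iteration component described in the summary is not sketched here.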

MSC:

93E20 Optimal stochastic control
90C39 Dynamic programming
49N15 Duality theory (optimization)
