Analysis of an identification algorithm arising in the adaptive estimation of Markov chains. (English) Zbl 0685.93063

The paper is devoted to the adaptive estimation problem for partially observable finite-state Markov chains. The authors describe an algorithm which utilizes the recursive equation characterizing the conditional distribution of the state of the Markov chain, given the past observations. Several important analytical properties of this algorithm are studied. In particular, an interesting connection is established with the ordinary differential equation method for stochastic approximations. This allows to give a deep analysis of the algorithms suggested here. We find in the paper clearly formulated statements as well as their detailed proofs. Some useful related topics are also discussed.
Reviewer: J.Stoyanov


93E10 Estimation and detection in stochastic control theory
93C40 Adaptive control/observation systems
60J10 Markov chains (discrete-time Markov processes on discrete state spaces)
Full Text: DOI


[1] K. J. Åström, Optimal control of Markov processes with incomplete state information,J. Math. Anal. Appl.,10 (1965), 174–205. · Zbl 0137.35803
[2] J. S. Baras and A. J. Dorsey, Stochastic control of two partially observed competing queues,IEEE Trans. Automat. Control,26 (1981), 1106–1117. · Zbl 0469.93078
[3] J. S. Baras and A. J. Dorsey, Adaptive control of two competing queues,Proceedings of the Second IEEE Annual Joint Conference on Infocom, San Diego, CA, 1983, pp. 427–435.
[4] Y. M. El-Fattah, Gradient approach for recursive estimation and control in finite Markov chains,Adv. in Appl. Probab.,13 (1981), 778–803. · Zbl 0475.60051
[5] H. Furstenberg and H. Kesten, Products of random matrices,Ann. Math. Statist.,31 (1960), 457–469. · Zbl 0137.35501
[6] G. C. Goodwin, P. J. Ramadge, and P. E. Caines, A globally convergent adaptive predictor,IEEE Trans. Automat. Control,25 (1980), 449–456. · Zbl 0429.93034
[7] O. Hernández-Lerma and S. I. Marcus, Adaptive control of service in queueing systems,Systems Control Lett.,3 (1983), 283–289. · Zbl 0534.90037
[8] O. Hernández-Lerma and S. I. Marcus, Optimal adaptive control of priority assignment in queueing systems,Systems Control Lett.,4 (1984), 65–72. · Zbl 0529.90045
[9] K. Hsu and S. I. Marcus, A general martingale approach to discrete-time stochastic control and estimation,IEEE Trans. Automat. Control,24 (1979), 580–583. · Zbl 0422.93086
[10] M. Iosifescu and R. Theodorescu,Random Processes and Learning, Springer-Verlag, Berlin, 1969. · Zbl 0194.51101
[11] T. Kaijser, A limit theorem for partially observed Markov chains,Ann. Probab.,3 (1975), 677–696. · Zbl 0315.60038
[12] T. Kaijser, A limit theorem for Markov chains in compact spaces with applications to products of random matrices,Duke Math. J.,45 (1978), 311–349. · Zbl 0389.60052
[13] R. L. Kashyap, Identification of a transition matrix of a Markov chain from noisy measurements of state,IEEE Trans. Inform. Theory,16 (1970), 161–166. · Zbl 0193.45802
[14] M. Kolonko, The average-optimal adaptive control of a Markov renewal model in presence of an unknown parameter,Math. Operationforsch. Statist. Ser. Optim.,13 (1982), 567–591. · Zbl 0518.90092
[15] P. R. Kumar, A survey of some results in stochastic adaptive control,SIAM J. Control Optim.,23 (1985), 329–380. · Zbl 0571.93038
[16] H. J. Kushner, Stochastic approximation with discontinuous dynamics and state dependent noise: w.p.1 and weak convergence,J. Math. Anal. Appl.,82 (1981), 527–542. · Zbl 0473.62074
[17] H. J. Kushner, An averaging method for stochastic approximations with discontinuous dynamics, constraints, and state dependent noise, inRecent Advances in Statistics (M. H. Rizvi, J. S. Rustagi, and O. Siegmund, eds.), pp. 211–235, Academic Press, New York, 1983.
[18] H. J. Kushner,Approximation and Weak Convergence Methods for Random Processes with Application to Stochastic System Theory, MIT Press, Cambridge, MA, 1984. · Zbl 0551.60056
[19] H. J. Kushner and D. S. Clark,Stochastic Approximation Methods for Constrained and Unconstrained Systems, Applied Mathematical Sciences, Vol. 26, Springer-Verlag, New York, 1978. · Zbl 0381.60004
[20] H. J. Kushner and A. Shwartz, An invariant measure approach to the convergence of stochastic approximations with state dependent noise,SIAM J. Control Optim.,22 (1984), 13–27. · Zbl 0535.62069
[21] L. Ljung and T. Söderström,Theory and Practice of Recursive Identification, MIT Press, Cambridge, MA, 1983.
[22] D.-J. Ma and A. M. Makowski, A Simple Problem of Flow Control, II: Implementation of Threshold Policies via Stochastic Approximations, Technical Report 87-100, Systems Research Center, University of Maryland, 1987.
[23] M. Métivier, On stochastic algorithms in adaptive filtering, inStochastic Processes and Their Applications (K. Ito and T. Hida, eds.), pp. 134–156, Lecture Notes in Mathematics, Vol. 1203, Springer-Verlag, Berlin, 1986.
[24] M. Métivier and P. Priouret, Théorèmes de convergence presque sure pure une classe d’algorithmes stochastiques à pas décroissant,Probab. Theory Rel. Fields,74 (1987), 403–428. · Zbl 0588.62153
[25] M. Schäl, Estimation and control in discounted dynamic programing,Stochastics,20 (1987), 51–71. · Zbl 0621.90092
[26] J. H. van Schuppen, Convergence results for continuous-time adaptive stochastic filtering algorithms,J. Math. Anal. Appl.,96 (1983), 209–225. · Zbl 0523.93057
[27] E. Seneta,Nonnegative Matrices and Markov Chains, Springer-Verlag, New York, 1981. · Zbl 0471.60001
[28] A. Shwartz, Convergence of Stochastic Approximations: The Invariant Measure Approach, Ph.D. Thesis, Division of Engineering, Brown University, Providence, RI, 1982.
[29] A. Shwartz and A. M. Makowski, An optimal adaptive scheme for two competing queues with constraints, inProceedings of the 7th International Conference on Analysis and Optimization of Systems, Antibes, 1986, pp. 515–532.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.