zbMATH — the first resource for mathematics

Symblicit algorithms for mean-payoff and shortest path in monotonic Markov decision processes. (English) Zbl 1378.68122
Summary: When treating Markov decision processes (MDPs) with large state spaces, using explicit representations quickly becomes unfeasible. Lately, Wimmer et al. have proposed a so-called symblicit algorithm for the synthesis of optimal strategies in MDPs, in the quantitative setting of expected mean-payoff. This algorithm, based on the strategy iteration algorithm of Howard and Veinott, efficiently combines symbolic and explicit data structures, and uses binary decision diagrams as symbolic representation. The aim of this paper is to show that the new data structure of pseudo-antichains (an extension of antichains) provides another interesting alternative, especially for the class of monotonic MDPs. We design efficient pseudo-antichain based symblicit algorithms (with open source implementations) for two quantitative settings: the expected mean-payoff and the stochastic shortest path. For two practical applications coming from automated planning and \(\mathbf {LTL}\) synthesis, we report promising experimental results with reference to both the run time and the memory consumption. We also show that a variant of pseudo-antichains allows to handle the infinite state spaces underlying the qualitative verification of probabilistic lossy channel systems.
68Q87 Probability in computer science (algorithm analysis, random structures, phase transitions, etc.)
68P05 Data structures
68Q60 Specification and verification (program logics, model checking, etc.)
68T20 Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.)
90C40 Markov and semi-Markov decision processes
Full Text: DOI
[1] Abdulla, PA; Jonsson, B, Verifying programs with unreliable channels, Inf. Comput., 127, 91-101, (1996) · Zbl 0856.68096
[2] Baier, C., Bertrand, N., Schnoebelen, P.: Symbolic verification of communicating systems with probabilistic message losses: liveness and fairness. In: Najm, E., Pradat-Peyre, J., Donzeau-Gouge, V. (eds.) FORTE, volume 4229 of Lecture Notes in Computer Science, pp. 212-227. Springer (2006) · Zbl 1225.68108
[3] Baier, C., Bertrand, N., Schnoebelen, P.: Verifying nondeterministic probabilistic channel systems against \(ω \)-regular linear-time properties. ACM Trans. Comput. Log. 9(1), Article No. 5 (2007) · Zbl 1367.68181
[4] Baier, C.: Principles of model checking. MIT Press, Cambridge (2008) · Zbl 1179.68076
[5] Baier, C; Katoen, J-P; Hermanns, H; Wolf, V, Comparative branching-time semantics for Markov chains, Inf. Comput., 200, 149-214, (2005) · Zbl 1101.68053
[6] Bertrand, N; Schnoebelen, P, Computable fixpoints in well-structured symbolic model checking, Form. Methods Syst. Des., 43, 233-267, (2013) · Zbl 1291.68247
[7] Bertsekas, D.P., Tsitsiklis, J.: Neuro-dynamic programming. Athena Scientific, Anthropological Field Studies, Belmont (1996) · Zbl 0924.68163
[8] Bertsekas, DP; Tsitsiklis, JN, An analysis of stochastic shortest path problems, Math. Oper. Res., 16, 580-595, (1991) · Zbl 0751.90077
[9] Blum, A.L., Langford, J.C.: Probabilistic planning in the graphplan framework. In: Biundo, S., Fox, M. (eds.) Recent Advances in AI Planning, pp. 319-332. Springer (2000) · Zbl 1189.68039
[10] Bohy, A., Bruyère, V., Filiot, E., Jin, N., Raskin, J.-F.: Acacia+, a tool for LTL synthesis. In: Madhusudan, P., Seshia, S.A. (eds.) CAV, volume 7358 of Lecture Notes in Computer Science, pp. 652-657. Springer (2012) · Zbl 0796.60073
[11] Bohy, A., Bruyère, V., Filiot, E., Raskin, J.-F.: Synthesis from LTL specifications with mean-payoff objectives. CoRR, abs/1210.3539 (2012) · Zbl 1381.68149
[12] Bohy, A., Bruyère, V., Filiot, E., Raskin, J.-F.: Synthesis from LTL specifications with mean-payoff objectives. In: Piterman, N., Smolka, S.A. (eds.) TACAS, volume 7795 of Lecture Notes in Computer Science, pp. 169-184. Springer (2013) · Zbl 1381.68149
[13] Bohy, A., Bruyère, V., Raskin, J.: Symblicit algorithms for optimal strategy synthesis in monotonic markov decision processes. In: Chatterjee, K., Ehlers, R., Jha, S. (eds.) SYNT, volume 157 of EPTCS, pp. 51-67 (2014)
[14] Bryant, RE, Graph-based algorithms for Boolean function manipulation, IEEE Trans. Comput., 35, 677-691, (1986) · Zbl 0593.94022
[15] Buchholz, P, Exact and ordinary lumpability in finite Markov chains, J. Appl. Probab., 31, 59-75, (1994) · Zbl 0796.60073
[16] Burch, JR; Clarke, EM; McMillan, KL; Dill, DL; Hwang, LJ, Symbolic model checking: \(10^{20}\) states and beyond, Inf. Comput., 98, 142-170, (1992) · Zbl 0753.68066
[17] Chatterjee, K., Henzinger, T.A., Jobstmann, B., Singh, R.: In: Abdulla, P.A., Leino, K.R.M. (eds.) TACAS, volume 6605 of Lecture Notes in Computer Science. Lecture Notes in Computer Science, pp. 267-271. Springer (2011) · Zbl 0756.68035
[18] Clarke, E.M., Emerson, E.A.: Design and synthesis of synchronization skeletons using branching-time temporal logic. In: Kozen, D. (ed.) Logic of Programs, volume 131 of Lecture Notes in Computer Science, pp. 52-71. Springer (1981) · Zbl 0751.90077
[19] de Alfaro, L.: Computing minimum and maximum reachability times in probabilistic systems. In: Baeten, J.C.M., Mauw, S. (eds.) CONCUR, volume 1664 of Lecture Notes in Computer Science, pp. 66-81. Springer (1999) · Zbl 0949.93082
[20] Derisavi, S; Hermanns, H; Sanders, WH, Optimal state-space lumping in Markov chains, Inf. Process. Lett., 87, 309-315, (2003) · Zbl 1189.68039
[21] Doyen, L., Raskin, J.-F.: Improved algorithms for the automata-based approach to model-checking. In: Grumberg, O., Huth, M. (eds.) TACAS, volume 4424 of Lecture Notes in Computer Science, pp. 451-465. Springer (2007) · Zbl 1186.68285
[22] Fikes, RE; Nilsson, NJ, STRIPS:a new approach to the application of theorem proving to problem solving, Artif. Intell., 2, 189-208, (1972) · Zbl 0234.68036
[23] Filar, J., Vrieze, K.: Competitive Markov Decision Processes. Springer, Berlin (1997) · Zbl 0934.91002
[24] Filiot, E; Jin, N; Raskin, J-F, Antichains and compositional algorithms for LTL synthesis, Form. Methods Syst. Des., 39, 261-296, (2011) · Zbl 1258.03046
[25] Finkel, A, Decidability of the termination problem for completely specified protocols, Distrib. Comput., 7, 129-135, (1994)
[26] Finkel, A., Schnoebelen, P.: Well-structured transition systems everywhere!. Theor. Comput. Sci. 256(1-2), 63-92 (2001) · Zbl 0973.68170
[27] Fujita, M; McGeer, PC; Yang, JC-Y, Multi-terminal binary decision diagrams: an efficient data structure for matrix representation, Form. Methods Syst. Des., 10, 149-169, (1997)
[28] Hansson, H; Jonsson, B, A logic for reasoning about time and reliability, Form. Asp. Comput., 6, 512-535, (1994) · Zbl 0820.68113
[29] Hartmanns, A.: Modest: a unified language for quantitative models. In: FDL, IEEE, pp. 44-51 (2012)
[30] Higman, G, Ordering by divisibility in abstract algebras, Proc. Lond. Math. Soc., 3, 326-336, (1952) · Zbl 0047.03402
[31] Howard, R.A.: Dynamic Programming and Markov Processes. Wiley, New Jersey (1960) · Zbl 0091.16001
[32] Jansen, D.N., Katoen, J.-P., Oldenkamp, M., Stoelinga, M., Zapreev, I.S.: How fast and fat is your probabilistic model checker? an experimental performance comparison. In: Yorav, K. (ed.) Haifa Verification Conference, volume 4899 of Lecture Notes in Computer Science, pp. 69-85. Springer (2007) · Zbl 0856.68096
[33] Katoen, J-P; Zapreev, IS; Hahn, EM; Hermanns, H; Jansen, DN, The ins and outs of the probabilistic model checker MRMC, Perform. Eval., 68, 90-104, (2011)
[34] Kemeny, J.G., Snell, J.L.: Finite Markov Chains. Van Nostrand Company, Inc, New York (1960) · Zbl 0089.13704
[35] Kwiatkowska, M.Z., Norman, G., Parker, D.: PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) CAV, volume 6806 of Lecture Notes in Computer Science, pp. 585-591. Springer (2011)
[36] Larsen, KG; Skou, A, Bisimulation through probabilistic testing, Inf. Comput., 94, 1-28, (1991) · Zbl 0756.68035
[37] Majercik, SM; Littman, ML; Simmons, RG (ed.); Veloso, MM (ed.); Smith, SF (ed.), Maxplan: a new approach to probabilistic planning, 86-93, (1998), Palo Alto
[38] Pachl, J.K.: Protocol description and analysis based on a state transition model with channel expressions. In: Rudin, H., West, C.H. (eds.) PSTV, Proceedings of the IFIP WG6.1, pp. 207-219. North-Holland (1987)
[39] Paige, R; Tarjan, RE, Three partition refinement algorithms, SIAM J. Comput., 16, 973-989, (1987) · Zbl 0654.68072
[40] Parker, D.: Personal communication, 2013-11-20 · Zbl 0654.68072
[41] Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New Jersey (1994) · Zbl 0829.90134
[42] Russell, S.J., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs (1995) · Zbl 0835.68093
[43] Veinott, A.F.: On finding optimal policies in discrete dynamic programming with no discounting. Ann. Math. Stat. 37(5), 1284-1294 (1966) · Zbl 0149.16301
[44] Von Essen, C.: Personal communication, 2013-11-20 · Zbl 0856.68096
[45] Von Essen, C., Jobstmann, B.: Synthesizing efficient controllers. In: Kuncak, V., Rybalchenko, A. (eds.) VMCAI, volume 7148 of Lecture Notes in Computer Science, pp. 428-444. Springer (2012) · Zbl 1326.68190
[46] Wimmer, R., Braitling, B., Becker, B., Hahn, E.M., Crouzen, P., Hermanns, H., Dhama, A., Theel, O.E.: Symblicit calculation of long-run averages for concurrent probabilistic systems. In: QEST, IEEE Computer Society, pp. 27-36 (2010)
[47] Wulf, M.D., Doyen, L., Henzinger, T.A., Raskin, J.-F.: Antichains: a new algorithm for checking universality of finite automata. In: Ball, T., Jones, R.B. (eds.) CAV, volume 4144 of Lecture Notes in Computer Science, pp. 17-30. Springer (2006) · Zbl 1188.68171
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.