Partially observable game-theoretic agent programming in Golog. (English) Zbl 1434.68583

Summary: In this paper, we present the agent programming language POGTGolog (Partially Observable Game-Theoretic Golog), which integrates explicit agent programming in Golog with game-theoretic multi-agent planning in partially observable stochastic games. In this framework, we assume one team of cooperative agents acting under partial observability, where the agents may also have different initial belief states and not necessarily the same rewards. POGTGolog allows for specifying a partial control program in a high-level logical language, which is then completed by an interpreter in an optimal way. To this end, we define a formal semantics of POGTGolog programs in terms of Nash equilibria, and we then specify a POGTGolog interpreter that computes one of these Nash equilibria.


68T40 Artificial intelligence for robotics
68N15 Theory of programming languages
68T20 Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.)
68T42 Agent technology and artificial intelligence
91A15 Stochastic games, stochastic differential games
91A80 Applications of game theory
Full Text: DOI Link


[1] Bacchus, F.; Halpern, J. Y.; Levesque, H. J., Reasoning about noisy sensors and effectors in the situation calculus, Artif. Intell., 111, 1/2, 171-208 (1999) · Zbl 0996.68192
[2] Belle, V.; Lakemeyer, G., Reasoning about imperfect information games in the epistemic situation calculus, (Proceedings AAAI-2010 (2010), AAAI Press), 255-260
[3] Belle, V.; Levesque, H. J., Reasoning about continuous uncertainty in the situation calculus, (Proceedings IJCAI-2013 (2013), IJCAI/AAAI), 732-738
[4] Belle, V.; Levesque, H. J., ALLEGRO: belief-based programming in stochastic dynamical domains, (Proceedings IJCAI-2015 (2015), AAAI Press), 2762-2769
[5] Bordini, R. H.; Fisher, M.; Visser, W.; Wooldridge, M., Model checking rational agents, IEEE Intell. Syst., 19, 5, 46-52 (2004)
[6] Bordini, R. H.; Fisher, M.; Visser, W.; Wooldridge, M., Verifying multi-agent programs by model checking, Auton. Agents Multi-Agent Syst., 12, 2, 239-256 (2006)
[7] Boutilier, C.; Reiter, R.; Price, B., Symbolic dynamic programming for first-order MDPs, (Proceedings IJCAI-2001 (2001), Morgan Kaufmann), 690-700
[8] Boutilier, C.; Reiter, R.; Soutchanski, M.; Thrun, S., Decision-theoretic, high-level agent programming in the situation calculus, (Proceedings AAAI-2000 (2000), AAAI Press/MIT Press), 355-362
[9] De Giacomo, G.; Lesperance, Y.; Pearce, A. R., Situation-calculus-based programs for representing and reasoning about game structures, (Proceedings KR-2010 (2010), AAAI Press), 445-455
[10] De Giacomo, G.; Lesperance, Y.; Pearce, A. R., Situation calculus game structures and GDL, (Proceedings ECAI-2016 (2016), IOS Press), 408-416 · Zbl 1396.91011
[11] Dylla, F.; Ferrein, A.; Lakemeyer, G., Specifying multirobot coordination in ICPGolog – from simulation towards real robots, (Proceedings AOS-2003 (2003))
[12] Emery-Montemerlo, R.; Gordon, G.; Schneider, J.; Thrun, S., Approximate solutions for partially observable stochastic games with common payoffs, (Proceedings AAMAS-2004 (2004), IEEE Computer Society), 136-143
[13] Engesser, T.; Bolander, T.; Mattmüller, R.; Nebel, B., Cooperative epistemic multi-agent planning for implicit coordination, (Proceedings of the M4M Workshop at ICLA-2017 (2017)), 75-90
[14] Engesser, T.; Mattmüller, R.; Nebel, B.; Thielscher, M., Game description language and dynamic epistemic logic compared, (Proceedings IJCAI-2018 (2018), IJCAI), 1795-1802
[15] Farinelli, A.; Finzi, A.; Lukasiewicz, T., Team programming in Golog under partial observability, (Proceedings IJCAI-2007 (2007), Morgan Kaufmann), 2097-2102
[16] Ferrein, A.; Fritz, C.; Lakemeyer, G., On-line decision-theoretic Golog for unpredictable domains, (Proceedings KI-2004. Proceedings KI-2004, LNCS/LNAI, vol. 3238 (2004), Springer), 322-336
[17] Ferrein, A.; Fritz, C.; Lakemeyer, G., Using Golog for deliberation and team coordination in robotic soccer, Künstl. Intell., 1, 24-43 (2005)
[18] Finzi, A.; Lukasiewicz, T., Game-theoretic agent programming in Golog, (Proceedings ECAI-2004 (2004), IOS Press), 23-27
[19] Finzi, A.; Lukasiewicz, T., Relational Markov games, (Proceedings JELIA-2004. Proceedings JELIA-2004, LNCS/LNAI, vol. 3229 (2004), Springer), 320-333 · Zbl 1111.91305
[20] Finzi, A.; Lukasiewicz, T., Game-theoretic agent programming in Golog under partial observability, (Proceedings GTDT-2005 (2005))
[21] Finzi, A.; Lukasiewicz, T., Game-theoretic Golog under partial observability, (Proceedings AAMAS-2005 (2005), ACM Press), 1301-1302
[22] Finzi, A.; Lukasiewicz, T., Game-theoretic agent programming in Golog under partial observability, (Proceedings KI-2006. Proceedings KI-2006, LNCS/LNAI, vol. 4314 (2007), Springer), 113-127
[23] Finzi, A.; Lukasiewicz, T., Adaptive multi-agent programming in GTGolog, (Proceedings ECAI-2006 (2006), IOS Press), 753-754
[24] Finzi, A.; Lukasiewicz, T., Adaptive multi-agent programming in GTGolog, (Proceedings KI-2006. Proceedings KI-2006, LNCS/LNAI, vol. 4314 (2007), Springer), 389-403
[25] Finzi, A.; Pirri, F.; Reiter, R., Open-world planning in the situation calculus, (Proceedings AAAI-2000 (2000), AAAI Press), 754-760
[26] Finzi, A.; Pirri, F., Combining probabilities, failures and safety in robot control, (Proceedings IJCAI-2001 (2001), Morgan Kaufmann), 1331-1336
[27] Gardiol, N. H.; Pack Kaelbling, L., Envelope-based planning in relational MDPs, (Proceedings NIPS-2003 (2003), MIT Press)
[28] Gmytrasiewicz, P. J.; Doshi, P., Interactive POMDPs: properties and preliminary results, (Proc. AAMAS-2004 (2004), IEEE Computer Society), 1374-1375
[29] Goldman, C. V.; Zilberstein, S., Optimizing information exchange in cooperative multi-agent systems, (Proceedings AAMAS-2003 (2003), ACM Press), 137-144
[30] Govindan, S.; Wilson, R., A global Newton method to compute Nash equilibria, J. Econ. Theory, 110, 1, 65-86 (2003) · Zbl 1042.91001
[31] Govindan, S.; Wilson, R., Computing Nash equilibria by iterated polymatrix approximation, J. Econ. Dyn. Control, 28, 1229-1241 (2004) · Zbl 1200.91019
[32] Guestrin, C.; Koller, D.; Gearhart, C.; Kanodia, N., Generalizing plans to new environments in relational MDPs, (Proceedings IJCAI-2003 (2003), Morgan Kaufmann), 1003-1010
[33] Herzig, A.; Troquard, N., Knowing how to play: uniform choices in logics of agency, (Proceedings AAMAS-2006 (2006), ACM Press), 209-216
[34] Horák, K.; Bosanský, B.; Pechoucek, M., Heuristic search value iteration for one-sided partially observable stochastic games, (Proceedings AAAI-2017 (2017), AAAI Press), 558-564
[35] Kearns, M.; Mansour, Y.; Singh, S., Fast planning in stochastic games, (Proceedings UAI-2000 (2000), Morgan Kaufmann), 309-316
[36] Kearns, M.; Littman, M. L.; Singh, S., Graphical models for game theory, (Proceedings UAI-2001 (2001), Morgan Kaufmann), 253-260
[37] Kominis, F.; Geffner, H., Beliefs in multiagent planning: from one agent to many, (Proceedings AAAI-2015 (2015), AAAI Press), 147-155
[38] Kumar, A.; Zilberstein, S., Dynamic programming approximations for partially observable stochastic games, (Proceedings FLAIRS-2009 (2009)), 547-552
[39] La Mura, P., Game networks, (Proceedings UAI-2000 (2000), Morgan Kaufmann), 335-342
[40] Lemke, C. E.; Howson, J. T., Equilibrium points of bimatrix games, J. Soc. Ind. Appl. Math., 12, 2, 413-423 (1964) · Zbl 0128.14804
[41] Littman, M. L., Markov games as a framework for multi-agent reinforcement learning, (Proceedings ICML-1994 (1994), Morgan Kaufmann), 157-163
[42] Marthi, B.; Russell, S. J.; Latham, D.; Guestrin, C., Concurrent hierarchical reinforcement learning, (Proceedings IJCAI-2005 (2005), Professional Book Center), 779-785
[43] Martin, M.; Geffner, H., Learning generalized policies from planning examples using concept languages, Appl. Intell., 20, 1, 9-19 (2004) · Zbl 1078.68713
[44] Martin, Y.; Narasamdya, I.; Thielscher, M., Knowledge of other agents and communicative actions in the fluent calculus, (Proceedings KR-2004 (2004), AAAI Press), 623-633
[45] McCarthy, J.; Hayes, P. J., Some philosophical problems from the standpoint of artificial intelligence, (Machine Intelligence, vol. 4 (1969)), 463-502 · Zbl 0226.68044
[46] McKelvey, R. D.; McLennan, A. M.; Turocy, T. L., Gambit: software tools for game theory (2016), available at
[47] Muise, C.; Belle, V.; Felli, P.; McIlraith, S. A.; Miller, T.; Pearce, A. R.; Sonenberg, L., Planning over multi-agent epistemic states: a classical planning approach, (Proceedings AAAI-2015 (2015), AAAI Press), 3327-3334
[48] Nair, R.; Tambe, M.; Yokoo, M.; Pynadath, D. V.; Marsella, S., Taming decentralized POMDPs: towards efficient policy computation for multiagent settings, (Proceedings IJCAI-2003 (2003), Morgan Kaufmann), 705-711
[49] Owen, G., Game Theory (1982), Academic Press · Zbl 0159.49201
[50] Pack Kaelbling, L.; Littman, M. L.; Cassandra, A. R., Planning and acting in partially observable stochastic domains, Artif. Intell., 101, 1/2, 99-134 (1998) · Zbl 0908.68165
[51] Peshkin, L.; Kim, K.-E.; Meuleau, N.; Pack Kaelbling, L., Learning to cooperate via policy search, (Proceedings UAI-2000 (2000), Morgan Kaufmann), 489-496
[52] Pinto, J., Integrating discrete and continuous change in a logical framework, Comput. Intell., 14, 1, 39-88 (1998)
[53] Pirri, F.; Reiter, R., Some contributions to the metatheory of the situation calculus, J. ACM, 46, 3, 325-362 (1999) · Zbl 1065.68627
[54] Poole, D., The independent choice logic for modelling multiple agents under uncertainty, Artif. Intell., 94, 1/2, 7-56 (1997) · Zbl 0902.03017
[55] Puterman, M. L., Markov Decision Processes: Discrete Stochastic Dynamic Programming (1994), Wiley · Zbl 0829.90134
[56] Pynadath, D. V.; Tambe, M., The communicative multiagent team decision problem: analyzing teamwork theories and models, J. Artif. Intell. Res., 16, 389-423 (2002) · Zbl 1056.68137
[57] Rens, G.; Meyer, T. A.; Lakemeyer, G., A logic for reasoning about decision-theoretic projections, (Proceedings ICAART-2015 (Revised Selected Papers) (2015)), 79-99
[58] Reiter, R., The frame problem in the situation calculus: a simple solution (sometimes) and a completeness result for goal regression, (Artificial Intelligence and Mathematical Theory of Computation: Papers in Honor of John McCarthy (1991), Academic Press), 359-380 · Zbl 0755.68124
[59] Reiter, R., Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems (2001), MIT Press · Zbl 1018.03022
[60] Roth, M.; Simmons, R.; Veloso, M., Reasoning about joint beliefs for execution-time communication decisions, (Proceedings AAMAS-2005 (2005), ACM Press), 786-793
[61] Scherl, R. B.; Levesque, H. J., The frame problem and knowledge-producing actions, (Proceedings AAAI-1993 (1993), AAAI Press/MIT Press), 689-697
[62] Shapiro, S.; Lesperance, Y.; Levesque, H. J., The cognitive agents specification language and verification environment for multiagent systems, (Proceedings AAMAS-2002 (2002), ACM Press), 19-26
[63] Sanner, S.; Kersting, K., Symbolic dynamic programming for first-order POMDPs, (Proceedings AAAI-2010 (2010), AAAI Press), 1140-1146
[64] van der Wal, J., Stochastic Dynamic Programming, Mathematical Centre Tracts, vol. 139 (1981), Morgan Kaufmann · Zbl 0462.90055
[65] Vickrey, D.; Koller, D., Multi-agent algorithms for solving graphical games, (Proceedings AAAI-2002 (2002), AAAI Press), 345-351
[66] von Neumann, J.; Morgenstern, O., The Theory of Games and Economic Behavior (1947), Princeton University Press · Zbl 1241.91002
[67] Wooldridge, M., Reasoning About Rational Agents (2001), MIT Press: MIT Press Cambridge, MA · Zbl 0998.68094
[68] Xiong, L.; Liu, Y., Strategy representation and reasoning for incomplete information concurrent games in the situation calculus, (Proceedings IJCAI-2016 (2016), AAAI Press), 1322-1329
[69] Xiong, L.; Liu, Y., Strategy representation and reasoning in the situation calculus, (Proceedings ECAI-2016 (2016), AAAI Press), 982-990 · Zbl 1403.68276
[70] Yoon, S. W.; Fern, A.; Givan, R., Inductive policy selection for first-order MDPs, (Proceedings UAI-2002 (2002), Morgan Kaufmann), 568-576
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.