×

Learning action models from plan examples using weighted MAX-SAT. (English) Zbl 1168.68555

Summary: AI planning requires the definition of action models using a formal action and plan description language, such as the standard Planning Domain Definition Language (PDDL), as input. However, building action models from scratch is a difficult and time-consuming task, even for experts. In this paper, we develop an algorithm called ARMS (action-relation modelling system) for automatically discovering action models from a set of successful observed plans. Unlike the previous work in action-model learning, we do not assume complete knowledge of states in the middle of observed plans. In fact, our approach works when no or partial intermediate states are given. These example plans are obtained by an observation agent who does not know the logical encoding of the actions and the full state information between the actions. In a real world application, the cost is prohibitively high in labelling the training examples by manually annotating every state in a plan example from snapshots of an environment. To learn action models, ARMS gathers knowledge on the statistical distribution of frequent sets of actions in the example plans. It then builds a weighted propositional satisfiability (weighted MAX-SAT) problem and solves it using a MAX-SAT solver. We lay the theoretical foundations of the learning problem and evaluate the effectiveness of ARMS empirically.

MSC:

68T20 Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.)
68T05 Learning and adaptive systems in artificial intelligence
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Agrawal, R.; Srikant, R., Fast algorithms for mining association rules, (Proceedings of the 20th International Conference on Very Large Data Bases (VLDB). Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), Santiago de Chile, Chile (1994), Morgan Kaufmann), 487-499
[2] E. Amir, Learning partially observable deterministic action models, in: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI 2005), Edinburgh, Scotland, UK, August 2005, pp. 1433-1439; E. Amir, Learning partially observable deterministic action models, in: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI 2005), Edinburgh, Scotland, UK, August 2005, pp. 1433-1439
[3] Bain, M.; Sammut, C., A framework for behavioural cloning, Machine Intelligence, 15 (1996)
[4] Benson, S., Inductive learning of reactive action models, (Proceedings of the International Conference on Machine Learning (ICML 1995) (1995), Stanford University: Stanford University Stanford, CA), 47-54
[5] J. Blythe, J. Kim, S. Ramachandran, Y. Gil, An integrated environment for knowledge acquisition, in: Proceedings of the 2001 International Conference on Intelligent User Interfaces (IUI2001), Santa Fe, NM, 2001, pp. 13-20; J. Blythe, J. Kim, S. Ramachandran, Y. Gil, An integrated environment for knowledge acquisition, in: Proceedings of the 2001 International Conference on Intelligent User Interfaces (IUI2001), Santa Fe, NM, 2001, pp. 13-20
[6] Borchers, B.; Furman, J.
[7] Borchers, B.; Furman, J., A two-phase exact algorithm for MAX-SAT and weighted MAX-SAT problems, Journal of Combinatorial Optimization, 2, 4, 299-306 (1999) · Zbl 0954.90026
[8] Davis, M.; Putnam, H., A computing procedure for quantification theory, Journal of The Association for Computing Machinery, 7, 201-215 (1960) · Zbl 0212.34203
[9] P. Domingos, S. Kok, H. Poon, M. Richardson, P. Singla, Unifying logical and statistical AI, in: Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI 2006), Boston, MA, July 2006; P. Domingos, S. Kok, H. Poon, M. Richardson, P. Singla, Unifying logical and statistical AI, in: Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI 2006), Boston, MA, July 2006 · Zbl 1401.68302
[10] K. Erol, J.A. Hendler, D.S. Nau, Htn planning: Complexity and expressivity, in: Proceedings of the National Conference on Artificial Intelligence (AAAI 1994), Seattle, WA, 1994, pp. 1123-1128; K. Erol, J.A. Hendler, D.S. Nau, Htn planning: Complexity and expressivity, in: Proceedings of the National Conference on Artificial Intelligence (AAAI 1994), Seattle, WA, 1994, pp. 1123-1128
[11] Kautz, H.
[12] W.H. Evans, J.C. Ballegeer, N.H. Duyet, ADL: An algorithmic design language for integrated circuit synthesis, in: Proceedings of the 21st Design Automation Conference on Design Automation, 1984; W.H. Evans, J.C. Ballegeer, N.H. Duyet, ADL: An algorithmic design language for integrated circuit synthesis, in: Proceedings of the 21st Design Automation Conference on Design Automation, 1984
[13] Fikes, R. E.; Nilsson, N. J., Strips: A new approach to the application of theorem proving to problem solving, Artificial Intelligence, 2, 189-208 (1971) · Zbl 0234.68036
[14] Fox, M.; Long, D., PDDL2.1: An extension to pddl for expressing temporal planning domains, Journal of Artificial Intelligence Research, 20, 61-124 (2003) · Zbl 1036.68093
[15] Garey, M. R.; Johnson, D. S., Computers and Intractability: A Guide to the Theory of NP-Completeness (1979), W.H. Freeman and Company · Zbl 0411.68039
[16] A. Garland, N. Lesh, Plan evaluation with incomplete action descriptions, in: Proceedings of the Eighteenth National Conference on AI (AAAI 2002), Edmonton, Alberta, Canada, 2002, pp. 461-467; A. Garland, N. Lesh, Plan evaluation with incomplete action descriptions, in: Proceedings of the Eighteenth National Conference on AI (AAAI 2002), Edmonton, Alberta, Canada, 2002, pp. 461-467
[17] M. Ghallab, A. Howe, C. Knoblock, D. McDermott, A. Ram, M. Veloso, D. Weld, D. Wilkins, PDDL—the planning domain definition language, 1998; M. Ghallab, A. Howe, C. Knoblock, D. McDermott, A. Ram, M. Veloso, D. Weld, D. Wilkins, PDDL—the planning domain definition language, 1998
[18] Y. Gil, Learning by experimentation: Incremental refinement of incomplete planning domains, in: Eleventh Intl. Conf. on Machine Learning, 1994, pp. 87-95; Y. Gil, Learning by experimentation: Incremental refinement of incomplete planning domains, in: Eleventh Intl. Conf. on Machine Learning, 1994, pp. 87-95
[19] Goemans, M. X.; Williamson, D. P., Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, Journal of the ACM, 42, 1115-1145 (1995) · Zbl 0885.68088
[20] Grunwald, P. D.; Myung, I. J.; Pitt, M. A., Advances in Minimum Description Length Theory and Applications (2005), MIT Press: MIT Press Cambridge, MA
[21] Ilghami, O.; Munoz-Avila, H.; Nau, D. S.; Aha, D. W., Learning preconditions for planning from plan traces and HTN structure, Journal of Artificial Intelligence Research, 20, 379-404 (2003)
[22] O. Ilghami, H. Munoz-Avila, D.S. Nau, D.W. Aha, Learning preconditions for planning from plan traces and htn structure, in: Proceedings of the International Conference on Machine Learning (ICML 2005), Bonn, Germany, 2005; O. Ilghami, H. Munoz-Avila, D.S. Nau, D.W. Aha, Learning preconditions for planning from plan traces and htn structure, in: Proceedings of the International Conference on Machine Learning (ICML 2005), Bonn, Germany, 2005
[23] O. Ilghami, D.S. Nau, H. Munoz-Avila, Camel: Learning method preconditions for htn planning, in: Proceedings of the Sixth International Conference on AI Planning and Scheduling AIPS-02, Toulouse, France, 2002, pp. 168-178; O. Ilghami, D.S. Nau, H. Munoz-Avila, Camel: Learning method preconditions for htn planning, in: Proceedings of the Sixth International Conference on AI Planning and Scheduling AIPS-02, Toulouse, France, 2002, pp. 168-178
[24] Jeroslow, R. G.; Wang, J., Solving propositional satisfiability problems, Annals of Mathematics and AI, 1, 167-187 (1990) · Zbl 0878.68107
[25] H. Kautz, B. Selman, Pushing the envelope: Planning, propositional logic, and stochastic search, in: Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI 1996), Portland, OR, 1996, pp. 1194-1201; H. Kautz, B. Selman, Pushing the envelope: Planning, propositional logic, and stochastic search, in: Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI 1996), Portland, OR, 1996, pp. 1194-1201
[26] H. Kautz, B. Selman, Y. Jiang, A general stochastic approach to solving problems with hard and soft constraints, in: The Satisfiability Problem: Theory and Applications, 1997; H. Kautz, B. Selman, Y. Jiang, A general stochastic approach to solving problems with hard and soft constraints, in: The Satisfiability Problem: Theory and Applications, 1997 · Zbl 0891.68100
[27] Lau, T.; Domingos, P.; Weld, D. S., Version space algebra and its application to programming by demonstration, (Proc. 17th International Conf. on Machine Learning (2000), Morgan Kaufmann: Morgan Kaufmann San Francisco, CA), 527-534
[28] Loveland, D. W., Automated Theorem Proving: A Logical Basis (1978), North-Holland: North-Holland New York · Zbl 0364.68082
[29] T. Leo McCluskey, D. Liu, R.M. Simpson, GIPO II: HTN planning in a tool-supported knowledge engineering environment, in: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS 2003), Trento, Italy, 2003, pp. 92-101; T. Leo McCluskey, D. Liu, R.M. Simpson, GIPO II: HTN planning in a tool-supported knowledge engineering environment, in: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS 2003), Trento, Italy, 2003, pp. 92-101
[30] M.W. Moskewicz, C.F. Madigan, Y. Zhao, L. Zhang, S. Malik, Chaff: Engineering an efficient sat solver, in: Proceedings of the 38th Design Automation Conference (DAC), 2001; M.W. Moskewicz, C.F. Madigan, Y. Zhao, L. Zhang, S. Malik, Chaff: Engineering an efficient sat solver, in: Proceedings of the 38th Design Automation Conference (DAC), 2001
[31] Myers, K. L.; Jarvis, P.; Tyson, W. M.; Wolverton, M., Mixed-initiative framework for robust plan sketching, (Thirteenth International Conference on Automated Planning and Scheduling (ICAPS-03). Thirteenth International Conference on Automated Planning and Scheduling (ICAPS-03), BC, Canada (2003), AAAI Press), 256-266
[32] Nau, D. S.; Au, T.-C.; Ilghami, O.; Kuter, U.; Murdock, J. W.; Wu, D.; Yaman, F., Shop2: An HTN planning system, Journal of Artificial Intelligence Research, 20, 379-404 (2003) · Zbl 1058.68106
[33] Nau, D. S.; Au, T.-C.; Ilghami, O.; Kuter, U.; Murdock, J. W.; Wu, D.; Yaman, F., Applications of shop and shop2, IEEE Intelligent Systems, 20, 2, 34-41 (2005)
[34] T. Oates, P.R. Cohen, Searching for planning operators with context-dependent and probabilistic effects, in: Proceedings of the Thirteenth National Conference on AI (AAAI 1996), Portland, OR, 1996, pp. 865-868; T. Oates, P.R. Cohen, Searching for planning operators with context-dependent and probabilistic effects, in: Proceedings of the Thirteenth National Conference on AI (AAAI 1996), Portland, OR, 1996, pp. 865-868
[35] M. Richardson, P. Domingos, Markov logic networks, Technical Report, 2004; M. Richardson, P. Domingos, Markov logic networks, Technical Report, 2004
[36] Richardson, M.; Domingos, P., Markov logic networks, Machine Learning, 62, 1-2, 107-136 (July 2006)
[37] Sablon, G.; Boulanger, D., Using the event calculus to integrate planning and learning in an intelligent autonomous agent, (Current Trends in AI Planning (1994), IOS Press), 254-265
[38] B. Selman, H. Kautz, An empirical study of greedy local search for satisfiability testing, in: Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI-93), Washington, DC, 1993, pp. 46-51; B. Selman, H. Kautz, An empirical study of greedy local search for satisfiability testing, in: Proceedings of the Eleventh National Conference on Artificial Intelligence (AAAI-93), Washington, DC, 1993, pp. 46-51
[39] Selman, B.; Kautz, H.; Cohen, B., Local search strategies for satisfiability testing, Cliques, Coloring, and Satisfiability: Second DIMACS Implementation Challenge, 26 (October 11-13, 1993)
[40] D. Shahaf, E. Amir, Learning partially observable action schemas, in: Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI 2006), Boston, MA, July 2006; D. Shahaf, E. Amir, Learning partially observable action schemas, in: Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI 2006), Boston, MA, July 2006
[41] D. Shahaf, A. Chang, E. Amir, Learning partially observable action models: Efficient algorithms, in: Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI 2006), Boston, MA, July 2006; D. Shahaf, A. Chang, E. Amir, Learning partially observable action models: Efficient algorithms, in: Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI 2006), Boston, MA, July 2006
[42] Shen, W., Autonomous Learning from the Environment (1994), Computer Science Press/W.H. Freeman and Company
[43] X. Wang, Learning by observation and practice: An incremental approach for planning operator acquisition, in: Proceedings of the Twelfth International Conference on Machine Learning (ICML 1995), 1995, pp. 549-557; X. Wang, Learning by observation and practice: An incremental approach for planning operator acquisition, in: Proceedings of the Twelfth International Conference on Machine Learning (ICML 1995), 1995, pp. 549-557
[44] E. Winner, M. Veloso, Analyzing plans with conditional effects, in: Proceedings of the Sixth International Conference on AI Planning and Scheduling (AIPS 2002), Toulouse, France, 2002.; E. Winner, M. Veloso, Analyzing plans with conditional effects, in: Proceedings of the Sixth International Conference on AI Planning and Scheduling (AIPS 2002), Toulouse, France, 2002.
[45] Yang, Q., Formalizing planning knowledge for hierarchical planning, Computational Intelligence Journal, 6, 2, 12-24 (1990)
[46] Q. Yang, K. Wu, Y. Jiang, Learning action models from plan examples with incomplete knowledge, in: Proceedings of the Fifteenth International Conference on Automated Planning and Scheduling (ICAPS 2005), Monterey, CA, 2005, pp. 241-250; Q. Yang, K. Wu, Y. Jiang, Learning action models from plan examples with incomplete knowledge, in: Proceedings of the Fifteenth International Conference on Automated Planning and Scheduling (ICAPS 2005), Monterey, CA, 2005, pp. 241-250
[47] J. Yin, D. Shen, Q. Yang, Z.-N. Li, Activity recognition through goal-based segmentation, in: Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI-05), Pittsburgh, PA, 2005, pp. 28-34; J. Yin, D. Shen, Q. Yang, Z.-N. Li, Activity recognition through goal-based segmentation, in: Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI-05), Pittsburgh, PA, 2005, pp. 28-34
[48] J. Yin, Q. Yang, Integrating hidden Markov models and spectral analysis for sensory time series clustering, in: Proceedings of the Fifth IEEE International Conference on Data Mining, Houston, TX, 2005; J. Yin, Q. Yang, Integrating hidden Markov models and spectral analysis for sensory time series clustering, in: Proceedings of the Fifth IEEE International Conference on Data Mining, Houston, TX, 2005
[49] H. Zhang, SATO: An efficient propositional prover, in: Proceedings of the 14th International Conference on Automated Deduction (CADE 1997), Townsville, North Queensland, Australia, 1997, pp. 272-275; H. Zhang, SATO: An efficient propositional prover, in: Proceedings of the 14th International Conference on Automated Deduction (CADE 1997), Townsville, North Queensland, Australia, 1997, pp. 272-275
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.