×

Learning hierarchical task network domains from partially observed plan traces. (English) Zbl 1405.68333

Summary: Hierarchical Task Network (HTN) planning is an effective yet knowledge intensive problem-solving technique. It requires humans to encode knowledge in the form of methods and action models. Methods describe how to decompose tasks into subtasks and the preconditions under which those methods are applicable whereas action models describe how actions change the world. Encoding such knowledge is a difficult and time-consuming process, even for domain experts. In this paper, we propose a new learning algorithm, called HTNLearn, to help acquire HTN methods and action models. HTNLearn receives as input a collection of plan traces with partially annotated intermediate state information, and a set of annotated tasks that specify the conditions before and after the tasks’ completion. In addition, plan traces are annotated with potentially empty partial decomposition trees that record the processes of decomposing tasks to subtasks. HTNLearn outputs are a collection of methods and action models. HTNLearn first encodes constraints about the methods and action models as a constraint satisfaction problem, and then solves the problem using a weighted MAX-SAT solver. HTNLearn can learn methods and action models simultaneously from partially observed plan traces (i.e., plan traces where the intermediate states are partially observable). We test HTNLearn in several HTN domains. The experimental results show that our algorithm HTNLearn is both effective and efficient.

MSC:

68T20 Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.)
68T05 Learning and adaptive systems in artificial intelligence

Software:

SUPPLE; TRAMP
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Amir, E., Learning partially observable deterministic action models, (Proceedings of IJCAI (2005)), 1433-1439
[2] Anderson, K. M.; Sherba, S. A.; Lepthien, W. V., Towards largescale information integration, (Proceedings of International Conference on Software Engineering. Proceedings of International Conference on Software Engineering, ICSE-02 (2002))
[3] Asuncion, H. U.; Asuncion, A. U.; Taylor, R. N., Software traceability with topic modeling, (Proceedings of International Conference on Software Engineering. Proceedings of International Conference on Software Engineering, ICSE-10 (2010))
[4] Baum, J.; Nicholson, A. E.; Dix, T. I., Proximity-based non-uniform abstractions for approximate planning, J. Artif. Intell. Res., 43, 477-522 (2012) · Zbl 1237.68183
[5] Benson, S., Inductive learning of reactive action models, (Proceedings of the Twelfth International Conference on Machine Learning. Proceedings of the Twelfth International Conference on Machine Learning, ICML-95 (1995)), 47-54
[6] Blythe, J.; Kim, J.; Ramachandran, S.; Gil, Y., An integrated environment for knowledge acquisition, (Proceedings of IUI (2001)), 13-20
[7] Borchers, B.; Furman, J., A two-phase exact algorithm for MAX-SAT and weighted MAX-SAT problems, J. Comb. Optim., 2, 4, 299-306 (1998) · Zbl 0954.90026
[8] Botea, A.; Enzenberger, M.; Muller, M.; Schaeffer, J., Macro-ff: improving ai planning with automatically learned macro-operators, J. Artif. Intell. Res., 24, 581-621 (2005) · Zbl 1080.68657
[9] Bryant, R. E., Symbolic boolean manipulation with ordered binary-decision diagrams, ACM Comput. Surv., 24, 3, 293-318 (1992)
[10] Chrisman, L., Abstract probabilistic modeling of action, (Proceedings of the First International Conference on Artificial Intelligence Planning Systems. Proceedings of the First International Conference on Artificial Intelligence Planning Systems, AIPS-92 (1992)), 28-36
[11] Coles, A.; Smith, A., Marvin: a heuristic search planner with online macro-action learning, J. Artif. Intell. Res., 28, 119-156 (2007) · Zbl 1182.68231
[12] Cresswell, S.; McCluskey, T. L.; West, M. M., Acquisition of object-centred domain models from planning examples, (Proceedings of the Nineteenth International Conference on Automated Planning and Scheduling. Proceedings of the Nineteenth International Conference on Automated Planning and Scheduling, ICAPS-09 (2009))
[13] Currie, K.; Tate, A., O-plan: the open planning architecture, Artif. Intell., 52, 1, 49-86 (1991)
[14] Erol, K.; Hendler, J.; Nau, D. S., Htn planning: complexity and expressivity, (AAAI, vol. 94 (1994)), 1123-1128
[15] Estlin, T. A.; Mooney, R. J., Multi-strategy learning of search control for partial-order planning, (Proceedings of the Thirteenth National Conference on Artificial Intelligence (1996)), 843-848
[16] Fikes, R.; Nilsson, N. J., STRIPS: a new approach to the application of theorem proving to problem solving, Artif. Intell., 189-208 (1971) · Zbl 0234.68036
[17] Fikes, R. E.; Hart, P. E.; Niisson, N. J., Learning and executing generalized robot plans, Artif. Intell., 3, 251-288 (1972)
[18] Gajos, K. Z.; Weld, D. S.; Wobbrock, J. O., Decision-theoretic user interface generation, (Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence. Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, AAAI-08 (2008)), 1532-1536
[19] Garland, A.; Ryall, K.; Rich, C., Learning hierarchical task models by defining and refining examples, (Proceedings of the First International Conference on Knowledge Capture. Proceedings of the First International Conference on Knowledge Capture, K-CAP (2001)), 44-51
[20] Ghallab, M.; Nau, D.; Traverso, P., Automated Planning: Theory and Practice (2004), Morgan Kaufmann · Zbl 1074.68613
[21] Gil, Y., Learning by experimentation: incremental refinement of incomplete planning domains, (Proceedings of the Eleventh International Conference on Machine Learning. Proceedings of the Eleventh International Conference on Machine Learning, ICML-94 (1994)), 87-95
[22] He, R.; Brunskill, E.; Brunskill, E., Puma: planning under uncertainty with macro-actions, (Proceedings of AAAI (2010)), 1089-1095
[23] Hernandez, T.; Kambhampati, S., Integration of biological sources: current systems and challenges ahead, SIGMOD Rec., 33, 3, 51-60 (2004)
[24] Hoffmann, J.; Bertoli, P.; Pistore, M., Web service composition as planning revised: in between background theories and initial state uncertainty, (Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence. Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, AAAI 2007 (2007)), 1013-1018
[25] Hogg, C.; Kuter, U.; Muñoz-Avila, H., Learning hierarchical task networks for nondeterministic planning domains, (Proceedings of IJCAI (2009)), 1708-1714
[26] Hogg, C.; Kuter, U.; Munoz-Avila, H., Learning methods to generate good plans: integrating htn learning and reinforcement learning, (Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence. Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI-10 (2010))
[27] Hogg, C.; Muñoz-Avila, H.; Kuter, U., HTN-MAKER: learning HTNs with minimal additional knowledge engineering required, (Proceedings of AAAI (2008)), 950-956
[28] Holmes, M. P.; Isbell, C. L., Schema learning: experience-based construction of predictive action models, (NIPS-04. NIPS-04, Advances in Neural Information Processing Systems, vol. 17 (2004))
[29] Iba, G. A., A heuristic approach to the discovery of macro-operators, Mach. Learn., 3, 285-317 (1989)
[30] Ilghami, O.; Muñoz-Avila, H.; Nau, D. S.; Aha, D. W., Learning approximate preconditions for methods in hierarchical plans, (Proceedings of ICML (2005)), 337-344
[31] Korf, R. E., Macro-operators: a weak method for learning, Artif. Intell., 26, 35-77 (1985) · Zbl 0562.68071
[32] Kuter, U.; Nau, D.; Pistore, M.; Traverso, P., Task decomposition on abstract states, for planning under nondeterminism, Artif. Intell., 173, 669-695 (2009) · Zbl 1191.68636
[33] Kuter, U.; Sirin, E.; Parsia, B.; Nau, D.; Hendler, J., Information gathering during planning for web service composition, J. Web Semant., 3, 2-3, 183-205 (2005)
[34] Lau, T.; Domingos, P.; Weld, D. S., Version space algebra and its application to programming by demonstration, (Proceedings of the Seventeenth International Conference on Machine Learning. Proceedings of the Seventeenth International Conference on Machine Learning, ICML-00 (2000)), 527-534
[35] Li, C. M.; Manya, F.; Mohamedou, N.; Planes, J., Exploiting cycle structures in Max-SAT, (Proceedings of 12th international conference on the Theory and Applications of Satisfiability Testing. Proceedings of 12th international conference on the Theory and Applications of Satisfiability Testing, SAT-09 (2009)), 467-480 · Zbl 1247.68256
[36] Li, C. M.; Manya, F.; Planes, J., Detecting disjoint inconsistent subformulas for computing lower bounds for Max-SAT, (Proceedings of the 21st National Conference on Artificial Intelligence. Proceedings of the 21st National Conference on Artificial Intelligence, AAAI-06 (2006)), 86-91
[37] Li, C. M.; Manya, F.; Planes, J., New inference rules for Max-SAT, J. Artif. Intell. Res., 30, 321-359 (October 2007)
[38] Li, N.; Kambhampati, S.; Yoon, S. W., Learning probabilistic hierarchical task networks to capture user preferences, (Proceedings of IJCAI (2009)), 754-759
[39] Marthi, B.; Russell, S.; Wolfe, J., Angelic hierarchical planning: optimal and online algorithms, (Proceedings of ICAPS (2008)), 222-231
[40] McCluskey, T. L.; Liu, D.; Simpson, R. M., GIPO II: HTN planning in a tool-supported knowledge engineering environment, (Proceedings of ICAPS (2003)), 92-101
[41] Minton, S., Quantitative results concerning the utility of explanation-based learning, Artif. Intell., 42, 2, 363-391 (1990)
[42] Minton, S.; Carbonell, J. G.; Knoblock, C. A.; Kuokka, D. R.; Etzioni, O.; Gil, Y., Explanation-based learning: a problem solving perspective, Artif. Intell., 40, 1, 63-118 (1989)
[43] Muggleton, S.; Raedt, L. D., Inductive logic programming: theory and methods, J. Log. Program., 19/20, 629-679 (1994) · Zbl 0816.68043
[44] Nance, M.; Vogel, A.; Amir, E., Reasoning about partially observed actions, (Proceedings of the Twenty-First National Conference on Artificial Intelligence. Proceedings of the Twenty-First National Conference on Artificial Intelligence, AAAI-06 (2006)), 888-893
[45] Nau, D. S.; Au, T.; Ilghami, O.; Kuter, U.; Muñoz-Avila, H.; Murdock, J. W.; Wu, D.; Yaman, F., Applications of SHOP and SHOP2, IEEE Intell. Syst., 20, 34-41 (2005)
[46] Nau, D. S.; Cao, Y.; Lotem, A.; Muñoz-Avila, H., SHOP: simple hierarchical ordered planner, (Proceedings of IJCAI (1999)), 968-973
[47] Nejati, N.; Langley, P.; Konik, T., Learning hierarchical task networks by observation, (Proceedings of ICML (2006)), 665-672
[48] Newton, M. H.; Levine, J.; Fox, M.; Long, D., Learning macro-actions for arbitrary planners and domains, (Proceedings of ICAPS (2007)), 256-263
[49] Oates, T.; Cohen, P. R., Searching for planning operators with context-dependent and probabilistic effects, (Proceedings of the Thirteenth National Conference on Artificial Intelligence. Proceedings of the Thirteenth National Conference on Artificial Intelligence, AAAI-96 (1996)), 865-868
[50] Pasula, H. M.; Zettlemoyer, L. S.; Kaelbling, L. P., Learning probabilistic relational planning rules, (Proceedings of the Fourteenth International Conference on Automated Planning and Scheduling. Proceedings of the Fourteenth International Conference on Automated Planning and Scheduling, ICAPS-04 (2004)), 73-82
[51] Pasula, H. M.; Zettlemoyer, L. S.; Kaelbling, L. P., Learning symbolic models of stochastic domains, J. Artif. Intell. Res., 29, 309-352 (2007) · Zbl 1182.68181
[52] Reddy, C.; Tadepalli, P., Learning goal-decomposition rules using exercises, (Proceedings of ICML (1997)), 278-286
[53] Simpson, R. M.; McCluskey, T. L.; Zhao, W.; Aylett, R. S.; Doniat, C., GIPO: an integrated graphical tool to support knowledge engineering in AI planning, (Proceedings of the European Conference on Planning (2001))
[54] Sablon, G.; Bruynooghe, M., Using the event calculus to integrate planning and learning in an intelligent autonomous agent, (Current Trends in AI Planning (1994)), 254-265
[55] Schmill, M. D.; Oates, T.; Cohen, P. R., Learning planning operators in real-world, partially observable environments, (Proceedings of the Fifth International Conference on Artificial Intelligence Planning Systems. Proceedings of the Fifth International Conference on Artificial Intelligence Planning Systems, AIPS-00 (2000)), 246-253
[56] Shahaf, D.; Amir, E., Learning partially observable action schemas, (Proceedings of the Twenty-First National Conference on Artificial Intelligence. Proceedings of the Twenty-First National Conference on Artificial Intelligence, AAAI-06 (2006)), 913-919
[57] Shahaf, D.; Chang, A.; Amir, E., Learning partially observable action models: efficient algorithms, (Proceedings of the Twenty-First National Conference on Artificial Intelligence. Proceedings of the Twenty-First National Conference on Artificial Intelligence, AAAI-06 (2006)), 920-926
[58] Simpson, R. M.; Kitchin, D. E.; McCluskey, T. L., Planning domain definition using GIPO, Knowl. Eng. Rev., 22, 2, 117-134 (2007)
[59] van Rijsbergen, C. J., Information Retrieval (1979), Butterworth-Heinemann · Zbl 0227.68052
[60] Walsh, T. J.; Littman, M. L., Efficient learning of action schemas and web-service descriptions, (Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (AAAI 2008) (2008)), 714-719
[61] Wang, X., Learning by observation and practice: an incremental approach for planning operator acquisition, (Proceedings of the Twelfth International Conference on Machine Learning. Proceedings of the Twelfth International Conference on Machine Learning, ICML-95 (1995)), 549-557
[62] Wickler, G.; Potter, S.; Tate, A.; Piechouicek, M.; Semsch, E., Planning and choosing: augmenting htn-based agents with mental attitudes, (IAT (2007)), 222-228
[63] Wilkins, D. E., Can ai planners solve practical problems?, Comput. Intell., 6, 232-246 (1990)
[64] Winner, E.; Veloso, M., Analyzing plans with condition effects, (Proceedings of the Sixth International Conference on AI Planning and Scheduling. Proceedings of the Sixth International Conference on AI Planning and Scheduling, AIPS-02 (2002))
[65] Wu, D.; Parsia, B.; Sirin, E.; Hendler, J.; Nau, D.; Nau, D., Automating DAML-S web services composition using SHOP2, (Proceedings of 2nd International Semantic Web Conference. Proceedings of 2nd International Semantic Web Conference, ISWC2003 (2003))
[66] Xu, K.; Muñoz-Avila, H., A domain-independent system for case-based task decomposition without domain theories, (Proceedings of AAAI (2005)), 234-240
[67] Yang, Q., Formalizing planning knowledge for hierarchical planning, Comput. Intell., 6, 2, 12-24 (1990)
[68] Yang, Q.; Wu, K.; Jiang, Y., Learning actions models from plan examples with incomplete knowledge, (Proceedings of the Fifteenth International Conference on Automated Planning and Scheduling. Proceedings of the Fifteenth International Conference on Automated Planning and Scheduling, ICAPS-05 (2005)), 241-250
[69] Yang, Q.; Wu, K.; Jiang, Y., Learning action models from plan examples using weighted MAX-SAT, Artif. Intell., 171, 107-143 (February 2007)
[70] Zhuo, H. H.; Hu, D. H.; Hogg, C.; Yang, Q.; Muñoz-Avila, H., Learning HTN method preconditions and action models from partial observations, (Proceedings of IJCAI (2009)), 1804-1810
[71] Zhuo, H. H.; Kambhampati, S., Action-model acquisition from noisy plan traces, (Proceedings of IJCAI (2013)), 2444-2450
[72] Zhuo, H. H.; Muñoz-Avila, H.; Yang, Q., Learning action models for multi-agent planning, (Proceedings of AAMAS (2011)), 217-224
[73] Zhuo, H. H.; Yang, Q., Action-model acquisition for planning via transfer learning, Artif. Intell., 212, 80-103 (2014), (in this volume) · Zbl 1308.68108
[74] Zhuo, H. H.; Yang, Q.; Hu, D. H.; Li, L., Learning complex action models with quantifiers and logical implications, Artif. Intell., 174, 18, 1540-1569 (2010)
[75] Zhuo, H. H.; Yang, Q.; Pan, R.; Li, L., Cross-domain action-model acquisition for planning via web search, (Proceedings of ICAPS (2011)), 298-305
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.