×

zbMATH — the first resource for mathematics

Lifted discriminative learning of probabilistic logic programs. (English) Zbl 07073621
Summary: Probabilistic logic programming (PLP) provides a powerful tool for reasoning with uncertain relational models. However, learning probabilistic logic programs is expensive due to the high cost of inference. Among the proposals to overcome this problem, one of the most promising is lifted inference. In this paper we consider PLP models that are amenable to lifted inference and present an algorithm for performing parameter and structure learning of these models from positive and negative examples. We discuss parameter learning with EM and LBFGS and structure learning with LIFTCOVER, an algorithm similar to SLIPCOVER. The results of the comparison of LIFTCOVER with SLIPCOVER on 12 datasets show that it can achieve solutions of similar or better quality in a fraction of the time.

MSC:
68T05 Learning and adaptive systems in artificial intelligence
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Alberti, M.; Bellodi, E.; Cota, G.; Riguzzi, F.; Zese, R., cplint on SWISH: Probabilistic logical inference with a web browser, Intelligenza Artificiale, 11, 47-64, (2017)
[2] Bellodi, E., & Riguzzi, F. (2012). Learning the structure of probabilistic logic programs. In S. Muggleton, A. Tamaddoni-Nezhad, & F. Lisi (Eds.) 22nd international conference on inductive logic programming, Vol. 7207, LNCS. Berlin: Springer, pp 61-75. · Zbl 1379.68269
[3] Bellodi, E.; Riguzzi, F., Expectation maximization over binary decision diagrams for probabilistic logic programs, Intelligent Data Analysis, 17, 343-363, (2013)
[4] Bellodi, E.; Riguzzi, F., Structure learning of probabilistic logic programs by searching the clause space, Theory and Practice of Logic Programming, 15, 169-212, (2015) · Zbl 1379.68269
[5] Bellodi, E.; Lamma, E.; Riguzzi, F.; Costa, VS; Zese, R., Lifted variable elimination for probabilistic logic programming, Theory and Practice of Logic Programming, 14, 681-695, (2014) · Zbl 1309.68027
[6] Darwiche, A.; Marquis, P., A knowledge compilation map, Journal of Artificial Intelligence Research, 17, 229-264, (2002) · Zbl 1045.68131
[7] Davis, J., & Goadrich, M. (2006). The relationship between precision-recall and ROC curves. In European conference on machine learning (ECML 2006), ACM, pp. 233-240.
[8] Raedt, L.; Kimmig, A., Probabilistic (logic) programming concepts, Machine Learning, 100, 5-47, (2015) · Zbl 1346.68050
[9] De Raedt, L., Kimmig, A., & Toivonen, H. (2007). ProbLog: A probabilistic prolog and its application in link discovery. In M.M. Veloso (Ed.), 20th international joint conference on artificial intelligence (IJCAI 2007), Vol. 7. AAAI Press/IJCAI, pp 2462-2467.
[10] Dempster, AP; Laird, NM; Rubin, DB, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society Series B (Methodological), 39, 1-38, (1977) · Zbl 0364.62022
[11] Mauro, N.; Bellodi, E.; Riguzzi, F., Bandit-based Monte-Carlo structure learning of probabilistic logic programs, Machine Learning, 100, 127-156, (2015) · Zbl 1346.68051
[12] Fawcett, T., An introduction to ROC analysis, Pattern Recognition Letters, 27, 861-874, (2006)
[13] Good, IJ, A causal calculus (I), The British journal for the philosophy of science, 11, 305-318, (1961)
[14] Gordon, DM, A survey of fast exponentiation methods, Journal of Algorithms, 27, 129-146, (1998) · Zbl 0915.11064
[15] Gorlin, A.; Ramakrishnan, CR; Smolka, SA, Model checking with probabilistic tabled logic programming, Theory and Practice of Logic Programming, 12, 681-700, (2012) · Zbl 1260.68062
[16] Hoos, H. H., & Stützle, T. (2004). Stochastic local search: Foundations and applications. New York: Elsevier/Morgan Kaufmann. · Zbl 1126.68032
[17] Huynh, T. N., & Mooney, R. J. (2008). Discriminative structure and parameter learning for Markov logic networks. In W.W. Cohen, A. McCallum, & S.T. Roweis (Eds.), Proceedings of the 25th international conference on machine learning, ACM, pp. 416-423.
[18] Huynh, T. N., & Mooney, R. J. (2011). Online structure learning for Markov logic networks. In D. Gunopulos, T. Hofmann, D. Malerba, & M. Vazirgiannis (Eds.), European conference on machine learning and principles and practice of knowledge discovery in databases (ECMLPKDD 2011), Vol. 6912. Lecture Notes in Computer Science. Springer, pp. 81-96. https://doi.org/10.1007/978-3-642-23783-6_6.
[19] Kazemi, S. M., Buchman, D., Kersting, K., Natarajan, S., & Poole, D. (2014). Relational logistic regression. In C. Baral, G. D. Giacomo, & T. Eiter (Eds.), 14th international conference on principles of knowledge representation and reasoning (KR 2014). AAAI Press.
[20] Kersting, K., & De Raedt, L. (2002). Basic principles of learning Bayesian logic programs. In Institute for Computer Science, University of Freiburg, Citeseer. · Zbl 1137.68544
[21] Khot, T., Natarajan, S., Kersting, K., & Shavlik, J. W. (2011). Learning Markov logic networks via functional gradient boosting. In Proceedings of the 11th IEEE international conference on data mining, IEEE, pp. 320-329.
[22] Kietz, J., & Lübbe, M. (1994). An efficient subsumption algorithm for inductive logic programming. In W.W. Cohen, & H. Hirsh (Eds.), 11th international conference on machine learning, Morgan Kaufmann, pp. 130-138.
[23] Kisynski, J., & Poole, D. (2009). Lifted aggregation in directed first-order probabilistic models. In C. Boutilier (Ed.), 21st international joint conference on artificial intelligence (IJCAI 2009), pp. 1922-1929.
[24] Kok, S., & Domingos, P. (2005). Learning the structure of Markov logic networks. In 22nd international conference on machine learning, ACM, pp. 441-448.
[25] Kok, S., & Domingos, P. (2010). Learning Markov logic networks using structural motifs. In J. Fürnkranz, & T. Joachims (Eds.), 27th international conference on machine learning, Omnipress, pp. 551-558.
[26] Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques. Cambridge, MA: MIT Press. · Zbl 1183.68483
[27] Koller, D., & Pfeffer, A. (1997). Learning probabilities for noisy first-order rules. In IJCAI, pp. 1316-1323.
[28] Meert, W.; Struyf, J.; Blockeel, H., Learning ground CP-Logic theories by leveraging Bayesian network learning techniques, Fundamenta Informaticae, 89, 131-160, (2008) · Zbl 1155.68493
[29] Meert, W., Struyf, J., & Blockeel, H. (2010). CP-Logic theory inference with contextual variable elimination and comparison to BDD based inference methods. In L. De Raedt (Ed.), Inductive logic programming, 19th international conference (ILP 2009), Vol. 5989, Lecture notes in computer science. Springer, pp. 96-109. https://doi.org/10.1007/978-3-642-13840-9_10. · Zbl 1286.68421
[30] Mihalkova, L., & Mooney, R. J. (2007). Bottom-up learning of Markov logic network structure. In Proceedings of the 24th international conference on machine learning, ACM, pp. 625-632.
[31] Mørk, S.; Holmes, I., Evaluating bacterial gene-finding hmm structures as probabilistic logic programs, Bioinformatics, 28, 636-642, (2012)
[32] Muggleton, S., Inverse entailment and Progol, New Generation Computing, 13, 245-286, (1995)
[33] Natarajan, S., Tadepalli, P., Kunapuli, G., & Shavlik, J. (2009). Learning parameters for relational probabilistic models with noisy-or combining rule. In International conference on machine learning and applications, 2009. ICMLA’09. IEEE, pp. 141-146.
[34] Natarajan, S.; Khot, T.; Kersting, K.; Gutmann, B.; Shavlik, J., Gradient-based boosting for statistical relational learning: The relational dependency network case, Machine Learning, 86, 25-56, (2012) · Zbl 1243.68245
[35] Nath, A., & Domingos, P. (2015). Learning relational sum-product networks. In B. Bonet & S. Koenig (Eds.), 29th national conference on artificial intelligence, AAAI’15 (pp. 2878-2886). Austin, Texas, USA: AAAI Press.
[36] Nguembang Fadja, A.; Riguzzi, F.; Holzinger, A. (ed.); Goebel, R. (ed.); Ferri, M. (ed.); Palade, V. (ed.), Probabilistic logic programming in action, No. 10344, (2017), Berlin
[37] Nishino, M., Yamamoto, A., & Nagata, M. (2014). A sparse parameter learning method for probabilistic logic programs. In Statistical relational artificial intelligence, Vol. WS-14-13. Papers from the 2014 AAAI workshop, AAAI Press, AAAI Workshops.
[38] Nocedal, J., Updating quasi-Newton matrices with limited storage, Mathematics of Computation, 35, 773-782, (1980) · Zbl 0464.65037
[39] Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. Burlington: Morgan Kaufmann. · Zbl 0746.68089
[40] Poole, D., The independent choice logic for modelling multiple agents under uncertainty, Artificial Intelligence, 94, 7-56, (1997) · Zbl 0902.03017
[41] Poole, D., Abducing through negation as failure: Stable models within the independent choice logic, Journal of Logic Programming, 44, 5-35, (2000) · Zbl 0957.68013
[42] Poole, D., First-order probabilistic inference, IJCAI, 3, 985-991, (2003)
[43] Quinlan, JR, Learning logical definitions from relations, Machine Learning, 5, 239-266, (1990)
[44] Reutemann, P., Pfahringer, B., & Frank, E. (2004). A toolbox for learning from relational data with propositional and multi-instance learners. In G.I. Webb, & X. Yu (Eds.), 17th Australian joint conference on artificial intelligence (AI 1994), Vol. 3339. Lecture notes in computer science, Springer, pp. 1017-1023. https://doi.org/10.1007/978-3-540-30549-1_95.
[45] Riguzzi, F., Speeding up inference for probabilistic logic programs, The Computer Journal, 57, 347-363, (2014)
[46] Riguzzi, F., The distribution semantics for normal programs with function symbols, International Journal of Approximate Reasoning, 77, 1-19, (2016) · Zbl 1385.68022
[47] Riguzzi, F.; Swift, T., The PITA system: Tabling and answer subsumption for reasoning under uncertainty, Theory and Practice of Logic Programming, 11, 433-449, (2011) · Zbl 1218.68169
[48] Riguzzi, F.; Swift, T.; Kifer, M. (ed.); Liu, YA (ed.), Probabilistic logic programming under the distribution semantics, (2018), Bonita Springs
[49] Riguzzi, F.; Bellodi, E.; Lamma, E.; Zese, R.; Cota, G., Probabilistic logic programming on the web, Software: Practice and Experience, 46, 1381-1396, (2016) · Zbl 1347.68320
[50] Riguzzi, F.; Bellodi, E.; Zese, R.; Cota, G.; Lamma, E., A survey of lifted inference approaches for probabilistic logic programming under the distribution semantics, International Journal of Approximate Reasoning, 80, 313-333, (2017) · Zbl 1400.68226
[51] Riguzzi, F., Lamma, E., Alberti, M., Bellodi, E., Zese, R., & Cota, G. (2017b). Probabilistic logic programming for natural language processing. In F. Chesani, P. Mello, & M. Milano (Eds.), Workshop on deep understanding and reasoning, Vol. 1802, URANIA 2016, Sun SITE Central Europe, CEUR workshop proceedings, pp. 30-37. · Zbl 1347.68320
[52] Sato, T. (1995). A statistical learning method for logic programs with distribution semantics. In L. Sterling (Ed.), 12th international conference on logic programming (ICLP 1995), MIT Press, pp. 715-729.
[53] Sato, T., & Kameya, Y. (1997). PRISM: A language for symbolic-statistical modeling. In 15th international joint conference on artificial intelligence (IJCAI 1997), Vol. 97, pp 1330-1339.
[54] Sato, T.; Kubota, K., Viterbi training in PRISM, Theory and Practice of Logic Programming, 15, 147-168, (2015) · Zbl 1379.68272
[55] Schulte, O.; Khosravi, H., Learning graphical models for relational data via lattice search, Machine Learning, 88, 331-368, (2012)
[56] Srinivasan, A. (2007). The aleph manual. http://www.cs.ox.ac.uk/activities/machlearn/Aleph/aleph.html. Accessed April 3, 2018.
[57] Srinivasan, A.; Muggleton, S.; Sternberg, MJE; King, RD, Theories for mutagenicity: A study in first-order and feature-based induction, Artificial Intelligence, 85, 277-299, (1996)
[58] Srinivasan, A., King, R. D., Muggleton, S., & Sternberg, M. J. E. (1997). Carcinogenesis predictions using ILP. In N. Lavrac, & S. Dzeroski (Eds.), 7th international workshop on inductive logic programming, Vol. 1297. Lecture notes in computer science, Berlin: Springer, pp 273-287.
[59] Struyf, J., Davis, J., & Page, D. (2006). An efficient approximation to lookahead in relational learners. In European conference on machine learning (ECML 2006), Lecture notes in computer science. Springer, pp. 775-782. https://doi.org/10.1007/11871842_79.
[60] Taghipour, N.; Fierens, D.; Davis, J.; Blockeel, H., Lifted variable elimination: Decoupling the operators from the constraint language, Journal of Artificial Intelligence Research, 47, 393-439, (2013) · Zbl 1272.68397
[61] Valiant, LG, The complexity of enumeration and reliability problems, SIAM Journal on Computing, 8, 410-421, (1979) · Zbl 0419.68082
[62] Van den Broeck, G., Meert, W., & Darwiche, A. (2014). Skolemization for weighted first-order model counting. In C. Baral, G.D. Giacomo, & T. Eiter (Eds.), 14th international conference on principles of knowledge representation and reasoning (KR 2014), AAAI Press, pp. 111-120.
[63] Gelder, A.; Ross, KA; Schlipf, JS, The well-founded semantics for general logic programs, Journal of the ACM, 38, 620-650, (1991) · Zbl 0799.68045
[64] Haaren, J.; Broeck, G.; Meert, W.; Davis, J., Lifted generative learning of markov logic networks, Machine Learning, 103, 27-55, (2016) · Zbl 1357.68187
[65] Vennekens, J., & Verbaeten, S. (2003). Logic programs with annotated disjunctions. Tech. Rep. CW386, KU Leuven.
[66] Vennekens, J., Verbaeten, S., & Bruynooghe, M. (2004a). Logic programs with annotated disjunctions. In B. Demoen, & V. Lifschitz (Eds.), 24th international conference on logic programming (ICLP 2004), Vol. 3131. Lecture notes in computer science, Berlin: Springer, pp. 195-209. · Zbl 1104.68391
[67] Vennekens, J., Verbaeten, S., & Bruynooghe, M. (2004b). Logic programs with annotated disjunctions. In 24th international conference on logic programming (ICLP 2004), Vol. 3132. Lecture notes in computer science. Springer, pp. 431-445. · Zbl 1104.68391
[68] Wang, W. Y., Mazaitis, K., & Cohen, W. W. (2014). Structure learning via parameter learning. In J. Li, X.S. Wang, M.N. Garofalakis, I. Soboroff, T. Suel, & M. Wang (Eds.), 23rd ACM international conference on conference on information and knowledge management, CIKM 2014, ACM Press, pp. 1199-1208. https://doi.org/10.1145/2661829.2662022.
[69] Wellman, MP; Breese, JS; Goldman, RP, From knowledge bases to decision models, The Knowledge Engineering Review, 7, 35-53, (1992)
[70] Wielemaker, J.; Schrijvers, T.; Triska, M.; Lager, T., SWI-Prolog, Theory and Practice of Logic Programming, 12, 67-96, (2012) · Zbl 1244.68023
[71] Železný, F., Srinivasan, A., & Page, C. D. (2002). Lattice-search runtime distributions may be heavy-tailed. In Proceedings of the 12th international conference on inductive logic programming. Springer.
[72] Železný, F.; Srinivasan, A.; Page, CD, Randomised restarted search in ILP, Machine Learning, 64, 183-208, (2006) · Zbl 1103.68484
[73] Zhang, N. L., & Poole, D. (1994). A simple approach to Bayesian network computations. In 10th Canadian conference on artificial intelligence, Canadian AI 1994, pp. 171-178.
[74] Zhang, NL; Poole, DL, Exploiting causal independence in Bayesian network inference, Journal of Artificial Intelligence Research, 5, 301-328, (1996) · Zbl 0900.68384
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.