×

kProbLog: an algebraic Prolog for machine learning. (English) Zbl 1457.68237

Summary: We introduce kProbLog as a declarative logical language for machine learning. kProbLog is a simple algebraic extension of Prolog with facts and rules annotated by semi-ring labels. It allows to elegantly combine algebraic expressions with logic programs. We introduce the semantics of kProbLog, its inference algorithm, its implementation and provide convergence guarantees. We provide several code examples to illustrate its potential for a wide range of machine learning techniques. In particular, we show the encodings of state-of-the-art graph kernels such as Weisfeiler-Lehman graph kernels, propagation kernels and an instance of graph invariant kernels, a recent framework for graph kernels with continuous attributes. However, kProbLog is not limited to kernel methods and it can concisely express declarative formulations of tensor-based algorithms such as matrix factorization and energy-based models, and it can exploit semirings of dual numbers to perform algorithmic differentiation. Furthermore, experiments show that kProbLog is not only of theoretical interest, but can also be applied to real-world datasets. At the technical level, kProbLog extends aProbLog (an algebraic Prolog) by allowing multiple semirings to coexist in a single program and by introducing meta-functions for manipulating algebraic values.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68N17 Logic programming
PDF BibTeX XML Cite
Full Text: DOI

References:

[1] Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving G, Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens J, Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke M, Yu, Y., & Zheng, X. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. http://tensorflow.org/, software available from tensorflow.org.
[2] Bastien, F., Lamblin, P., Pascanu, R., Bergstra, J., Goodfellow IJ, Bergeron, A., Bouchard, N., & Bengio, Y. (2012). Theano: New features and speed improvements. Deep Learning and Unsupervised Feature Learning NIPS 2012 Workshop.
[3] Baydin, A. G., Pearlmutter, B. A., Radul, A. A., & Siskind, J. M. (2015). Automatic differentiation in machine learning: A survey. arXiv:1502.05767
[4] Bryant, RE, Symbolic Boolean manipulation with ordered binary-decision diagrams, ACM Computing Surveys, 24, 293-318, (1992)
[5] Ceri, S; Gottlob, G; Tanca, L, What you always wanted to know about Datalog (and never dared to ask), IEEE Transactions on Knowledge and Data Engineering, 1, 146-166, (1989)
[6] Collobert, R., Bengio, S., & Mariéthoz, J. (2002). Torch: A modular machine learning software library. IDIAP: Tech. rep.
[7] Costa, F., & De Grave, K. (2010). Fast neighborhood subgraph pairwise distance kernel. In Proceedings of the 27th international conference on machine learning (ICML-10), June 21-24, Haifa, Israel, pp. 255-262, http://www.icml2010.org/papers/347.pdf
[8] Darwiche, A. (2011). SDD: A new canonical representation of propositional knowledge bases. In IJCAI 2011, Proceedings of the 22nd international joint conference on artificial intelligence, Barcelona, Catalonia, Spain, July 16-22, pp. 819-826. doi:10.5591/978-1-57735-516-8/IJCAI11-143
[9] Darwiche, A; Marquis, P, A knowledge compilation map, Journal of Artificial Intelligence Research, 17, 229-264, (2002) · Zbl 1045.68131
[10] De Marneffe, M. C., & Manning, C. D. (2008). The stanford typed dependencies representation. InColing 2008: Proceedings of the workshop on cross-framework and cross-domain parser evaluation, Association for Computational Linguistics, pp. 1-8. · Zbl 0339.68004
[11] De Raedt, L. (2008). Logical and relational learning. Berlin: Springer. · Zbl 1203.68145
[12] De Raedt, L., Kimmig, A., Toivonen, H. (2007). Problog: A probabilistic prolog and its application in link discovery. In IJCAI 2007, proceedings of the 20th international joint conference on artificial intelligence, Hyderabad, India, January 6-12, pp. 2462-2467, http://ijcai.org/Proceedings/07/Papers/396.pdf · Zbl 1079.68086
[13] Raedt, L; Kersting, K; Natarajan, S; Poole, D, Statistical relational artificial intelligence: logic, probability, and computation, Synthesis Lectures on Artificial Intelligence and Machine Learning, 10, 1-189, (2016) · Zbl 1352.68005
[14] Debnath, AK; Lopez de Compadre, RL; Debnath, G; Shusterman, AJ; Hansch, C, Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. correlation with molecular orbital energies and hydrophobicity, Journal of Medicinal Chemistry, 34, 786-797, (1991)
[15] Droste, M., & Kuich, W. (2009). Semirings and formal power series. Berlin: Springer.
[16] Eisner, J. (2002). Parameter estimation for probabilistic finite-state transducers. In Proceedings of the 40th annual meeting on association for computational linguistics, Association for Computational Linguistics, pp. 1-8.
[17] Eisner, J., Blatz, J. (2007). Program transformations for optimization of parsing algorithms and other weighted logic programs. In Proceedings of formal grammar, pp. 45-85.
[18] Eisner, J., & Filardo, N. W. (2011). Dyna: Extending datalog for modern AI. In Datalog reloaded. Springer, pp. 181-220.
[19] Eisner, J., Goldlust, E., & Smith, N. A. (2004). Dyna: A declarative language for implementing dynamic programs. In Proceedings of the 42nd annual meeting of the association for computational linguistics (ACL), Companion Volume, Barcelona, pp. 218-221.
[20] Esparza, J., Luttenberger, M., & Schlund, M. (2014). Fpsolve: A generic solver for fixpoint equations over semirings. In Proceedings of 19th international conference implementation and application of automata, CIAA 2014, Giessen, Germany, July 30-August 2, pp. 1-15. doi:10.1007/978-3-319-08846-4_1. · Zbl 1302.68330
[21] Frasconi, P; Costa, F; Raedt, L; Grave, K, Klog: A language for logical and relational learning with kernels, Artificial Intelligence, 217, 117-143, (2014) · Zbl 1405.68288
[22] Garcez, Ad., Besold, T. R., de Raedt, L., Földiak, P., Hitzler, P., Icard, T., Kühnberger, K. U., Lamb, L. C., Miikkulainen, R., & Silver, D. L. (2015). Neural-symbolic learning and reasoning: contributions and challenges. In Proceedings of the AAAI spring symposium on knowledge representation and reasoning: Iintegrating symbolic and neural approaches, Stanford.
[23] Garcez, A. S., Lamb, L. C., & Gabbay, D. M. (2008). Neural-symbolic cognitive reasoning. Berlin: Springer.
[24] Gärtner, T., Flach, P., & Wrobel, S. (2003). On graph kernels: Hardness results and efficient alternatives. In Learning Theory and Kernel Machines. Springer, pp. 129-143 · Zbl 1274.68312
[25] Gärtner, T; Lloyd, JW; Flach, PA, Kernels and distances for structured data, Machine Learning, 57, 205-232, (2004) · Zbl 1079.68086
[26] Getoor, L., & Taskar, B. (Eds.). (2007). Introduction to statistical relational learning. Adaptive computation and machine learning. Cambridge, MA: MIT Press. · Zbl 1141.68054
[27] Golub, G. H., & Van Loan, C. F. (2012). Matrix computations (Vol. 3). Baltimore: JHU Press.
[28] Green, T. J., Karvounarakis, G., & Tannen, V. (2007). Provenance semirings. In Proceedings of the 26th ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. ACM.
[29] Griewank, A., & Walther, A. (2008). Evaluating derivatives: Principles and techniques of algorithmic differentiation (2nd ed.). Philadelphia: Society for Industrial and Applied Mathematics. · Zbl 1159.65026
[30] Kashima, H; Tsuda, K; Inokuchi, A, Marginalized kernels between labeled graphs, ICML, 3, 321-328, (2003)
[31] Kazius, J., McGuire, R., & Bursi, R. (2005). Derivation and validation of toxicophores for mutagenicity prediction. Journal of Medicinal Chemistry, 48(1), 312-320. · Zbl 1352.68005
[32] Kim, M., & Candan, K. S. (2011). Approximate tensor decomposition within a tensor-relational algebraic framework. In Proceedings of the 20th ACM conference on information and knowledge management, CIKM 2011, Glasgow, United Kingdom, October 24-28, pp. 1737-1742. doi:10.1145/2063576.2063827 · Zbl 1405.68288
[33] Kimmig, A., Van den Broeck, G., & De Raedt, L. (2011). An algebraic prolog for reasoning about possible worlds. In Proceedings of the twenty-fifth AAAI conference on artificial intelligence, AAAI 2011, San Francisco, California, USA, August 7-11, http://www.aaai.org/ocs/index.php/AAAI/AAAI11/paper/view/3685.
[34] Kimmig, A., Van den Broeck, G., & De Raedt, L. (2012). Algebraic model counting. CoRR abs/1211.4475, arXiv:1211.4475 · Zbl 1436.68335
[35] Kisa, D., Van den Broeck, G., Choi, A., & Darwiche, A. (2014). Probabilistic sentential decision diagrams. In Proceedings of the 14th international conference on principles of knowledge representation and reasoning (KR). · Zbl 1390.68640
[36] Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8), 30-37.
[37] Kuich, W. (1997). Semirings and formal power series: Their relevance to formal languages and automata (pp. 609-677). In Handbook of formal languages: Springer.
[38] Landwehr, N., Passerini, A., De Raedt, L., & Frasconi, P. (2006). kfoil: Learning simple relational kernels, pp. 389-394.
[39] Li, X., & Roth, D. (2002). Learning question classifiers. In Proceedings of the 19th international conference on computational linguistics-Volume 1, Association for Computational Linguistics.
[40] Mahé, P., Ueda, N., Akutsu, T., Perret, J. L., & Vert, J. P. (2004). Extensions of marginalized graph kernels. In Proceedings of the twenty-first international conference on machine learning, ACM, p. 70.
[41] Milch, B., Marthi, B., Russell, S. J., Sontag, D., Ong, D. L., & Kolobov, A. (2005). BLOG: Probabilistic models with unknown objects, pp. 1352-1359, http://ijcai.org/Proceedings/05/Papers/1546.pdf.
[42] Muggleton, S; Raedt, LD; Poole, D; Bratko, I; Flach, PA; Inoue, K; etal., ILP turns 20-biography and future challenges, Machine Learning, 86, 3-23, (2012) · Zbl 1243.68014
[43] Neumann, M., Patricia, N., Garnett, R., & Kersting, K. (2012). Efficient graph kernels by randomization. In Proceedings of machine learning and knowledge discovery in databases—European conference, ECML PKDD 2012, Bristol, UK, September 24-28. Part I, pp. 378-393. doi:10.1007/978-3-642-33460-3_30.
[44] Nickel, M., Tresp, V., & Kriegel, H. P. (2011). A three-way model for collective learning on multi-relational data. In Proceedings of the 28th international conference on machine learning (ICML-11), pp. 809-816. · Zbl 1243.68014
[45] Nilsson, U., & Maluszynski, J. (1990). Logic, programming and Prolog. Chichester: Wiley. · Zbl 0722.68023
[46] Orsini, F., Frasconi, P., & De Raedt, L. (2015). Graph invariant kernels. In Proceedings of the twenty-fourth international joint conference on artificial intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, pp. 3756-3762, http://ijcai.org/Abstract/15/528.
[47] Quinlan, JR, Learning logical definitions from relations, Machine Learning, 5, 239-266, (1990)
[48] Richardson, M; Domingos, PM, Markov logic networks, Machine Learning, 62, 107-136, (2006)
[49] Sammut, C. (1993). The origins of inductive logic programming: A prehistoric tale. In Proceedings of the 3rd international workshop on inductive logic programming. J Stefan Institute, pp. 127-147.
[50] Sato, T. (1995). A statistical learning method for logic programs with distribution semantics. In Logic programming, proceedings of the twelfth international conference on logic programming, Tokyo, Japan, June 13-16, pp. 715-729.
[51] Sato, T., & Kameya, Y. (1997). PRISM: A language for symbolic-statistical modeling. In Proceedings of the fifteenth international joint conference on artificial intelligence, IJCAI 97, Nagoya, Japan, August 23-29, Vol. 2, pp. 1330-1339. http://ijcai.org/Proceedings/97-2/Papers/078.pdf.
[52] Shervashidze, N., Schweitzer, P., van Leeuwen, E. J., Mehlhorn, K., & Borgwardt, K. M. (2011). Weisfeiler-lehman graph kernels. Journal of Machine Learning Research 12:2539-2561, http://dl.acm.org/citation.cfm?id=2078187. · Zbl 1280.68194
[53] Emden, MH; Kowalski, RA, The semantics of predicate logic as a programming language, Journal of the ACM, 23, 733-742, (1976) · Zbl 0339.68004
[54] Van Laer, W., & De Raedt, L. (2001). How to upgrade propositional learners to first order logic: A case study. In Machine learning and its applications. Springer, pp. 102-126. · Zbl 0980.68700
[55] Vishwanathan, SVN; Schraudolph, NN; Kondor, R; Borgwardt, KM, Graph kernels, The Journal of Machine Learning Research, 11, 1201-1242, (2010) · Zbl 1242.05112
[56] Vlasselaer, J., Van den Broeck, G., Kimmig, A., Meert, W., & De Raedt, L. (2015). Anytime inference in probabilistic logic programs with tp-compilation. In Proceedings of the twenty-fourth international joint conference on artificial intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, pp. 1852-1858, http://ijcai.org/Abstract/15/263. · Zbl 1386.68174
[57] Whaley, J., Avots, D., Carbin, M., & Lam, M. S. (2005). Using datalog with binary decision diagrams for program analysis. In Programming languages and systems: Springer. · Zbl 1159.68386
[58] Zhang, D., & Lee, W. S. (2003). Question classification using support vector machines. In SIGIR 2003: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, July 28-August 1, 2003, Toronto, Canada, pp. 26-32. doi:10.1145/860435.860443. · Zbl 1242.05112
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.