## A linear algebraic approach to Datalog evaluation. (English) Zbl 1379.68082

Summary: We propose a fundamentally new approach to Datalog evaluation. Given a linear Datalog program DB written using $$N$$ constants and binary predicates, we first translate the if-and-only-if completions of the clauses in DB into a set $$\mathbf{E}_q$$(DB) of matrix equations with a non-linear operation, where relations in $$\mathbf{M}_{\mathrm{DB}}$$, the least Herbrand model of DB, are encoded as adjacency matrices. We then translate $$\mathbf{E}_q$$(DB) into another set of purely linear matrix equations $$\tilde{\mathbf{E}}_q$$(DB). It is proved that the least solution of $$\tilde{\mathbf{E}}_q$$(DB) in the sense of matrix ordering can be converted to the least solution of $$\mathbf{E}_q$$(DB), and that the latter gives $$\mathbf{M}_{\mathrm{DB}}$$ as a set of adjacency matrices. Hence, computing the least solution of $$\tilde{\mathbf{E}}_q$$(DB) is equivalent to computing the $$\mathbf{M}_{\mathrm{DB}}$$ specified by DB. For a class of tail recursive programs and for some other types of programs, our approach achieves $$O(N^3)$$ time complexity irrespective of the number of variables in a clause, since only matrix operations costing $$O(N^3)$$ or less are used. We conducted two experiments that compute the least Herbrand models of linear Datalog programs. The first experiment computes the transitive closure of artificial data and of real network data taken from the Koblenz Network Collection. The second compares the proposed approach with state-of-the-art symbolic systems, including two Prolog systems and two ASP systems, in terms of computation time for a transitive closure program and a same-generation program. In this experiment, our linear algebraic approach is observed to run $$10^1\sim 10^4$$ times faster than the symbolic systems when the data is not sparse. Our approach is inspired by the emergence of big knowledge graphs and is expected to contribute to the realization of rich and scalable logical inference for knowledge graphs.
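
The summary does not spell out the matrix equations, so the following is only a minimal sketch of the idea for the transitive closure program, under stated assumptions. It assumes the non-linear equation takes the form $$R_2 = \min_1(R_1 + R_1 R_2)$$ (entrywise capping at 1) and that a linear counterpart is $$\tilde{R}_2 = \varepsilon R_1 + \varepsilon R_1 \tilde{R}_2$$ with a scaling factor $$\varepsilon = 1/(1 + \max_i \sum_j (R_1)_{ij})$$, whose solution is thresholded at zero. The function names, the choice of $$\varepsilon$$, and the use of exact fractions are illustrative and not taken from the paper.

```python
from fractions import Fraction


def matmul(a, b):
    """Plain O(N^3) matrix product on lists of lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def closure_by_fixpoint(r1):
    """Least solution of R2 = min1(R1 + R1*R2), iterated from the zero matrix."""
    n = len(r1)
    r2 = [[0] * n for _ in range(n)]
    while True:
        prod = matmul(r1, r2)
        nxt = [[min(1, r1[i][j] + prod[i][j]) for j in range(n)]
               for i in range(n)]
        if nxt == r2:
            return r2
        r2 = nxt


def solve(a, b):
    """Solve a*X = b by Gauss-Jordan elimination over exact fractions."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # a is invertible here, so a non-zero pivot always exists.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def closure_by_linear_equation(r1):
    """Solve (I - eps*R1) X = eps*R1 and threshold X at zero.

    eps is an assumed scaling factor that makes every row sum of eps*R1
    smaller than 1, so X equals the convergent series sum_{k>=1} (eps*R1)^k.
    Entry (i, j) of X is positive exactly when a path leads from i to j.
    """
    n = len(r1)
    eps = Fraction(1, 1 + max(sum(row) for row in r1))
    a = [[(1 if i == j else 0) - eps * r1[i][j] for j in range(n)]
         for i in range(n)]
    b = [[eps * r1[i][j] for j in range(n)] for i in range(n)]
    x = solve(a, b)
    return [[1 if x[i][j] > 0 else 0 for j in range(n)] for i in range(n)]


# Edges 0 -> 1 and 1 -> 2; node 3 is isolated.
chain = [[0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
print(closure_by_fixpoint(chain) == closure_by_linear_equation(chain))
```

Both functions use only operations costing $$O(N^3)$$ or less per step, but the fixpoint version may need several iterations while the linear version needs a single solve. Exact fractions are used here only to keep the zero threshold free of rounding issues; a floating-point solver would need a tolerance instead.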

### MSC:

 68N17 Logic programming

### Keywords:

Datalog; least model; matrix; vector space

### Software:

DLV; XSB; Datalog; SCASY; recsy
