×

Regular queries on graph databases. (English) Zbl 1375.68047

Summary: Graph databases are currently one of the most popular paradigms for storing data. One of the key conceptual differences between graph and relational databases is the focus on navigational queries that ask whether some nodes are connected by paths satisfying certain restrictions. This focus has driven the definition of several different query languages and the subsequent study of their fundamental properties. We define the graph query language of Regular Queries, which is a natural extension of unions of conjunctive 2-way regular path queries (UC2RPQs) and unions of conjunctive nested 2-way regular path queries (UCN2RPQs). Regular queries allow expressing complex regular patterns between nodes. We formalize regular queries as nonrecursive Datalog programs extended with the transitive closure of binary predicates. This language has been previously considered, but its algorithmic properties are not well understood. Our main contribution is to show elementary tight bounds for the containment problem for regular queries. Specifically, we show that this problem is 2Expspace-complete. For all extensions of regular queries known to date, the containment problem turns out to be non-elementary. Together with the fact that evaluating regular queries is not harder than evaluating UCN2RPQs, our results show that regular queries achieve a good balance between expressiveness and complexity, and constitute a well-behaved class that deserves further investigation.

MSC:

68P15 Database theory
68Q17 Computational difficulty of problems (lower bounds, completeness, difficulty of approximation, etc.)

Software:

GraphLog; XPath; nSPARQL
PDFBibTeX XMLCite
Full Text: DOI Link

References:

[1] Abiteboul, S., Hull, R., Vianu, V.: Foundations of databases Addison-Wesley (1995) · Zbl 0848.68031
[2] Angles, R., Gutierrez, C.: Survey of graph database models. ACM Comput. Surv 40(1) (2008) · Zbl 1015.68083
[3] Arenas, M., Gutierrez, C., Miranker, D.P., Pérez, J., Sequeda, J.F.: Querying semantic data on the web? ACM SIGMOD Rec 41(4), 6-17 (2013) · doi:10.1145/2430456.2430458
[4] Arenas, M., Pérez, J.: Querying semantic web data with sparql. In: PODS, pp 305-316 (2011)
[5] Barcelo, P., Libkin, L., Lin, A.W., Wood, P.T.: Expressive languages for path queries over graph-structured data. ACM TODS 37(4), 31 (2012) · doi:10.1145/2389241.2389250
[6] Barceló, P., Pérez, J., Reutter, J.L.: Relative expressiveness of nested regular expressions. In: AMW, pp 180-195 (2012) · Zbl 0456.68123
[7] Barcelo, P., Romero, M., Vardi, M.Y.: Semantic acyclicity on graph databases. In: PODS, pp 237-248 (2013) · Zbl 1407.68123
[8] Barceló Baeza, P.: Querying graph databases. In: PODS, pp 175-188 (2013)
[9] Bienvenu, M., Calvanese, D., Ortiz, M., Simkus, M.: Nested regular path queries in description logics. In: KR (2014) · Zbl 1336.68243
[10] Bienvenu, M., Ortiz, M., Šimkus, M.: Conjunctive regular path queries in lightweight description logics. In: IJCAI, pp 761-767 (2013) · Zbl 1336.68243
[11] Bienvenu, M., Ortiz, M., Šimkus, M.: Navigational queries based on frontier-guarded datalog Preliminary results. In: Alberto Mendelzon International Workshop on Foundations of Data Management, p 162 (2015) · Zbl 0632.68094
[12] Bourhis, P., Krötzsch, M., Rudolph, S.: How to best nest regular path queries. In: Description Logics (2014)
[13] Bourhis, P., Krötzsch, M., Rudolph, S.: Query containment for highly expressive datalog fragments. preprint arXiv: 1406.7801 (2014)
[14] Bourhis, P., Krötzsch, M., Rudolph, S.: Reasonable highly expressive query languages. In: IJCAI, pp 2826-2832 (2015)
[15] Buneman, P.: Semistructured data. In: Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, pp 117-121 (1997)
[16] Buneman, P., Davidson, S., Hillebrand, G., Suciu, D.: A query language and optimization techniques for unstructured data. ACM SIGMOD Rec 25(2), 505-516 (1996) · doi:10.1145/235968.233368
[17] Calvanese, D., De Giacomo, G., Lenzerini, M., Vardi, M.: Containment of conjunctive regular path queries with inverse. In: 7th International Conference on Principles of Knowledge Representation and Reasoning (KR), pp 176-185 (2000) · Zbl 1015.68083
[18] Calvanese, D., De Giacomo, G., Lenzerini, M., Vardi, M.: Rewriting of regular expressions and regular path queries. J. Comput. Syst. Sci 64(3), 443-465 (2002) · Zbl 1015.68083 · doi:10.1006/jcss.2001.1805
[19] Calvanese, D., De Giacomo, G., Vardi, M.Y.: Decidable containment of recursive queries. Theor. Comput. Sci 336(1), 33-56 (2005) · Zbl 1101.68513 · doi:10.1016/j.tcs.2004.10.031
[20] Calvanese, D., Giacomo, G.D., Lenzerini, M.: Conjunctive query containment and answering under description logic constraints. ACM Trans. Comput. Log 9(3) (2008) · Zbl 1367.68084
[21] Chandra, A.K., Merlin, P.M.: Optimal implementation of conjunctive queries in relational data bases. In: Proceedings of the ninth annual ACM symposium on Theory of computing, pp 77-90 (1977)
[22] Chaudhuri, S., Krishnamurthy, R., Potamianos, S., Shim, K.: Optimizing queries with materialized views. In: 2013 IEEE 29Th International Conference on Data Engineering (ICDE), p 1995 · Zbl 0944.68046
[23] Chaudhuri, S., Vardi, M.Y.: On the equivalence of recursive and nonrecursive datalog programs. In: Proceedings of the eleventh ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems, pp 55-66 (1992) · Zbl 1101.68513
[24] Chekuri, C., Rajaraman, A.: Conjunctive query containment revisited. Theor. Comput. Sci 239(2), 211-229 (2000) · Zbl 0944.68046 · doi:10.1016/S0304-3975(99)00220-0
[25] Consens, M., Mendelzon, A.: Graphlog: A visual forMalism for real life recursion. In: 9th ACM Symposium on Principles of Database Systems (PODS), pp 404-416 (1990)
[26] Courcelle, B.: The monadic second-order logic of graphs. i. recognizable sets of finite graphs. Inf. Comput 85(1), 12-75 (1990) · Zbl 0722.03008 · doi:10.1016/0890-5401(90)90043-H
[27] Courcelle, B.: Recursive queries and context-free graph grammars. Theor. Comput. Sci 78(1), 217-244 (1991) · Zbl 0716.68026 · doi:10.1016/0304-3975(51)90009-6
[28] Cruz, I., Mendelzon, A., Wood, P.: A graphical query language supporting recursion. In: ACM Special Interest Group on Management of Data 1987 Annual Conference (SIGMOD), pp 323-330 (1987)
[29] Dalmau, V., Kolaitis, P.G., Vardi, M.Y.: Constraint satisfaction, bounded treewidth, and finite-variable logics. In: Principles and Practice of Constraint Programming-CP 2002, pp 310-326. Springer (2002)
[30] Fagin, R., Vardi, M.Y.: The theory of data dependenciesan overview. In: Automata, Languages and Programming, pp 1-22. Springer (1984) · Zbl 0563.68078
[31] Fan, W.: Graph pattern matching revised for social network analysis. In: Proceedings of the 15th International Conference on Database Theory, pp 8-21 (2012) · Zbl 1360.68394
[32] Fernandez, M., Florescu, D., Levy, A., Suciu, D.: Verifying integrity constraints on web sites. In: IJCAI, vol. 99, pp 614-619 (1999)
[33] Fletcher, G.H., Gyssens, M., Leinders, D., Surinx, D., Van den Bussche, J., Van Gucht, D., Vansummeren, S., Wu, Y.: Relative expressive power of navigational querying on graphs. Inf. Sci 298, 390-406 (2015) · Zbl 1360.68394 · doi:10.1016/j.ins.2014.11.031
[34] Florescu, D., Levy, A.Y., Mendelzon, A.O.: Database techniques for the world-wide web: A survey. SIGMOD Rec 27(3), 59-74 (1998) · doi:10.1145/290593.290605
[35] Friedman, M., Levy, A.Y., Millstein, T.D., etal: Navigational plans for data integration. AAAI/IAAI 1999, 67-73 (1999)
[36] Imielinski, T., Lipski, W.: Incomplete information in relational databases. J. ACM 31(4), 761-791 (1984) · Zbl 0632.68094 · doi:10.1145/1634.1886
[37] Kostylev, E.V., Reutter, J.L., Vrgoc, D.: Containment of data graph queries. In: ICDT, pp 131-142 (2014) · Zbl 1380.68168
[38] Lacroix, Z., Murthy, H., Naumann, F., Raschid, L.: Links and paths through life sciences data sources. In: Data Integration in the Life Sciences, pp 203-211. Springer (2004) · Zbl 1122.68404
[39] Libkin, L., Martens, W., Vrgoč, D.: Querying graph databases with xpath. In: Proceedings of the 16th International Conference on Database Theory, pp 129-140 (2013)
[40] Libkin, L., Reutter, J., Vrgoč, D.: Trial for rdf: adapting graph query languages for rdf data. In: PODS, pp 201-212 (2013)
[41] Pérez, J., Arenas, M., Gutierrez, C.: nSPARQL: A navigational language for RDF. J. Web Semant 8(4), 255-270 (2010) · doi:10.1016/j.websem.2010.01.002
[42] Reutter, J.L., Romero, M., Vardi, M.Y.: Regular queries on graph databases. In: 18th International Conference on Database Theory (ICDT 2015), vol. 31, pp 177-194 (2015) · Zbl 1365.68219
[43] Rudolph, S., Krötzsch, M.: Flag & check: Data access with monadically defined queries. In: Proceedings of the 32nd symposium on Principles of database systems, pp 151-162 (2013)
[44] Sagiv, Y., Yannakakis, M.: Equivalences among relational expressions with the union and difference operators. J. ACM (JACM) 27(4), 633-655 (1980) · Zbl 0456.68123 · doi:10.1145/322217.322221
[45] Ullman, J.D.: Principles of database and Knowledge-Base systems computer science press (1989)
[46] Vorobyov, S., Voronkov, A.: Complexity of nonrecursive logic programs with complex values. In: PODS, pp 244-253 (1998)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.