# zbMATH — the first resource for mathematics

Sequences, datalog, and transducers. (English) Zbl 0917.68053
Summary: This paper develops a query language for sequence databases, such as genome databases and text databases. The language, called Sequence Datalog, extends classical Datalog with interpreted function symbols for manipulating sequences. It has both a clear operational and declarative semantics, based on a new notion called the extended active domain of a database. The extended domain contains all the sequences in the database and all their subsequences. This idea leads to a clear distinction between safe and unsafe recursion over sequences: safe recursion stays inside the extended active domain, while unsafe recursion does not. By carefully limiting the amount of unsafe recursion, the paper develops a safe and expressive subset of Sequence Datalog. As part of the development, a new type of transducer is introduced, called a generalized sequence transducer. Unsafe recursion is allowed only within these generalized transducers. Generalized transducers extend ordinary transducers by allowing them to invoke other transducers as “subroutines”. Generalized transducers can be implemented in Sequence Datalog in a straightforward way. Moreover, their introduction into the language leads to simple conditions that guarantee safety and finiteness. This paper develops two such conditions. The first condition expresses exactly the class of PTIME sequence functions, and the second expresses exactly the class of elementary sequence functions. $$\copyright$$ Academic Press.

##### MSC:
 68P15 Database theory
##### Software:
Datalog; LOGIDATA+
Full Text:
##### References:
 [1] Abiteboul, S.; Hull, R.; Vianu, V., Foundations of Databases, (1995), Addison-Wesley Reading [2] Abiteboul, S.; Kanellakis, P., Object identity as a query language primitive, ACM SIGMOD International Conference on Management of Data, (1989), p. 159-173 [3] Albano, A.; Cardelli, L.; Orsini, R., Galileo: A strongly typed interactive conceptual language, ACM Trans. Database Syst., 10, 230-260, (1985) [4] Apt, K.; Blair, H.; Walker, A., Towards a theory of declarative knowledge, Foundations of Deductive Databases and Logic Programming, (1988), Morgan Kauffman Los Altos, p. 89-148 [5] Atkinson, M.; Bancilhon, F.; DeWitt, D.; Dittrich, K.; Maier, D.; Zdonik, Z., The object-oriented database manifesto, First International Conference on Deductive and Object Oriented Databases (DOOD’89), Kyoto, Japan, (1989), p. 40-57 [6] Atzeni, P., LOGIDATA_+, Lecture Notes in Computer Science, 701, (1993), Springer-Verlag Berlin/New York [7] Bancilhon, F.; Cluet, S.; Delobel, C., A query language for theO_2, Second International Workshop on Database Programming Languages (DBPL’89), (1989), p. 122-138 [8] Bellantoni, S.; Cook, S., A new recursion-theoretic characterization of the polytime functions, ACM International Symposium on Theory of Computing, (1992), p. 283-293 · Zbl 0766.68037 [9] Bonner, A. J., Hypothetical Datalog: complexity and expressibility, Theor. Comput. Sci., 76, 3-51, (1990) · Zbl 0702.68044 [10] Bonner, A. J.; Mecca, G., Querying string databases with transducers, International Workshop on Database Programming Languages (DBPL), (1997) [11] Breazu-Tannen, V.; Buneman, P.; Naqvi, S., Structural recursion as a query language, Third International Workshop on Database Programming Languages (DBPL’91), (1991), p. 9-19 [12] Cattel, R. G.G., The Object Database Standard: ODMG-93, (1994), Morgan Kaufmann San Francisco [13] Chandra, A. K.; Harel, D., Computable queries for relational databases, J. Compute. Syst. Sci., 21, 333-347, (1980) [14] Colby, L. S.; Robertson, E. L.; Saxton, L. V.; Van Gucht, D., A query language for List-based complex objects, Thirteenth ACM SIGMOD International Symposium on Principles of Database Systems (PODS’94), (1994), p. 179-189 [15] Commun. ACM, 34, (1991) [16] Ginsburg, S.; Wang, X., Pattern matching by RS-operations: towards a unified approach to querying sequence data, Eleventh ACM SIGACT SIGMOD SIGART Symposium on Principles of Database Systems (PODS’92), (1992), p. 293-300 [17] G. H. Gonnet, Text dominated databases: Theory, practice, and experience, 1994 [18] N. Goodman, Research issues in Genome databases, 1995 [19] Graedel, E.; Otto, M., Inductive definability with counting on finite structures, Proceedings of Computer Science Logic, Lecture Notes in Computer Science, 702, (1993), Springer-Verlag Berlin/ New York, p. 231-247 · Zbl 0792.68061 [20] Grahne, G.; Nykanen, M.; Ukkonen, E., Reasoning about strings in databases, Thirteenth ACM SIGMOD International Symposium on Principles of Database Systems (PODS’94), (1994), p. 303-312 [21] Grumbach, S.; Milo, T., Towards tractable algebras for bags, Twelfth ACM SIGMOD International Symposium on Principles of Database Systems (PODS’93), Washington, DC, (1993), p. 49-58 [22] Grumbach, S.; Milo, T., An algebra for POMSETS, Fifth International Conference on Database Theory, (ICDT’95), Prague, Lecture Notes in Computer Science, (1995), Springer-Verlag Berlin/New York, p. 191-207 [23] Hegelsen, C.; Sibbald, P. R., PALM—A pattern language for molecular biology, First International Conference on Intelligent Systems for Molecular Biology, (1993), p. 172-180 [24] Lloyd, J. W., Foundations of Logic Programming, (1987), Springer-Verlag Berlin/New York · Zbl 0547.68005 [25] Mecca, G., From Datalog to Sequence Datalog: Languages and Techniques for Querying Sequence Databases, (1996), Università di Roma “La Sapienza”Dipartimento di Informatica e Sistemistica [26] Papadimitriou, C. H., Computational Complexity, (1994), Addison-Wesley Reading · Zbl 0557.68033 [27] Richardson, J., Supporting lists in a data model (a timely approach), Eighteenth International Conference on Very Large Data Bases (VLDB’92), Vancouver, Canada, (1992), p. 127-138 [28] Searls, D. B., Technical report, (1993) [29] Stott Parker, D.; Simon, E.; Valduriez, P., SVP—A model capturing sets, streams and parallelism, Eighteenth International Conference on Very Large Data Bases (VLDB’92), Vancouver, Canada, (1992), p. 115-126 [30] Tarski, A., A lattice-theoretical fixpoint theorem and its applications, Pacific J. Math., 5, 285-309, (1955) · Zbl 0064.26004 [31] ACM SIGMOD Record, 19, 31-44, (1990) [32] Vandenberg, S.; De Witt, D., Algebraic support for complex objects with arrays, identity and inheritance, ACM SIGMOD International Conference on Management of Data, (1991), p. 158-167 [33] Vardi, M., The complexity of relational query languages, Fourteenth ACM SIGACT Symposium on Theory of Computing, (1988), p. 137-146 [34] Wang, X., Pattern matching by RS-operations: Towards a unified approach to querying sequence data, (1992), University of Southern California [35] Watson, J. D., Molecular Biology of the Gene, (1987), Benjamin and Cummings Menlo Park [36] Wolper, P., Temporal logic can be more expressive, Inform. Control, 56, (1983) · Zbl 0534.03009
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.