zbMATH — the first resource for mathematics

Finding common motifs with gaps using finite automata. (English) Zbl 1160.68683
Ibarra, Oscar H. (ed.) et al., Implementation and application of automata. 11th international conference, CIAA 2006, Taipei, Taiwan, August 21–23, 2006. Proceedings. Berlin: Springer (ISBN 978-3-540-37213-4/pbk). Lecture Notes in Computer Science 4094, 69-77 (2006).
Summary: We present an algorithm that uses finite automata to find the common motifs with gaps occurring in all strings belonging to a finite set \(S = \{ S _{1}, S _{2},\dots,S _{ r }\}\). In order to find these common motifs we must first identify the factors that exist in each string. Therefore the algorithm begins by constructing a factor automaton for each string \(S _{ i }\). To find the common factors of all the strings, the algorithm needs to gather all the factors from the strings together in one data structure and this is achieved by computing an automaton that accepts the union of the above-mentioned automata. Using this automaton we are able to create a new factor alphabet. Based on this factor alphabet a finite automaton is created for each string \(S _{ i }\) that accepts sequences of all non overlapping factors residing in each string. The intersection of the latter automata produces the finite automaton which accepts all the common subsequences with gaps over the factor alphabet that are present in all the strings of the set \(S = \{ S _{1}, S _{2},\dots,S _{ r }\}\). These common subsequences are the common motifs of the strings.
For the entire collection see [Zbl 1113.68007].

68W05 Nonnumerical algorithms
68Q45 Formal languages and automata
68W32 Algorithms on strings
Full Text: DOI