×

zbMATH — the first resource for mathematics

On suffix extensions in suffix trees. (English) Zbl 1251.68081
Summary: Suffix trees are inherently asymmetric: prefix extensions only cause a few updates, while suffix extensions affect all suffixes causing a wave of updates. In his elegant linear-time on-line suffix tree algorithm Ukkonen relaxed the prevailing suffix tree representation and introduced two changes to avoid repeated structural updates and circumvent the inherent complexity of suffix extensions: (1) open ended edges that enjoy gratuitous leaf updates, and (2) the omission of implicit nodes.
In this paper we study the implicit nodes as the suffix tree evolves. We partition the suffix tree’s edges into collections of similar edges called bands, where implicit nodes exhibit identical behavior, and generalize the notion of open ended edges to allow implicit nodes to “float” within bands, only requiring updates when moving from one band to the next, adding up to only \(O(n)\) updates.
We also show that internal implicit nodes are separated from each other by explicit suffix tree nodes and that all external implicit nodes are related to the same periodicity. These new properties may be used to keep track of the waves of implicit node updates and to build the suffix tree on-line in amortized linear time, providing access to all the implicit nodes in worst-case constant time.

MSC:
68P05 Data structures
68W32 Algorithms on strings
05C62 Graph representations (geometric and intersection representations, etc.)
05C70 Edge subsets with special properties (factorization, matching, partitioning, covering and packing, etc.)
Software:
PATRICIA
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Amir, A.; Farach, M.; Idury, R.; Poutré, J.L.; Schäffer, A., Improved dynamic dictionary-matching, Inform. and comput., 119, 258-282, (1995) · Zbl 0832.68033
[2] Apostolico, A., The myriad virtues of subword trees, (), 85-96 · Zbl 0572.68067
[3] Apostolico, A.; Preparata, F., Optimal off-line detection of repetitions in a string, Theoret. comput. sci., 22, 297-315, (1983) · Zbl 0497.68052
[4] Bender, M.A.; Cole, R.; Demaine, E.D.; Farach-Colton, M.; Zito, J., Two simplified algorithms for maintaining order in a List, (), 152-164 · Zbl 1019.68527
[5] Berstel, J., Transductions and context-free languages, (1979), Teubner-Verlag, Revised version is available electronically as http://www-igm.univ-mlv.fr/ berstel/LivreTransductions/LivreTransductions14dec2009.pdf · Zbl 0424.68040
[6] Blumer, A.; Blumer, J.; Haussler, D.; Ehrenfeucht, A.; Chen, M.; Seiferas, J., The smallest automaton recognizing the subwords of a text, Theoret. comput. sci., 40, 31-55, (1985) · Zbl 0574.68070
[7] Blumer, A.; Blumer, J.; Haussler, D.; McConnel, R.; Ehrenfeucht, A., Complete inverted files for efficient text retrieval and analysis, J. assoc. comput. Mach., 34, 3, 578-595, (1987)
[8] Breslauer, D.; Hariharan, R., Optimal parallel construction of minimal suffix and factor automata, Parallel process. lett., 6, 1, 35-44, (1996)
[9] Breslauer, D.; Italiano, G.F., Near real-time suffix tree construction via the fringe marked ancestor problem, (), 156-167 · Zbl 1267.68323
[10] Breslauer, D.; Italiano, G.F., On suffix extensions in suffix trees, (), 301-312 · Zbl 1251.68081
[11] Chen, M.; Seiferas, J., Efficient and elegant subword-tree construction, (), 97-107 · Zbl 0572.68069
[12] Cohen, H.; Porat, E., Range non-overlapping indexing, (), 1044-1053 · Zbl 1273.68097
[13] Crochemore, M., Transducers and repetitions, Theoret. comput. sci., 12, 63-86, (1986) · Zbl 0615.68053
[14] Crochemore, M.; Rytter, W., Text algorithms, (1994), Oxford University Press · Zbl 0844.68101
[15] Dietz, P.F.; Sleator, D.D., Two algorithms for maintaining order in a List, (), 365-372
[16] Dori, S.; Landau, G.M., Construction of Aho Corasick automaton in linear time for integer alphabets, Inf. process. lett., 98, 2, 66-72, (2006) · Zbl 1178.68310
[17] M. Farach, Optimal suffix tree construction with large alphabets, in: FOCS, 1997, pp. 137-143.
[18] Farach-Colton, M.; Ferragina, P.; Muthukrishnan, S., On the sorting-complexity of suffix tree construction, J. ACM, 47, 6, 987-1011, (2000) · Zbl 1094.68694
[19] Fine, N.; Wilf, H., Uniqueness theorems for periodic functions, Proc. amer. math. soc., 16, 109-114, (1965) · Zbl 0131.30203
[20] Fredkin, E., Trie memory, Commun. ACM, 3, 9, 490-499, (1960)
[21] Giegerich, R.; Kurtz, S., From ukkonen to mccreight and weiner: a unifying view of linear-time suffix tree construction, Algorithmica, 19, 3, 331-353, (1997) · Zbl 0895.68056
[22] Gusfield, D., Algorithms on strings, trees, and sequences - computer science and computational biology, (1997), Cambridge University Press · Zbl 0934.68103
[23] Gusfield, D.; Stoye, J., Linear time algorithms for finding and representing all the tandem repeats in a string, J. comput. syst. sci., 69, 4, 525-546, (2004) · Zbl 1076.68111
[24] Kärkkäinen, J.; Sanders, P.; Burkhardt, S., Linear work suffix array construction, J. ACM, 53, 6, 918-936, (2006) · Zbl 1326.68111
[25] Kim, D.K.; Sim, J.S.; Park, H.; Park, K., Constructing suffix arrays in linear time, J. discrete algorithms, 3, 2-4, 126-142, (2005) · Zbl 1101.68505
[26] Ko, P.; Aluru, S., Space efficient linear time construction of suffix arrays, J. discrete algorithms, 3, 2-4, 143-156, (2005) · Zbl 1101.68506
[27] R.M. Kolpakov, G. Kucherov, Finding maximal repetitions in a word in linear time, in: FOCS, 1999, pp. 596-604.
[28] Kosaraju, S., Computation of squares in a string, (), 146-150
[29] Lothaire, M., Algebraic combinatorics on words, (2005), Cambridge University Press · Zbl 1001.68093
[30] Manber, U.; Myers, E.W., Suffix arrays: A new method for on-line string searches, SIAM J. comput., 22, 5, 935-948, (1993) · Zbl 0784.68027
[31] McCreight, E., A space economical suffix tree construction algorithm, J. assoc. comput. Mach., 23, 262-272, (1976) · Zbl 0329.68042
[32] Morrison, D.R., PATRICIA — practical algorithm to retrieve information coded in alphanumeric, J. ACM, 15, 4, 514-534, (1968)
[33] Salson, M.; Lecroq, T.; Léonard, M.; Mouchard, L., A four-stage algorithm for updating a burrows – wheeler transform, Theoret. comput. sci., 410, 43, 4350-4359, (2009) · Zbl 1187.68685
[34] Salson, M.; Lecroq, T.; Léonard, M.; Mouchard, L., Dynamic extended suffix arrays, J. discrete algorithms, 8, 2, 241-257, (2010) · Zbl 1201.68044
[35] Shibuya, T., Constructing the suffix tree of a tree with a large alphabet, (), 225-236 · Zbl 0964.68034
[36] Tsakalidis, A.K., Maintaining order in a generalized linked List, Acta inf., 21, 101-112, (1984) · Zbl 0515.68038
[37] Ukkonen, E., On-line construction of suffix trees, Algorithmica, 14, 3, 249-260, (1995) · Zbl 0831.68027
[38] P. Weiner, Linear pattern matching algorithms, in: Proc. 14th Symposium on Switching and Automata Theory, 1973, pp. 1-11.
[39] Westbrook, J., Fast incremental planarity testing, (), 342-353
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.