Discovering repetitions in strings. (English) Zbl 0604.68074

Combinatorial algorithms on words, Proc. NATO Adv. Res. Workshop, Maratea/Italy 1984, NATO ASI Ser., Ser. F 12, 279-288 (1985).
[For the entire collection see Zbl 0564.00027.]
In the present paper we employ the fingerprinting method to solve the following string matching problem. Given a string y we want to find the earliest repetition, i.e. the shortest w and x such that \(y=wxxz\). We call this the repetition problem.
We give a very simple algorithm for discovering repetitions by use of fingerprints. If \(y\in \Sigma^*\), \(| \Sigma | =s\), \(| y| =m\) then the expected running time is O(m log\({}_ 2m \log_ sm)\). But by further use of the power of encoding several operations into one machine-word operation, the running time can be reduced to O(m log\({}_ 2m)\), and perhaps even further. An additional feature of this algorithm is its parallelizability. An \(n\leq m\) processor machine would produce approximately an \(n^{-1}\) reduction in running time. This would produce practical improvements even for small values of n.


68P10 Searching and sorting
68Q25 Analysis of algorithms and problem complexity
68T99 Artificial intelligence


Zbl 0564.00027