zbMATH — the first resource for mathematics

Alphabet partitioning for compressed rank/select and applications. (English) Zbl 1310.68060
Cheong, Otfried (ed.) et al., Algorithms and computation. 21st international symposium, ISAAC 2010, Jeju, Korea, December 15–17, 2010. Proceedings, Part II. Berlin: Springer (ISBN 978-3-642-17513-8/pbk). Lecture Notes in Computer Science 6507, 315-326 (2010).
Summary: We present a data structure that stores a string \(s[1..n]\) over the alphabet \([1..\sigma ]\) in \(nH _{0}(s) + o(n)(H _{0}(s) + 1)\) bits, where \(H _{0}(s)\) is the zero-order entropy of \(s\). This data structure supports the queries access and rank in time \({\mathcal O}(\lg \lg \sigma)\), and the select query in constant time. This result improves on previously known data structures using \(nH_0(s)+o(n\lg\sigma)\) bits, where on highly compressible instances the redundancy \(o(n\lg\sigma)\) cease to be negligible compared to the \(nH _{0}(s)\) bits that encode the data. The technique is based on combining previous results through an ingenious partitioning of the alphabet, and practical enough to be implementable. It applies not only to strings, but also to several other compact data structures. For example, we achieve (i) faster search times and lower redundancy for the smallest existing full-text self-index; (ii) compressed permutations \(\pi \) with times for \(\pi ()\) and \(\pi ^{ - 1}()\) improved to log-logarithmic; and (iii) the first compressed representation of dynamic collections of disjoint sets.
For the entire collection see [Zbl 1202.68003].

68P05 Data structures
68W32 Algorithms on strings
Full Text: DOI