Forbidden patterns. (English) Zbl 1353.68066

Fernández-Baca, David (ed.), LATIN 2012: Theoretical informatics. 10th Latin American symposium, Arequipa, Peru, April 16–20, 2012. Proceedings. Berlin: Springer (ISBN 978-3-642-29343-6/pbk). Lecture Notes in Computer Science 7256, 327-337 (2012).
Summary: We consider the problem of indexing a collection of documents (a.k.a. strings) of total length \(n\) such that the following kind of queries are supported: given two patterns \(P ^{ + }\) and \(P ^{ -}\), list all \({n_{\text match}}\) documents containing \(P ^{ + }\) but not \(P ^{- }\). This is a natural extension of the classic problem of document listing as considered by S. Muthukrishnan [in: Proceedings of the 13th annual ACM-SIAM symposium on discrete algorithms, SODA’02. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM); New York, NY: Association for Computing Machinery (ACM). 657–665 (2002; Zbl 1093.68588)], where only the positive pattern \(P ^{ + }\) is given. Our main solution is an index of size \(O(n ^{3/2})\) bits that supports queries in \(O(|{P^+}|+|{P^-}|+n_{\mathrm{match}}+\sqrt{n})\) time.
For the entire collection see [Zbl 1239.68008].


68P15 Database theory
68P20 Information storage and retrieval of data
68W32 Algorithms on strings


Zbl 1093.68588
Full Text: DOI