Structural similarity search for mathematics retrieval. (English) Zbl 1390.68258

Carette, Jacques (ed.) et al., Intelligent computer mathematics. MKM, Calculemus, DML, and systems and projects 2013, held as part of CICM 2013, Bath, UK, July 8–12, 2013. Proceedings. Berlin: Springer (ISBN 978-3-642-39319-8/pbk). Lecture Notes in Computer Science 7961. Lecture Notes in Artificial Intelligence, 246-262 (2013).
Summary: Retrieving documents by querying their mathematical content directly can be useful in various domains, including education, engineering, patent research, physics, and medical sciences. As distinct from text retrieval, however, mathematical symbols in isolation do not contain much semantic information, and the structure of an expression must be considered as well. Unfortunately, considering the structure to calculate the relevance scores of documents results in ranking algorithms that are computationally more expensive than the typical ranking algorithms employed for text documents. As a result, current math retrieval systems either limit themselves to exact matches, or they ignore the structure completely; they sacrifice either recall or precision for efficiency. We propose instead an efficient end-to-end math retrieval system based on a structural similarity ranking algorithm. We describe novel optimization techniques to reduce the index size and the query processing time, and we experimentally validate our system in terms of correctness and efficiency. Thus, with the proposed optimizations, mathematical contents can be fully exploited to rank documents in response to mathematical queries.
For the entire collection see [Zbl 1268.68008].
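
The summary's point that symbols in isolation carry little semantic information can be illustrated with a minimal sketch. This is a generic toy comparison, not the ranking algorithm of the paper: the nested-tuple expression encoding, the function names, and the multiset Jaccard overlap are all assumptions made for illustration. It compares x^2 + y with y^2 + x, which use exactly the same symbols but differ in structure.

```python
from collections import Counter

# Hypothetical encoding: a leaf is a string, an inner node is a tuple
# (operator, child, child, ...). Not taken from the paper.

def symbols(tree):
    """Multiset of node labels, ignoring how the nodes are arranged."""
    if isinstance(tree, str):
        return Counter([tree])
    counts = Counter([tree[0]])
    for child in tree[1:]:
        counts += symbols(child)
    return counts

def subtrees(tree):
    """Multiset of all subtrees, so the arrangement of nodes is retained."""
    counts = Counter([tree])
    if not isinstance(tree, str):
        for child in tree[1:]:
            counts += subtrees(child)
    return counts

def overlap(c1, c2):
    """Multiset Jaccard overlap in [0, 1]."""
    inter = sum((c1 & c2).values())
    union = sum((c1 | c2).values())
    return inter / union if union else 1.0

def symbol_similarity(a, b):
    """Structure-blind score: compares only the bags of symbols."""
    return overlap(symbols(a), symbols(b))

def structural_similarity(a, b):
    """Structure-aware score: compares the bags of subtrees."""
    return overlap(subtrees(a), subtrees(b))

expr_a = ("+", ("^", "x", "2"), "y")   # x^2 + y
expr_b = ("+", ("^", "y", "2"), "x")   # y^2 + x

print(symbol_similarity(expr_a, expr_b))
print(structural_similarity(expr_a, expr_b))
```

The structure-blind score treats the two expressions as identical, while the subtree-based score separates them because only the leaves are shared. Enumerating every subtree is also more work per expression than counting symbols, which loosely mirrors the efficiency trade-off the summary describes.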


68P20 Information storage and retrieval of data
68U35 Computing methodologies for information systems (hypertext navigation, interfaces, decision support, etc.)


Tralics; EgoMath