zbMATH — the first resource for mathematics

RSS query algebra: towards a better news management. (English) Zbl 1320.68067
Summary: Existing XML query algebras are not fully appropriate for retrieving RSS news items, mainly for three reasons: (1) an RSS document is text-rich and its content depends on the wording and verification of the author, so semantic-aware operators are needed; (2) news items are dynamic, so time-oriented retrieval is needed; and (3) a news item may evolve over time or overlap with other news items, so identifying relationships between items is needed. In this paper, we aim to solve these issues by providing a dedicated RSS algebra based on semantic-aware operators that take RSS characteristics into account. The provided operators are application-domain specific and can be tuned according to user preferences. We also provide a set of query rewriting and equivalence rules to be used during query simplification and optimization. In addition, to validate our proposal, we develop a prototype called EasyRSSManager that allows a user to formulate RSS queries using our operators.
68P20 Information storage and retrieval of data