XCleaner
swMATH ID: 12721
Software Authors: Brzeziński, Dariusz; Leśniewska, Anna; Morzy, Tadeusz; Piernik, Maciej
Description: XCleaner: A new method for clustering XML documents by structure. With the vastly growing data resources on the Internet, XML is one of the most important standards for document management. Not only does it provide enhancements to document exchange and storage, but it is also helpful in a variety of information retrieval tasks. Document clustering is one of the most interesting research areas that utilize semi-structural nature of XML. In this paper, we put forward a new XML clustering algorithm that relies solely on document structure. We propose the use of maximal frequent subtrees and an operator called Satisfy/Violate to divide documents into groups. The algorithm is experimentally evaluated on real and synthetic data sets with promising results.
Homepage: http://yadda.icm.edu.pl/yadda/element/bwmeta1.element.baztech-article-BATC-0009-0016
Keywords: XML; clustering; patterns
Related Software: ToXgene; Xproj
Cited in: 1 Publication

Standard Articles
1 Publication describing the Software, including 1 Publication in zbMATH
XCleaner: a new method for clustering XML documents by structure. Zbl 1318.68137. Brzeziński, Dariusz; Leśniewska, Anna; Morzy, Tadeusz; Piernik, Maciej. Year: 2011

Cited by 4 Authors
1 Brzeziński, Dariusz W.
1 Leśniewska, Anna
1 Morzy, Tadeusz
1 Piernik, Maciej

Cited in 1 Serial
1 Control and Cybernetics

Cited in 1 Field
1 Computer science (68-XX)