Novel meta-heuristic algorithms for clustering web documents. (English) Zbl 1154.68492
Summary: Clustering web documents is one of the most important approaches for mining and extracting knowledge from the web. Recently, one of the most attractive trends in clustering high-dimensional web pages has been a shift toward learning and optimization approaches. In this paper, we propose novel hybrid harmony search-based algorithms for clustering web documents that find a globally optimal partition of the documents into a specified number of clusters. By modeling clustering as an optimization problem, we first propose a pure harmony search-based clustering algorithm that finds near-globally-optimal clusters within a reasonable time. Then, we hybridize \(K\)-means and harmony clustering in two ways to achieve better clustering. Experimental results reveal that the proposed algorithms find better clusters than similar methods and also illustrate the robustness of the hybrid clustering algorithms.
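
The idea of treating clustering as an optimization problem solved by harmony search, followed by a \(K\)-means refinement, can be illustrated with a minimal sketch. This is not the paper's algorithm. The label-vector encoding, the squared Euclidean cost, the random-relabel pitch adjustment, the parameter defaults, and all function names below are assumptions made for illustration, and the summary does not say which two hybridization schemes the authors use.

```python
import random

def centroids(points, labels, k):
    """Mean of each cluster; an empty cluster gets a zero vector (sketch only)."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]
    return [[s / counts[c] if counts[c] else 0.0 for s in sums[c]]
            for c in range(k)]

def sqdist(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))

def cost(points, labels, k):
    """Objective to minimize: total squared distance to cluster centroids."""
    cents = centroids(points, labels, k)
    return sum(sqdist(p, cents[c]) for p, c in zip(points, labels))

def harmony_cluster(points, k, hms=10, hmcr=0.9, par=0.3, iters=500, seed=0):
    """Harmony search over label vectors (one cluster label per document)."""
    rng = random.Random(seed)
    n = len(points)
    # Harmony memory: hms random candidate partitions and their costs.
    memory = [[rng.randrange(k) for _ in range(n)] for _ in range(hms)]
    costs = [cost(points, h, k) for h in memory]
    for _ in range(iters):
        new = []
        for i in range(n):
            if rng.random() < hmcr:
                # Memory consideration: reuse a label stored in memory.
                label = rng.choice(memory)[i]
                if rng.random() < par:
                    # Pitch adjustment (assumed form): random relabel.
                    label = rng.randrange(k)
            else:
                # Random selection from the full label range.
                label = rng.randrange(k)
            new.append(label)
        c = cost(points, new, k)
        worst = max(range(hms), key=costs.__getitem__)
        if c < costs[worst]:
            # Replace the worst stored harmony when the new one is better.
            memory[worst], costs[worst] = new, c
    best = min(range(hms), key=costs.__getitem__)
    return memory[best], costs[best]

def refine_with_kmeans(points, labels, k, max_steps=100):
    """One possible hybridization: polish a partition with K-means steps."""
    for _ in range(max_steps):
        cents = centroids(points, labels, k)
        new = [min(range(k), key=lambda c: sqdist(p, cents[c])) for p in points]
        if new == labels:
            break
        labels = new
    return labels
```

Calling `harmony_cluster(vectors, k)` on a list of document vectors and passing the returned labels to `refine_with_kmeans` gives one plausible two-stage pipeline: the harmony search explores the partition space globally, and the \(K\)-means steps perform local fine-tuning.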

68T10 Pattern recognition, speech recognition
68T05 Learning and adaptive systems in artificial intelligence
68M10 Network design and communication in computer systems
68W05 Nonnumerical algorithms
Software: AntClust; C4.5; KAON