A dynamic density-based clustering algorithm appropriate to large-scale text processing. (Chinese. English summary) Zbl 1289.68088

Summary: Because traditional density-based clustering algorithms have high time complexity and complicated parameter settings, a new density definition is proposed that needs only one parameter and can find clusters of different densities. The authors also extend the algorithm to a two-stage dynamic density-based clustering algorithm that can process large-scale text corpora. Experiments on a synthetic dataset, a large-scale dataset from UCI, an English text corpus and a Chinese text corpus show that the proposed algorithm is easy to parameterize, clusters efficiently, and can be applied to clustering large-scale text data.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68T10 Pattern recognition, speech recognition
68T50 Natural language processing
68U15 Computing methodologies for text processing; mathematical typography