# zbMATH — the first resource for mathematics

Clustering rules: A comparison of partitioning and hierarchical clustering algorithms. (English) Zbl 1104.62073
Summary: Previous research has resulted in a number of different algorithms for rule discovery. Two approaches discussed here, the ‘all-rules’ algorithm and multi-objective metaheuristics, both result in the production of a large number of partial classification rules, or ‘nuggets’, for describing different subsets of the records in the class of interest. This paper describes the application of a number of different clustering algorithms to these rules, in order to identify similar rules and to better understand the data.
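As a rough illustration of what clustering rules can look like in practice, the following Python sketch groups rules by the records they cover. It is not taken from the paper: the summary does not say how rule similarity is measured, so the Jaccard dissimilarity on covered record sets, the farthest-first initialisation, the simple alternating k-medoids loop, and the toy data are all assumptions made for the example.

```python
def jaccard_dissimilarity(a: frozenset, b: frozenset) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def k_medoids(items, k, dist, max_iter=100):
    """Alternating k-medoids on a precomputed dissimilarity matrix.

    Assumption: medoids are seeded by farthest-first traversal from item 0,
    which is only one of many possible initialisations.
    """
    n = len(items)
    d = [[dist(items[i], items[j]) for j in range(n)] for i in range(n)]

    # Seed: start at item 0, then repeatedly add the item farthest
    # from all medoids chosen so far.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(d[i][m] for m in medoids)))

    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each item joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            nearest = min(medoids, key=lambda m: d[i][m])
            clusters[nearest].append(i)
        # Update step: each cluster's new medoid minimises the total
        # dissimilarity to its members (an empty cluster keeps its medoid).
        new_medoids = [
            min(members, key=lambda c: sum(d[c][j] for j in members)) if members else m
            for m, members in clusters.items()
        ]
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    return medoids, list(clusters.values())


# Hypothetical data: each rule is represented by the set of record ids it covers.
covered = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({2, 3, 4}),
    frozenset({10, 11, 12}),
    frozenset({10, 11, 13}),
]

medoids, clusters = k_medoids(covered, k=2, dist=jaccard_dissimilarity)
print(medoids, clusters)
```

On this toy input the first three rules, which cover overlapping records, end up in one cluster and the last two in another. A PAM- or CLARANS-style swap search would replace the update step; the loop above is only the simplest variant.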

##### MSC:
 62H30 Classification and discrimination; cluster analysis (statistical aspects)
 68T05 Learning and adaptive systems in artificial intelligence
 68T37 Reasoning under uncertainty in the context of artificial intelligence
 90C59 Approximation methods and heuristics in mathematical programming
##### Keywords:
CLARANS; confidence; coverage; genetic algorithms; silhouettes
##### Software:
clusfind; MOCK; UCI-ml
##### References:
 [1] Bayardo, Jr., R. J. and Agrawal, R.: Mining the most interesting rules, in S. Chaudhuri and D. Madigan (eds.), Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, California, United States, 1999, pp. 145–154.
 [2] Bayardo, Jr., R. J., Agrawal, R. and Gunopulos, D.: Constraint-based rule mining in large, dense databases, in Proceedings of the 15th International Conference on Data Engineering, Sydney, Australia, 1999, pp. 188–197.
 [3] Blake, C. and Merz, C.: UCI Repository of machine learning databases (1998), http://www.ics.uci.edu/~mlearn/MLRepository.html.
 [4] Chu, S. C., Roddick, J. F. and Pan, J. S.: A comparative study and extensions to k-medoids algorithms, in Fifth International Conference on Optimization, Hong Kong, China, 2001, pp. 1708–1717.
 [5] de la Iglesia, B., Philpott, M. S., Bagnall, A. J. and Rayward-Smith, V. J.: Data mining rules using multi-objective evolutionary algorithms, in R. Sarker, R. Reynolds, H. Abbass, K. C. Tan, B. McKay, D. Essam and T. Gedeon (eds.), Proceedings of 2003 IEEE Congress on Evolutionary Computation, Canberra, Australia, 2003, pp. 1552–1559.
 [6] de la Iglesia, B., Reynolds, A. and Rayward-Smith, V. J.: Developments on a Multi-Objective Metaheuristic (MOMH) algorithm for finding interesting sets of classification rules, in C. A. Coello Coello, A. H. Aguirre and E. Zitzler (eds.), Evolutionary Multi-Criterion Optimization: Third International Conference, EMO 2005, Guanajuato, Mexico, 2005, pp. 826–840. · Zbl 1109.68645
 [7] Deb, K.: Multi-Objective Optimization using Evolutionary Algorithms, Wiley, Chichester, England, 2001. · Zbl 0970.90091
 [8] Deb, K., Agrawal, S., Pratap, A. and Meyarivan, T.: A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II, in M. Schoenauer, K. Deb, G. Rudolph, X. Yao, E. Lutton, J. J. Merelo and H.-P. Schwefel (eds.), Proceedings of the Parallel Problem Solving from Nature VI Conference, Lecture Notes in Computer Science No. 1917, Paris, France, 2000, pp. 849–858.
 [9] Gower, J. C. and Legendre, P.: Metric and Euclidean properties of dissimilarity coefficients, J. Classif. 3 (1986), 5–48. · Zbl 0592.62048 · doi:10.1007/BF01896809
 [10] Handl, J. and Knowles, J.: Evolutionary multiobjective clustering, in X. Yao, E. Burke, J. Lozano, J. Smith, J. Merelo-Guervós, J. Bullinaria, J. Rowe, P. Tino, A. Kabán and H.-P. Schwefel (eds.), Proceedings of the Eighth International Conference on Parallel Problem Solving from Nature (PPSN VIII), Birmingham, UK, 2004, pp. 1081–1091.
 [11] Handl, J. and Knowles, J.: Exploiting the trade-off – the benefits of multiple objectives in data clustering, in C. A. Coello Coello, A. H. Aguirre and E. Zitzler (eds.), Evolutionary Multi-Criterion Optimization: Third International Conference, EMO 2005, Guanajuato, Mexico, 2005, pp. 547–560. · Zbl 1109.68601
 [12] Jaccard, P.: Étude comparative de la distribution florale dans une portion des Alpes et des Jura, Bull. Soc. Vaud. Sci. Nat. 37 (1901), 547–579.
 [13] Kaufman, L. and Rousseeuw, P. J.: Finding Groups in Data: An Introduction to Cluster Analysis, Wiley Series in Probability and Mathematical Statistics, Wiley, 1990. · Zbl 1345.62009
 [14] MacQueen, J. B.: Some methods for classification and analysis of multivariate observations, in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, 1967, pp. 281–297. · Zbl 0214.46201
 [15] Ng, R. T. and Han, J.: CLARANS: a method for clustering objects for spatial data mining, IEEE Trans. Knowl. Data Eng. 14(5) (2002), 1003–1016. · Zbl 05109550 · doi:10.1109/TKDE.2002.1033770
 [16] Reynolds, A. P., Richards, G. and Rayward-Smith, V. J.: The application of K-medoids and PAM to the clustering of rules, in Z. R. Yang, H. Yin and R. Everson (eds.), Proceedings of the Fifth International Conference on Intelligent Data Engineering and Automated Learning (IDEAL’04), 2004, pp. 173–178.
 [17] Richards, G. and Rayward-Smith, V. J.: Discovery of association rules in tabular data, in N. Cercone, T. Y. Lin and X. Wu (eds.), Proceedings of IEEE First International Conference on Data Mining, San Jose, California, USA, 2001, pp. 465–473.
 [18] Sokal, R. R. and Michener, C. D.: A statistical method for evaluating systematic relationships, Univ. Kans. Sci. Bull. 38 (1958), 1409–1438.
 [19] Sokal, R. R. and Sneath, P. H. A.: Principles of Numerical Taxonomy, Freeman, San Francisco, 1963. · Zbl 0285.92001
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.