LCMine: An efficient algorithm for mining discriminative regularities and its application in supervised classification. (English) Zbl 1207.68128

Summary: We introduce an efficient algorithm for mining discriminative regularities on databases with mixed and incomplete data. Unlike previous methods, our algorithm does not apply an a priori discretization on numerical features; it extracts regularities from a set of diverse decision trees, induced with a special procedure. Experimental results show that a classifier based on the regularities obtained by our algorithm attains higher classification accuracy, using fewer discriminative regularities than those obtained by previous pattern-based classifiers. Additionally, we show that our classifier is competitive with traditional and state-of-the-art classifiers.

MSC:

68P15 Database theory
68T10 Pattern recognition, speech recognition
68W05 Nonnumerical algorithms

Software:

C4.5; UCI-ml
