×

Rank discrimination measures for enforcing monotonicity in decision tree induction. (English) Zbl 1355.68232

Summary: Monotone classification is a relatively recent topic in machine learning in which the classification function to learn is asked to guarantee a sort of monotonicity of the class with respect to attribute values. Nevertheless, real datasets are quite far from being monotone and this can sharply limit the performance of purely monotone classifiers while standard classifiers are simply insensitive to monotonicity. Here we focus on rank discrimination measures to be used in decision tree induction, i.e., functions able to measure the discrimination power of an attribute with respect to the class taking into account the monotonicity of the class with respect to the attribute. Three new measures are studied in detail and a hierarchical construction model is derived allowing the formal definition of a general rank discrimination measure. Our measures have been compared with other well-known proposals, quantifying both the accuracy and the monotonicity of the resulting binary decision tree classifiers.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

UCI-ml; C4.5
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Aczél, J.; Daróczy, Z., On measures of information and their characterizations, (Math. Sci. Eng., 115 (1975), Academic Press: Academic Press New York/San Francisco/London) · Zbl 0345.94022
[2] Ben-David, A., Monotonicity maintenance in information-theoretic machine learning algorithms, Machine Learn., 19, 29-43 (1995)
[3] Ben-David, A.; Sterling, L.; Pao, Y. H., Learning and classification of monotonic ordinal concepts, Comput. Intell., 5, 1, 45-49 (1989)
[4] Ben-David, A.; Sterling, L.; Tran, T., Adding monotonicity to learning algorithms may impair their accuracy, Expert Syst. Appl., 36, 3, Part 2, 6627-6634 (2009)
[6] Breiman, L.; Friedman, J.; Stone, C. J.; Olshen, R. A., Classification and Regression Trees (1984), Chapman and Hall/CRC: Chapman and Hall/CRC Boca Raton · Zbl 0541.62042
[8] Cao-Van, K.; De Baets, B., Consistent representation of rankings, (de Swart, H.; Orlowska, E.; Schmidt, G.; Roubens, M., Theory and Applications of Relational Structures as Knowledge Instruments. Theory and Applications of Relational Structures as Knowledge Instruments, Lecture Notes in Computer Science, vol. 2929 (2003), Springer: Springer Berlin/Heidelberg), 1966-1967 · Zbl 1029.00017
[9] Cao-Van, K.; De Baets, B., Growing decision trees in an ordinal setting, Int. J. Intell. Syst., 18, 7, 733-750 (2003) · Zbl 1048.68066
[10] Cieslak, D. A.; Chawla, N. V., Learning decision trees for unbalanced data, (Daelemans, W.; Goethals, B.; Morik, K., Machine Learning and Knowledge Discovery in Databases. Machine Learning and Knowledge Discovery in Databases, Lecture Notes in Computer Science, vol. 5211 (2008), Springer: Springer Berlin, Heidelberg), 241-256
[11] Demšar, J., Statistical comparisons of classifiers over multiple data sets, J. Machine Learning Res., 7, 1-30 (2006) · Zbl 1222.68184
[14] Greco, S.; Matarazzo, B.; Slowinski, R., Rough approximation by dominance relations, Int. J. Intell. Syst., 17, 2, 153-171 (2002) · Zbl 0997.68135
[15] Greco, S.; Matarazzo, B.; Slowinski, R., Rough sets methodology for sorting problems in presence of multiple attributes and criteria, Eur. J. Oper. Res., 138, 2, 247-259 (2002) · Zbl 1008.90509
[16] Greco, S.; Matarazzo, B.; Slowinski, R., Customer satisfaction analysis based on rough set approach, Zeitschrift für Betriebswirtschaft, 77, 325-339 (2007)
[17] Hu, Q.; Che, X.; Zhang, L.; Zhang, D.; Guo, M.; Yu, D., Rank entropy based decision trees for monotonic classification, IEEE Trans. Knowl. Data Eng., 24, 11, 2052-2064 (2012)
[18] Hu, Q.; Guo, M.; Yu, D.; Liu, J., Information entropy for ordinal classification, SCIENCE CHINA Inform. Sci., 53, 1188-1200 (2010) · Zbl 1497.62023
[19] Hu, Q.; Pan, W.; Zhang, L.; Zhang, D.; Song, Y.; Guo, M.; Yu, D., Feature selection for monotonic classification, IEEE Trans. Fuzzy Syst., 20, 1, 69-81 (2012)
[20] Hüllermeier, E.; Vanderlooy, S., Why fuzzy decision trees are good rankers, IEEE Trans. Fuzzy Syst., 17, 6, 1233-1244 (2009)
[22] Lievens, S.; De Baets, B., Supervised ranking in the WEKA environment, Inform. Sci., 180, 24, 4763-4771 (2010) · Zbl 1206.68257
[23] Makino, K.; Suda, T.; Ono, H.; Ibaraki, T., Data analysis by positive decision trees, IEICE Transactions on Information and Systems, E82-D, 1, 76-88 (1999)
[27] Marsala, C.; Petturiti, D., Hierarchical model for rank discrimination measures, (van der Gaag, L. C., Symbolic and Quantitative Approaches to Reasoning with Uncertainty. Symbolic and Quantitative Approaches to Reasoning with Uncertainty, Lecture Notes in Computer Science, vol. 7958 (2013), Springer: Springer Berlin, Heidelberg), 412-423 · Zbl 1390.68546
[28] Milstein, I.; Ben-David, A.; Potharst, R., Generating noisy ordinal monotone datasets, Artif. Intell. Res., 3, 1 (2014)
[29] Potharst, R.; Bioch, J. C., Decision trees for ordinal classification, Intell. Data Anal., 4, 2, 97-111 (2000) · Zbl 1088.68740
[30] Potharst, R.; Feelders, A. J., Classification trees for problems with monotonicity constraints, SIGKDD Explor. Newslett., 4, 1, 1-10 (2002)
[31] Quinlan, J. R., C4.5: Programs for Machine Learning (1993), Morgan Kaufmann Publishers Inc.: Morgan Kaufmann Publishers Inc. San Francisco, CA, USA
[32] Rademaker, M.; De Baets, B., Optimal restoration of stochastic monotonicity with respect to cumulative label frequency loss functions, Inform. Sci., 181, 4, 747-757 (2011)
[34] van der Kamp, R.; Feelders, A.; Barile, N., Isotonic classification trees, (Adams, N. M.; Robardet, C.; Siebes, A.; Boulicaut, J.-F., Advances in Intelligent Data Analysis VIII. Advances in Intelligent Data Analysis VIII, Lecture Notes in Computer Science, vol. 5772 (2009), Springer: Springer Berlin, Heidelberg), 405-416
[35] Velikova, M.; Daniels, H., Decision trees for monotone price models, Comput. Manage. Sci., 1, 231-244 (2004) · Zbl 1115.91314
[36] Velikova, M.; Daniels, H.; Samulski, M., Partially monotone networks applied to breast cancer detection on mammograms, (Kurková, V.; Neruda, R.; Koutník, J., Artificial Neural Networks - ICANN 2008. Artificial Neural Networks - ICANN 2008, Lecture Notes in Computer Science, vol. 5163 (2008), Springer: Springer Berlin/Heidelberg), 917-926
[38] Xia, F.; Zhang, W.; Li, F.; Yang, Y., Ranking with decision tree, Knowl. Inform. Syst., 17, 3, 381-395 (2008)
[39] Yuan, Y.; Shaw, M. J., Induction of fuzzy decision trees, Fuzzy Sets Syst., 69, 2, 125-139 (1995)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.