
Learning decision trees with taxonomy of propositionalized attributes. (English) Zbl 1159.68543

Summary: We consider the problem of exploiting a taxonomy of propositionalized attributes in order to learn compact and robust classifiers. We introduce the Propositionalized Attribute Taxonomy Guided Decision Tree Learner (PAT-DTL), an inductive learning algorithm that exploits a taxonomy of propositionalized attributes as prior knowledge to generate compact decision trees. Since taxonomies are unavailable in most domains, we also introduce the Propositionalized Attribute Taxonomy Learner (PAT-Learner), which automatically constructs a taxonomy from data. PAT-DTL uses top-down and bottom-up search to find a locally optimal cut of the taxonomy that corresponds to the literals of decision rules derived from the data and the propositionalized attribute taxonomy. PAT-Learner propositionalizes attributes and hierarchically clusters the propositionalized attributes, based on the distribution of class labels that co-occur with them, to generate a taxonomy. Our experimental results on UCI repository data sets show that the proposed algorithms can generate a decision tree that is generally more compact than, and sometimes comparably accurate to, those produced by standard decision tree learners.
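The taxonomy-construction step described above can be sketched as agglomerative clustering of propositionalized attributes, merging at each step the pair whose class-label distributions are closest. This is a minimal illustrative sketch, not the authors' implementation: the attribute literals, the two-class distributions, and the use of Jensen-Shannon divergence as the distance measure are all assumptions chosen for the example.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two class-label distributions
    (an assumed distance measure; the paper's exact measure may differ)."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def build_taxonomy(attr_class_dists):
    """Hierarchically cluster attributes by repeatedly merging the pair of
    clusters with the most similar class-label distributions; returns the
    taxonomy as nested tuples."""
    # each cluster: (subtree, class distribution, number of leaves)
    clusters = [(name, dist, 1) for name, dist in attr_class_dists.items()]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = js_divergence(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        (t1, p1, w1), (t2, p2, w2) = clusters[i], clusters[j]
        # weighted mixture of the two class distributions
        merged = [(w1 * a + w2 * b) / (w1 + w2) for a, b in zip(p1, p2)]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(((t1, t2), merged, w1 + w2))
    return clusters[0][0]

# hypothetical P(class | attribute literal) for three propositionalized attributes
dists = {
    "color=red":   [0.9, 0.1],
    "color=pink":  [0.8, 0.2],
    "shape=round": [0.2, 0.8],
}
print(build_taxonomy(dists))
```

As expected, the two color literals, whose class distributions are similar, are grouped under one internal node before being joined with `shape=round`; a cut of such a tree yields the abstracted literals that PAT-DTL can test in place of the original attributes.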

MSC:

68T10 Pattern recognition, speech recognition

Software:

UCI-ml; C4.5
