
Inductive learning models with missing values. (English) Zbl 1135.68532

Summary: A new approach to handling missing attribute values in inductive learning algorithms is introduced. Three fundamental issues are studied: the splitting criterion, the allocation of values to missing attribute values, and the prediction of new observations. A formal definition of the splitting criterion is given; it takes missing attribute values into account and generalizes the classical definition. For the second issue, multiple values are assigned to each missing attribute value using a decision-theoretic approach. Each of these values carries an associated confidence and an error parameter, where the error parameter measures how close the assigned value is to the original value of the attribute. Applying the splitting criterion yields a decision tree, built from a training set with or without missing attribute values, which can then be used to predict the class of an observation that may itself contain missing attribute values. Hence there are four scenarios (training and prediction, each with or without missing values); the three involving missing attribute values are studied and experimental results are presented.
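
The summary does not reproduce the paper's definitions, so the following Python sketch is an illustration only, under an assumed C4.5-style reading of the ingredients described above: an example whose attribute value is missing is allocated to several candidate values, each weighted by an attached confidence, both when evaluating the splitting criterion and when classifying a new observation. All identifiers (MISSING, weighted_gain, predict, candidates) are hypothetical and are not the authors' notation; in particular, the paper's error parameter has no direct analogue here and would further weight the candidates.

# A minimal sketch, assuming C4.5-style fractional allocation of examples
# with missing values; this is not the authors' algorithm, only an analogue.
from collections import defaultdict
from math import log2

MISSING = None  # hypothetical marker for a missing attribute value

def entropy(dist):
    """Shannon entropy of a weighted class distribution {class: weight}."""
    total = sum(dist.values())
    return -sum(w / total * log2(w / total) for w in dist.values() if w > 0)

def weighted_gain(rows, attr, label, candidates):
    """Information gain of splitting on attr; a row whose attr value is
    missing is sent to every (value, confidence) pair in candidates."""
    parent = defaultdict(float)
    branches = defaultdict(lambda: defaultdict(float))
    for row in rows:
        parent[row[label]] += 1.0
        if row[attr] is not MISSING:
            branches[row[attr]][row[label]] += 1.0
        else:
            for value, confidence in candidates:
                branches[value][row[label]] += confidence
    total = sum(sum(b.values()) for b in branches.values())
    children = sum(sum(b.values()) / total * entropy(b)
                   for b in branches.values())
    return entropy(parent) - children

def predict(tree, x, candidates):
    """Class distribution for observation x. A missing (or unseen) test
    value sends x down every candidate branch, weighted by confidence."""
    if "class" in tree:  # leaf node: {"class": {label: weight}}
        return tree["class"]
    v = x.get(tree["attr"], MISSING)
    if v is not MISSING and v in tree["branches"]:
        return predict(tree["branches"][v], x, candidates)
    dist = defaultdict(float)
    for value, confidence in candidates:
        if value in tree["branches"]:
            for c, w in predict(tree["branches"][value], x, candidates).items():
                dist[c] += confidence * w
    return dist

For instance, with rows = [{"a": "x", "c": 0}, {"a": None, "c": 1}] and candidates = [("x", 0.5), ("y", 0.5)], weighted_gain(rows, "a", "c", candidates) distributes the second row half to each branch before computing the usual entropy-based gain.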

MSC:

68T05 Learning and adaptive systems in artificial intelligence
62C05 General considerations in statistical decision theory
62-07 Data analysis (statistics) (MSC2010)

Software:

UCI-ml; C4.5

References:

[1] Little, R. J. A.; Rubin, D. B., Statistical Analysis with Missing Data (1987), John Wiley & Sons Inc · Zbl 0665.62004
[2] Mitchell, T. M., Machine Learning (1997), McGraw-Hill · Zbl 0913.68167
[3] Kryszkiewicz, M., Rough set approach to incomplete information systems, Inform. Sci., 112, 39-49 (1998) · Zbl 0951.68548
[4] Kryszkiewicz, M., Rules in incomplete information systems, Inform. Sci., 113, 271-292 (1999) · Zbl 0948.68214
[5] Kryszkiewicz, M., Rough set approach to rules generation from incomplete information systems, Encyclopedia Comput. Sci. Technol., 44, 319-346 (2001)
[6] J.W. Grzymala-Busse, Rough set strategies to data with missing attribute values, in: Proceedings of the Workshop on Foundations and New Directions in Data Mining, associated with the third IEEE International Conference on Data Mining, 19-22 November, Melbourne, FL, USA, 2003, pp. 56-63
[7] J.W. Grzymala-Busse, S. Siddhaye, Rough set approaches to rule induction from incomplete data, in: Proceedings of the IPMU’2004, the 10th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, 4-9 July, Perugia, Italy, vol. 2, 2004, pp. 923-930
[8] J.W. Grzymala-Busse, Three approaches to missing attribute values. A rough set perspective, in: Workshop on Foundations of Data Mining, associated with the fourth IEEE International Conference on Data Mining, 1-4 November, Brighton, UK, 2004
[9] Grzymala-Busse, J. W.; Hu, M., A comparison of several approaches to missing values in data mining, (Ziarko, W.; Yao, Y. Y., Rough Sets and Current Trends in Computing, Lecture Notes in Computer Science, vol. 2005 (2001), Springer) · Zbl 1014.68558
[10] Hu, M.; Salvucci, S. M.; Cohen, M. P., Evaluation of some popular imputation algorithms, (Section on Survey Research Methods (2000), American Statistical Association)
[11] Scheffer, J., Dealing with missing data, Res. Lett. Inf. Math. Sci., 3, 153-160 (2002)
[12] Schafer, J. L., Analysis of Incomplete Multivariate Data (1997), Chapman & Hall, London · Zbl 0997.62510
[13] W. Liu, A. White, S. Thompson, M. Bramer, Techniques for dealing with missing values in classification, in: International Symposium on Intelligent Data Analysis, 1997
[14] Quinlan, J., Unknown attribute values in induction, (Proceedings of the Sixth International Machine Learning Workshop (1989), Morgan Kaufmann, San Mateo, CA), 164-168
[15] Hunt, E. B.; Marin, J.; Stone, P. J., Experiments in Induction (1966), Academic Press, New York
[16] Quinlan, J., Discovering rules by induction from large collections of examples, (Michie, D., Expert Systems in the Micro Electronic Age (1979))
[17] Quinlan, J., Learning efficient classification procedures, (Michalski, R. S.; Carbonell, J. G.; Mitchell, T. M., Machine Learning: An Artificial Intelligence Approach (1983), Tioga Press, Palo Alto, CA)
[18] Quinlan, J., Induction of decision trees, Mach. Learn., 1, 81-106 (1986)
[19] Breiman, L.; Friedman, J. H.; Olshen, R. A.; Stone, C. J., Classification and Regression Trees (1984), Wadsworth, Belmont, CA · Zbl 0541.62042
[20] Quinlan, J., The effect of noise on concept learning, (Michalski, R. S.; Carbonell, J. G.; Mitchell, T. M., Machine Learning: An Artificial Intelligence Approach, vol. 2 (1986), Morgan Kaufmann, San Mateo, CA)
[21] Cestnik, B.; Kononenko, I., ASSISTANT 86: A knowledge-elicitation tool for sophisticated users, (Bratko, I.; Lavrac, N., Progress in Machine Learning (1987), Sigma Press, Wilmslow, UK)
[22] Quinlan, J., C4.5: Programs for Machine Learning (1992), Morgan Kaufmann, Los Gatos, CA
[23] Farhangfar, A.; Kurgan, L.; Pedrycz, W., Experimental analysis of methods for imputation of missing values in databases, (Priddy, K. L., Intelligent Computing: Theory and Applications II, Proceedings of the SPIE, vol. 5421 (2004)), 172-182
[24] Cios, K. J.; Kurgan, L. A., Hybrid inductive machine learning: An overview of CLIP algorithms, (Jain, L. C.; Kacprzyk, J., New Learning Paradigms in Soft Computing (2001), Physica-Verlag, Springer), 276-322 · Zbl 0987.68066
[25] Cios, K. J.; Kurgan, L. A., CLIP4: Hybrid inductive machine learning algorithm that generates inequality rules, Inform. Sci., 163, 1-3, 37-83 (2004)
[26] Duda, R. O.; Hart, P. E., Pattern Classification and Scene Analysis (1973), John Wiley
[27] Shannon, C. E., A mathematical theory of communication, Bell Syst. Tech. J., 27, 379-423 (1948) · Zbl 1154.94303
[28] Zheng, Z.; Low, B. T., Classifying unseen cases with many missing values, (Zhong, N.; Zhou, L., PAKDD 99, LNAI, vol. 1574 (1999)), 370-375
[29] I. Kononenko, I. Bratko, E. Rokar, Experiments in automatic learning of medical diagnostic rules, ISSEK Workshop, Bled, 1984
[30] I. Fortes, R. Morales-Bueno, Ll. Mora, F. Triguero, A decision theory approach to work with missing attribute values in inductive learning algorithms, in: Proc. of COMPSTAT2000 (14th Conference of the International Association for Statistical Computing), Utrecht, 2000
[31] G. Ramos, R. Morales, Formalización de los Algoritmos TDIDT y CIDIM, Techn. Report LCC-ITI 99/01, Dept. Computer Science, Málaga University, 1999
[32] G. Ramos, R. Morales, A. Villalba, CIDIM (Control of Induction of Sample Division Method), Una mejora de los algoritmos TDIDT, Techn. Report LCC-ITI 97/08, Dept. Computer Science, Málaga University, 2000
[33] Berger, J. O., Statistical Decision Theory and Bayesian Analysis (1988), Springer-Verlag, New York
[34] Hyafil, L.; Rivest, R. L., Constructing optimal binary decision trees is NP-complete, Inf. Process. Lett., 5, 1, 15-17 (1976) · Zbl 0333.68029
[35] Wilson, D. R.; Martinez, T. R., Improved heterogeneous distance functions, J. Artificial Intelligence Res., 6, 1, 1-34 (1997) · Zbl 0894.68118
[36] Friedman, J. H., A recursive partitioning decision rule for non-parametric classification, IEEE Trans. Comput., 404-408 (1977) · Zbl 0403.62036
[37] Holte, R. C., Very simple classification rules perform well on most commonly used datasets, Mach. Learn., 11, 63-91 (1993) · Zbl 0850.68278
[38] C. Blake, E. Keogh, C.J. Merz, UCI repository of machine learning databases, University of California, Irvine, Dept. of Information and Computer Sciences, http://www.ics.uci.edu/mlearn/MLRepository.html
[39] G. Ramos, Nuevos Desarrollos en Aprendizaje Inductivo, Tesis Doctoral, Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, 2001
[40] G. Ramos, R. Morales, A new method for induction decision trees by sampling, Neurocolt Workshop on Applications of Learning Theory, Bellaterra, Barcelona, 2000