×

Mining competent case bases for case-based reasoning. (English) Zbl 1168.68487

Summary: Case-based reasoning relies heavily on the availability of a highly competent case base to make high-quality decisions. However, good case bases are difficult to come by. In this paper, we present a novel algorithm for automatically mining a high-quality case base from a raw case set that can preserve and sometimes even improve the competence of case-based reasoning. In this paper, we analyze two major problems in previous case-mining algorithms. The first problem is caused by noisy cases such that the nearest neighbor cases of a problem may not provide correct solutions. The second problem is caused by uneven case distribution, such that similar problems may have dissimilar solutions. To solve these problems, we develop a theoretical framework for the error bound in case-based reasoning, and propose a novel case-base mining algorithm guided by the theoretical results that returns a high-quality case base from raw data efficiently. We support our theory and algorithm with extensive empirical evaluation using different benchmark data sets.

MSC:

68T05 Learning and adaptive systems in artificial intelligence

Software:

UCI-ml
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] A. Aamodt, H.A. Sandtorv, O.M. Winnem, Combining case based reasoning and data mining—a way of revealing and reusing rams experience, in: Proceedings of the International Conference on Safety and Reliability, ESREL’98, 1998, pp. 1345-1351; A. Aamodt, H.A. Sandtorv, O.M. Winnem, Combining case based reasoning and data mining—a way of revealing and reusing rams experience, in: Proceedings of the International Conference on Safety and Reliability, ESREL’98, 1998, pp. 1345-1351
[2] Aha, D. W., Tolerating noisy, irrelevant and novel attributes in instance-based learning algorithms, International Journal of Man-Machine Studies, 36, 267-287 (1992)
[3] Aha, D. W.; Kibler, D. F.; Albert, M. K., Instance-based learning algorithms, Machine Learning, 6, 1, 37-66 (1991)
[4] Bach, F. R.; Jordan, M. I., Kernel independent component analysis, Journal of Machine Learning Research, 3, 1-48 (2002) · Zbl 1088.68689
[5] Blake, C. L.; Merz, C. J., UCI repository of machine learning databases (1998), University of California, Irvine, Department of Information and Computer Sciences
[6] Bonzano, A.; Cunningham, P.; Smyth, B., Using introspective learning to improve retrieval in CBR: A case study in air traffic control, (Proceedings of the 2nd International Conference on Case-Based Reasoning (ICCBR-97). Proceedings of the 2nd International Conference on Case-Based Reasoning (ICCBR-97), July 25-27 1997. Proceedings of the 2nd International Conference on Case-Based Reasoning (ICCBR-97). Proceedings of the 2nd International Conference on Case-Based Reasoning (ICCBR-97), July 25-27 1997, Lecture Notes Artif. Intell. (1997), Springer), 291-302
[7] Brighton, H.; Mellish, C., On the consistency of information filters for lazy learning algorithms, (Principles of Data Mining and Knowledge Discovery, Third European Conference, PKDD ’99, Proceedings. Principles of Data Mining and Knowledge Discovery, Third European Conference, PKDD ’99, Proceedings, Prague, Czech Republic, September 15-18, 1999 (1999), Springer)
[8] C.E. Brodley, Automatic algorithm/model class selection, in: ICML, 1993, pp. 17-24; C.E. Brodley, Automatic algorithm/model class selection, in: ICML, 1993, pp. 17-24
[9] R.M. Cameron-Jones, Instance selection by encoding length heuristic with random mutation hill climbing, in: Proceedings of the Eighth Australian Joint Conference on Artificial Intelligence, 1995, pp. 99-106; R.M. Cameron-Jones, Instance selection by encoding length heuristic with random mutation hill climbing, in: Proceedings of the Eighth Australian Joint Conference on Artificial Intelligence, 1995, pp. 99-106
[10] Chang, C.-L., Finding prototypes for nearest neighbor classifiers, IEEE Transactions on Computers, 23, 1179-1184 (1974) · Zbl 0292.68044
[11] Delany, S. J.; Cunningham, P., An analysis of case-base editing in a spam filtering system, (Proceedings of the European Case-based Reasoning Conference (ECCBR 2004) (2004), Springer), 128-141
[12] S.J. Delany, P. Cunningham, B. Smyth. Ecue: A spam filter that uses machine leaming to track concept drift, in: ECAI, 2006, p. 627; S.J. Delany, P. Cunningham, B. Smyth. Ecue: A spam filter that uses machine leaming to track concept drift, in: ECAI, 2006, p. 627
[13] P. Domingos, Rule induction and instance-based learning: A unified approach, in: Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, IJCAI 95, Montréal, Québec, Canada, 1995, pp. 1226-1232; P. Domingos, Rule induction and instance-based learning: A unified approach, in: Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, IJCAI 95, Montréal, Québec, Canada, 1995, pp. 1226-1232
[14] Duda, R. O.; Hart, P. E.; Stork, D. G., Pattern Classification (2000), Wiley-Interscience
[15] Fisher, R. A., The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7, 179-188 (1936)
[16] Fyfe, C.; Corchado, J. M., Automating the construction of cbr systems using kernel methods, International Journal of Intelligent Systems, 16, 4, 571-586 (2001) · Zbl 0990.68594
[17] Garey, M. R.; Johnson, D. S., Computers and Intractability, A Guide to the Theory of NP-Completeness (1979), W.H. Freeman · Zbl 0411.68039
[18] Gates, G. W., The reduced nearest neighbor rule, IEEE Transactions on Information Theory, 18, 431-433 (1972)
[19] Hart, P. E., The condensed nearest neighbor rule, IEEE Transactions on Information Theory, 14, 515-516 (1968)
[20] Hastie, T.; Buja, A.; Tibshirani, R., Penalized discriminant analysis, The Annals of Statistics, 23, 73-102 (1995) · Zbl 0821.62031
[21] Hastie, T.; Tibshirani, R.; Buja, A., Flexible discriminant analysis by optimal scoring, Journal of the American Statistical Association, 89, 1255-1270 (1994) · Zbl 0812.62067
[22] D.F. Kibler, D.W. Aha, Learning representative exemplars of concepts: an initial case study, in: P. Langley (Ed.), Proceedings of the Fourth International Workshop on Machine Learning, Irvine, 1987, Palo Alto, CA, 1987, pp. 24-30, MK; D.F. Kibler, D.W. Aha, Learning representative exemplars of concepts: an initial case study, in: P. Langley (Ed.), Proceedings of the Fourth International Workshop on Machine Learning, Irvine, 1987, Palo Alto, CA, 1987, pp. 24-30, MK
[23] Kolodner, J., Case-Based Reasoning (1993), Morgan Kaufmann: Morgan Kaufmann San Mateo, CA
[24] (Leake, D. B., Case-Based Reasoning: Experiences, Lessons, and Future Directions (1996), AAAI Press: AAAI Press Menlo Park, CA)
[25] E. McKenna, B. Smyth, Competence-guided case-base editing techniques, in: Proceedings of Advances in Case-Based Reasoning, 5th European Workshop, EWCBR 2000, Trento, Italy, 2000, pp. 186-197; E. McKenna, B. Smyth, Competence-guided case-base editing techniques, in: Proceedings of Advances in Case-Based Reasoning, 5th European Workshop, EWCBR 2000, Trento, Italy, 2000, pp. 186-197
[26] E. McKenna, B. Smyth, Competence-guided editing methods for lazy learning, in: ECAI 2000, Proceedings of the 14th European Conference on Artificial Intelligence, Berlin, Germany, 2000, pp. 60-64; E. McKenna, B. Smyth, Competence-guided editing methods for lazy learning, in: ECAI 2000, Proceedings of the 14th European Conference on Artificial Intelligence, Berlin, Germany, 2000, pp. 60-64
[27] McSherry, D., Automating case selection in the construction of a case library, Knowledge-Based Systems, 13, 2-3, 133-140 (2000)
[28] Mika, S., Kernel Fisher discriminants (2002), PhD thesis, University of Technology, Berlin · Zbl 1149.68413
[29] Mika, S.; Rätsch, G.; Weston, J.; Schölkopf, B.; Müller, K.-R., Fisher discriminant analysis with kernels, (Hu, Y.-H.; Larsen, J.; Wilson, E.; Douglas, S., Neural Networks for Signal Processing IX (1999), IEEE), 41-48
[30] R. Pan, Q. Yang, L. Li, Case retrieval using nonlinear feature-space transformation, in: Advances in Case-Based Reasoning, 7th European Conference (ECCBR-04), 2004, pp. 361-374; R. Pan, Q. Yang, L. Li, Case retrieval using nonlinear feature-space transformation, in: Advances in Case-Based Reasoning, 7th European Conference (ECCBR-04), 2004, pp. 361-374
[31] R. Pan, Q. Yang, J. Junfeng Pan, L. Li, Competence driven case-base mining, in: Proceedings, The Twentieth National Conference on Artificial Intelligence and the Seventeenth Innovative Applications of Artificial Intelligence Conference, Pittsburgh, Pennsylvania, USA, 2005, pp. 228-233; R. Pan, Q. Yang, J. Junfeng Pan, L. Li, Competence driven case-base mining, in: Proceedings, The Twentieth National Conference on Artificial Intelligence and the Seventeenth Innovative Applications of Artificial Intelligence Conference, Pittsburgh, Pennsylvania, USA, 2005, pp. 228-233
[32] D.W. Patterson, N. Rooney, M. Galushka, S.S. Anand, Towards dynamic maintenance of retrieval knowledge in cbr, in: Proceedings of the Sixteenth International Florida Artificial Intelligence Research Society Conference, 2002, pp. 126-131; D.W. Patterson, N. Rooney, M. Galushka, S.S. Anand, Towards dynamic maintenance of retrieval knowledge in cbr, in: Proceedings of the Sixteenth International Florida Artificial Intelligence Research Society Conference, 2002, pp. 126-131
[33] Rasmussen, C. E.; Neal, R. M.; Hinton, G.; van Camp, D.; Revow, M.; Ghahramani, Z.; Kustra, R.; Tibshirani, R., Delve data for evaluating learning in valid experiments (2003)
[34] Ritter, G. L.; Woodruff, H. B.; Lowry, S. R.; Isenhour, T. L., An algorithm for a selective nearest neighbor decision rule, IEEE Transactions on Information Theory, 21, 665-669 (1975) · Zbl 0323.68023
[35] Roth, V.; Steinhage, V., Nonlinear discriminant analysis using kernel functions, (Solla, S. A.; Leen, T. K.; Müller, K.-R., Advances in Neural Information Processing Systems, vol. 12 (1999), MIT Press), 568-574
[36] K. Saadi, N.L.C. Talbot, G.C. Cawley, Optimally regularised kernel fisher discriminant analysis, in: 17th International Conference on Pattern Recognition (ICPR (2)), 2004, pp. 427-430; K. Saadi, N.L.C. Talbot, G.C. Cawley, Optimally regularised kernel fisher discriminant analysis, in: 17th International Conference on Pattern Recognition (ICPR (2)), 2004, pp. 427-430 · Zbl 1124.68096
[37] Salzberg, S., A nearest hyperrectangle learning method, Machine Learning, 6, 251-276 (1991)
[38] Schölkopf, B.; Smola, A. J.; Müller, K.-R., Nonlinear component analysis as a kernel eigenvalue problem, Neural Computation, 10, 5, 1299-1319 (1998)
[39] Skalak, D. B., Prototype and feature selection by sampling and random mutation hill climbing algorithms, (International Conference on Machine Learning (1994), Morgan Kaufmann), 293-301
[40] Smyth, B.; Keane, M. T., Remembering to forget: A competence-preserving case deletion policy for case-based reasoning systems, (Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI-95). Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI-95), August 1995 (1995), Morgan Kaufmann: Morgan Kaufmann San Francisco, CA), 377-382
[41] Smyth, B.; Keane, M. T., Modelling the competence of case-bases, (Cunningham, P.; Smyth, B.; Keane, M., Proceedings of the Fourth European Workshop on Case-Based Reasoning (1998), Springer: Springer Berlin), 208-220
[42] Smyth, B.; McKenna, E., Building compact competent case-bases, (Proceedings of the Third International Conference on Case-Based Reasoning (1999), Springer: Springer Berlin), 329-342
[43] Smyth, B.; McKenna, E., Competence models and the maintenance problem, Computational Intelligence, 17, 2, 235-249 (2001)
[44] Tomek, I., An experiment with the edited nearest-neighbor rule, IEEE Transactions on Systems, Man, and Cybernetics, 6, 448-452 (1976) · Zbl 0332.68081
[45] Vapnik, V., The Nature of Statistical Learning Theory (1995), Springer: Springer New York · Zbl 0833.62008
[46] Watson, I., Applying Case-Based Reasoning: Techniques for Enterprise Systems (1997), Morgan Kaufmann: Morgan Kaufmann San Mateo, CA · Zbl 0889.68136
[47] Wilson, D. L., Asymptotic properties of nearest neighbor rules using edited data, IEEE Transactions on Systems, Man, and Cybernetics, 2, 408-421 (1972) · Zbl 0276.62060
[48] D.R. Wilson, T.R. Martinez, Instance pruning techniques, in: Proceedings of the Fourteenth International Conference on Machine Learning (ICML 1997), Nashville, Tennessee, USA, 1997, pp. 403-411; D.R. Wilson, T.R. Martinez, Instance pruning techniques, in: Proceedings of the Fourteenth International Conference on Machine Learning (ICML 1997), Nashville, Tennessee, USA, 1997, pp. 403-411
[49] Wilson, D. R.; Martinez, T. R., Reduction techniques for instance-based learning algorithms, Machine Learning, 38, 3, 257-286 (2000) · Zbl 0954.68126
[50] Witten, I. H.; Frank, E., Data Mining: Practical Machine Learning Tools and Techniques (2005), Morgan Kaufmann: Morgan Kaufmann San Francisco, CA · Zbl 1076.68555
[51] Xiong, T.; Ye, J.; Li, Q.; Janardan, R.; Cherkassky, V., Efficient kernel discriminant analysis via qr decomposition, (Saul, L. K.; Weiss, Y.; Bottou, L., Advances in Neural Information Processing Systems, vol. 17 (2005), MIT Press: MIT Press Cambridge, MA), 1529-1536
[52] Yang, Q.; Zhu, J., A case-addition policy for case-base maintenance, Computational Intelligence, 17, 2, 250-262 (2001)
[53] Zhang, J., Selecting typical instances in instance-based learning, (Machine Learning: Proceedings of the Ninth International Conference ML92 (1992), Morgan Kaufmann), 470-479
[54] Zhu, J.; Yang, Q., Remembering to add: Competence-preserving case-addition policies for case base maintenance, (Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-99) (1999), Morgan Kaufmann), 234-241
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.