
Attribute bagging: Improving accuracy of classifier ensembles by using random feature subsets. (English) Zbl 1033.68092

Summary: We present Attribute Bagging (AB), a technique for improving the accuracy and stability of classifier ensembles induced using random subsets of features. AB is a wrapper method that can be used with any learning algorithm. It establishes an appropriate attribute subset size and then randomly selects subsets of features, creating projections of the training set on which the ensemble classifiers are built. The induced classifiers are then combined by voting. This article compares the performance of our AB method with bagging and other algorithms on a hand-pose recognition dataset. AB is shown to give consistently better results than bagging, in both accuracy and stability. We also test and discuss the performance of ensemble voting in bagging and in AB as a function of the attribute subset size and the number of voters, for both weighted and unweighted voting. Finally, we demonstrate that ranking the attribute subsets by their classification accuracy and voting using only the best subsets further improves ensemble performance.
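The procedure outlined in the summary can be sketched compactly. The Python fragment below is a minimal illustration, assuming a fixed attribute subset size, a decision-tree base learner from scikit-learn, and training-set accuracy as the subset-ranking criterion; the function names, the number of subsets, and the scoring choice are assumptions made for illustration, not the authors' implementation.

```python
# Minimal sketch of Attribute Bagging (AB). Hypothetical helper names; the paper's
# wrapper step for choosing the subset size and its exact ranking/voting details
# are not reproduced here.
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier


def attribute_bagging_fit(X, y, subset_size, n_subsets=25, base=None, rng=None):
    """Train one classifier per random feature subset; return (models, subsets, scores)."""
    rng = np.random.default_rng(rng)
    base = base if base is not None else DecisionTreeClassifier()
    n_features = X.shape[1]
    models, subsets, scores = [], [], []
    for _ in range(n_subsets):
        # Project the training set onto a random subset of `subset_size` attributes.
        cols = rng.choice(n_features, size=subset_size, replace=False)
        clf = clone(base).fit(X[:, cols], y)
        # Rank subsets by classification accuracy (here measured on the training
        # set for brevity; a held-out validation set would be the more careful choice).
        scores.append(np.mean(clf.predict(X[:, cols]) == y))
        models.append(clf)
        subsets.append(cols)
    return models, subsets, np.asarray(scores)


def attribute_bagging_predict(X, models, subsets, scores, top_k=None, weighted=False):
    """Combine member predictions by (weighted) majority vote over the top-ranked subsets."""
    keep = np.argsort(scores)[::-1][: top_k if top_k is not None else len(models)]
    tallies = [dict() for _ in range(X.shape[0])]  # per-sample vote counts
    for i in keep:
        weight = scores[i] if weighted else 1.0
        for row, label in enumerate(models[i].predict(X[:, subsets[i]])):
            tallies[row][label] = tallies[row].get(label, 0.0) + weight
    return np.array([max(t, key=t.get) for t in tallies])
```

In this sketch, the attribute subset size would be chosen beforehand (the paper's wrapper step), and `top_k` controls voting with only the best-ranked subsets, which the summary reports as further improving ensemble performance.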

MSC:

68T10 Pattern recognition, speech recognition

Software:

C4.5; DistAl
