
Consistency-based search in feature selection. (English) Zbl 1082.68791

Summary: Feature selection is an effective technique for dimensionality reduction. For classification, it is used to find an “optimal” subset of relevant features such that the overall accuracy of classification is increased while the data size is reduced and comprehensibility is improved. Feature selection methods involve two important aspects: evaluation of a candidate feature subset and search through the feature space. Existing algorithms adopt various measures to evaluate the goodness of feature subsets. This work focuses on the inconsistency measure, according to which a feature subset is inconsistent if there exist at least two instances with the same feature values but different class labels. We compare the inconsistency measure with other measures and study different search strategies, such as exhaustive, complete, heuristic and random search, that can be applied to this measure. We conduct an empirical study to examine the pros and cons of these search methods, give some guidelines on choosing a search method, and compare the classifier error rates before and after feature selection.
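The paper's exact algorithmic details are not reproduced in this summary; the following is a minimal sketch of an inconsistency-rate computation of the kind described above. Instances are grouped by their values on the candidate subset, and every instance beyond the majority class within a group counts as inconsistent (the function name and data layout here are illustrative assumptions, not taken from the paper):

```python
from collections import Counter, defaultdict

def inconsistency_rate(instances, labels, subset):
    """Fraction of instances that share the same values on `subset`
    but carry a minority class label. A rate of 0 means the subset
    is fully consistent with the class labels."""
    groups = defaultdict(Counter)
    for row, y in zip(instances, labels):
        key = tuple(row[i] for i in subset)
        groups[key][y] += 1
    # Within each matching group, every instance beyond the
    # majority class is inconsistent.
    inconsistent = sum(sum(c.values()) - max(c.values())
                       for c in groups.values())
    return inconsistent / len(instances)

# Toy data: rows 0 and 2 agree on feature 0 but disagree in class,
# so feature {0} alone is inconsistent while {0, 1} is consistent.
X = [(1, 0), (0, 1), (1, 1)]
y = ["a", "b", "b"]
inconsistency_rate(X, y, [0])     # 1/3: one minority instance
inconsistency_rate(X, y, [0, 1])  # 0.0: full feature set
```

A search strategy (exhaustive, heuristic, or random) would then evaluate candidate subsets with this measure, preferring the smallest subset whose rate does not exceed that of the full feature set.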

MSC:

68T10 Pattern recognition, speech recognition
68T20 Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.)

Software:

C4.5; SNNS
