
Feature selection using rough set-based direct dependency calculation by avoiding the positive region. (English) Zbl 1423.68512

Summary: Feature selection is the process of selecting a subset of features from an entire dataset such that the selected subset can be used in place of the entire dataset, reducing further processing. Many approaches have been proposed for feature selection, and recently rough set-based approaches have become dominant. The majority of these approaches use attribute dependency as the criterion for determining feature subsets. However, this measure relies on the positive region to calculate dependency, which is computationally expensive and consequently degrades the performance of feature selection algorithms that use it. In this paper, we propose a new heuristic-based dependency calculation method. The proposed method comprises a set of two rules, called Direct Dependency Calculation (DDC), to calculate attribute dependency. Direct dependency counts the number of unique/non-unique classes directly from the attribute values: unique classes are accurate predictors of the decision class, while non-unique classes are not. Counting unique/non-unique classes in this way avoids the time-consuming calculation of the positive region, which improves the performance of the algorithms built on it. A two-dimensional grid is used as an intermediate data structure to calculate dependency. To justify the proposed method, we applied it within a number of feature selection algorithms on various publicly available datasets, using a comparison framework for analysis. Experimental results show the efficiency and effectiveness of the proposed method: execution time was reduced by \(63\%\) for dependency calculation with DDC, a \(65\%\) decrease was observed for feature selection algorithms based on DDC, and the required runtime memory was reduced by \(95\%\).
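The following is a minimal Python sketch of the idea described in the summary, not the paper's exact two-rule DDC: a hash map (standing in for the two-dimensional grid) keyed by condition-attribute value tuples records which decision classes each cell contains, objects in cells with a single decision class are counted as "unique", and the dependency is their fraction of the universe. This avoids explicitly constructing the positive region; all function and variable names below are illustrative assumptions.

    from collections import defaultdict

    def direct_dependency(records, condition_attrs, decision_attr):
        """Sketch: dependency of decision_attr on condition_attrs, computed
        directly from attribute values by counting 'unique' cells."""
        # Grid cell -> set of decision classes seen, and object count per cell.
        classes = defaultdict(set)
        counts = defaultdict(int)
        for row in records:
            key = tuple(row[a] for a in condition_attrs)
            classes[key].add(row[decision_attr])
            counts[key] += 1

        # Objects whose cell maps to exactly one decision class are accurate
        # predictors ('unique'); conflicting cells are 'non-unique'.
        unique_objects = sum(counts[k] for k, d in classes.items() if len(d) == 1)
        return unique_objects / len(records)

    # Toy decision table: the last two rows form a conflicting (non-unique) cell.
    data = [
        {"a": 1, "b": 0, "d": "yes"},
        {"a": 1, "b": 0, "d": "yes"},
        {"a": 0, "b": 1, "d": "no"},
        {"a": 0, "b": 1, "d": "yes"},
    ]
    print(direct_dependency(data, ["a", "b"], "d"))  # 0.5

On this toy table the result coincides with the classical rough set dependency \(|POS_C(D)|/|U|\), but it is obtained in a single pass over the data rather than by building equivalence classes and the positive region explicitly.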

MSC:

68T37 Reasoning under uncertainty in the context of artificial intelligence

Software:

UCI-ml
