×

Optimization on the complementation procedure towards efficient implementation of the index generation function. (English) Zbl 1451.68272

Summary: In the era of big data, solutions are desired that would be capable of efficient data reduction. This paper presents a summary of research on an algorithm for complementation of a Boolean function which is fundamental for logic synthesis and data mining. Successively, the existing problems and their proposed solutions are examined, including the analysis of current implementations of the algorithm. Then, methods to speed up the computation process and efficient parallel implementation of the algorithm are shown; they include optimization of data representation, recursive decomposition, merging, and removal of redundant data. Besides the discussion of computational complexity, the paper compares the processing times of the proposed solution with those for the well-known analysis and data mining systems. Although the presented idea is focused on searching for all possible solutions, it can be restricted to finding just those of the smallest size. Both approaches are of great application potential, including proving mathematical theorems, logic synthesis, especially index generation functions, or data processing and mining such as feature selection, data discretization, rule generation, etc. The problem considered is NP-hard, and it is easy to point to examples that are not solvable within the expected amount of time. However, the solution allows the barrier of computations to be moved one step further. For example, the unique algorithm can calculate, as the only one at the moment, all minimal sets of features for few standard benchmarks. Unlike many existing methods, the algorithm additionally works with undetermined values. The result of this research is an easily extendable experimental software that is the fastest among the tested solutions and the data mining systems.

MSC:

68T30 Knowledge representation
68P15 Database theory
68T05 Learning and adaptive systems in artificial intelligence

Software:

D-SCIDS; ROSE; RSBR_; UCI-ml
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Abraham, A., Jain, R., Thomas, J. and Han, S.Y. (2007). D-SCIDS: Distributed soft computing intrusion detection system, Journal of Network and Computer Applications 30(1): 81-98, DOI: 10.1016/j.jnca.2005.06.001.;
[2] Bache, K. and Lichman, M. (2013). UCI Machine Learning Repository, University of California, Irvine, CA, http://archive.ics.uci.edu/ml;
[3] Bazan, J.G., Szczuka, M.S. and Wróblewski, J. (2002). A new version of Rough Set Exploration System, in J.J. Alpigini et al. (Eds.), Rough Sets and Current Trends in Computing, Lecture Notes in Computer Science, Vol. 2475, Springer, Berlin/Heidelberg, pp. 397-404, DOI: 10.1007/3-540-45813-1_52.; · Zbl 1013.68643
[4] Błaszczyński, J., Greco, S., Matarazzo, B., Słowiński, R. and Szeląg,M. (2013). jMAF-Dominance-based rough set data analysis framework, in A. Skowron and Z. Suraj (Eds.), Rough Sets and Intelligent Systems-Professor Zdzisław Pawlak in Memoriam, Springer, Berlin/Heidelberg, pp. 185-209, DOI: 10.1007/978-3-642-30344-9_5.;
[5] Borowik, G. (2013). Boolean function complementation based algorithm for data discretization, in R. Moreno-Díaz et al. (Eds.), Computer Aided Systems Theory, Lecture Notes in Computer Science, Vol. 8112, Springer, Berlin/Heidelberg, pp. 218-225, DOI: 10.1007/978-3-642-53862-9_28.;
[6] Borowik, G. and Kowalski, K. (2015). Rule induction based on frequencies of attribute values, Proceedings of SPIE: Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 9662: 96623R, DOI: 10.1117/12.2205899.;
[7] Borowik, G., Kowalski, K. and Jankowski, C. (2015a). Novel approach to data discretization, Proceedings of SPIE: Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 9662: 96623U, DOI: 10.1117/12.2205916.;
[8] Borowik, G., Kraśniewski, A. and Łuba, T. (2015b). Rule induction based on logic synthesis methods, in H. Selvaraj et al. (Eds.), Progress in Systems Engineering, Advances in Intelligent Systems and Computing, Vol. 330, Springer International Publishing, Cham, pp. 813-816, DOI: 10.1007/978-3-319-08422-0_118.;
[9] Borowik, G. and Łuba, T. (2014). Fast algorithm of attribute reduction based on the complementation of Boolean function, in R. Klempous et al. (Eds.), Advanced Methods and Applications in Computational Intelligence, Topics in Intelligent Engineering and Informatics, Springer International Publishing, Cham, pp. 25-41, DOI: 10.1007/978-3-319-01436-4_2.;
[10] Brayton, R.K., Hachtel, G.D., McMullen, C.T. and Sangiovanni-Vincentelli, A. (1984). Logic Minimization Algorithms for VLSI Synthesis, Kluwer Academic Publishers, Dordrecht, DOI: 10.1007/978-1-4613-2821-6.; · Zbl 0565.94020
[11] Brzozowski, J.A. and Łuba, T. (1997). Decomposition of Boolean functions specified by cubes, Research report CS- 97-01, University of Waterloo, Waterloo.; · Zbl 1049.06011
[12] Jankowski, C., Reda, D., Mańkowski, M. and Borowik, G. (2015). Discretization of data using Boolean transformations and information theory based evaluation criteria, Bulletin of the Polish Academy of Sciences: Technical Sciences 63(4): 923-932, DOI: 10.1515/bpasts-2015-0105.;
[13] Komorowski, J., Pawlak, Z., Polkowski, L. and Skowron, A. (1999). Rough sets: A tutorial, https://eecs.ceas.uc.edu/ mazlack/dbm.w2011/Komorowski.RoughSets.tutor.pdf.;
[14] Korzen, M. and Jaroszewicz, S. (2005). Finding reducts without building the discernibility matrix, Proceedings of the 5th International Conference on Intelligent Systems Design and Applications, ISDA’05, Wrocław, Poland, pp. 450-455, DOI: 10.1109/ISDA.2005.45.;
[15] Liu, G., Li, L., Yang, J., Feng, Y. and Zhu, K. (2015). Attribute reduction approaches for general relation decision systems, Pattern Recognition Letters 65: 81-87, DOI: 10.1016/j.patrec.2015.06.031.;
[16] Liu, H. and Setiono, R. (1997). Feature selection via discretization, IEEE Transactions on Knowledge and Data Engineering 9(4): 642-645, DOI: 10.1109/69.617056.;
[17] Łuba, T., Borowik, G., Kraśniewski, A., Rutkowski, P. and Ługowska, I. (2014). Application of logic synthesis algorithms for data mining in medical databases, 9th International Seminar Statistics and Clinical Practice, Warsaw, Poland, pp. 36-39.;
[18] Łuba, T. and Rybnik, J. (1992). Intelligent decision support: Handbook of applications and advances of the rough sets theory, in S.Y. Huang (Ed.), Rough Sets and Some Aspects of Logic Synthesis, Springer Netherlands, Dordrecht, pp. 181-199, DOI: 10.1007/978-94-015-7975-9_13.;
[19] Martinović, G., Bajer, D. and Zorić, B. (2014). A differential evolution approach to dimensionality reduction for classification needs, International Journal of Applied Mathematics and Computer Science 24(1): 111-122, DOI: 10.2478/amcs-2014-0009.; · Zbl 1292.68137
[20] Min, F., Hu, Q. and Zhu, W. (2014). Feature selection with test cost constraint, International Journal of Approximate Reasoning 55(1Pt2): 167-179, DOI: 10.1016/j.ijar.2013.04.003.; · Zbl 1316.68117
[21] Nguyen, H.S. (2006). Approximate Boolean reasoning: Foundations and applications in data mining, in J.F. Peters and A. Skowron (Eds.), Transactions on Rough Sets V, Lecture Notes in Computer Science, Vol. 4100, Springer, Berlin/Heidelberg, pp. 334-506, DOI: 10.1007/11847465_16.; · Zbl 1136.68497
[22] Petrick, S.R. (1956). A direct determination of the irredundant forms of a Boolean function from the set of prime implicants, Technical report AFCRC-TR-56-110, Air Force Cambridge Research Center, Cambridge, MA.;
[23] Predki, B., Słowiński, R., Stefanowski, J., Susmaga, R. and Wilk, S. (1998). ROSE-software implementation of the rough set theory, in L. Polkowski and A.;
[24] Skowron (Eds.), Rough Sets and Current Trends in Computing, Springer, Berlin/Heidelberg, pp. 605-608, DOI: 10.1007/3-540-69115-4_85.;
[25] Predki, B. andWilk, S. (1999). Rough set based data exploration using ROSE system, in Z.W. Raś and A. Skowron (Eds.), Foundations of Intelligent Systems: 11th International Symposium, Springer, Berlin/Heidelberg, pp. 172-180, DOI: 10.1007/BFb0095102.;
[26] Sasao, T. (2011). Index generation functions: Recent developments, 41st IEEE International Symposium on Multiple-Valued Logic, Tuusula, Finland, DOI: 10.1109/ISMVL.2011.17.;
[27] Sasao, T. (2015). Index generation functions, EPFL Workshop on Logic Synthesis & Verification, Lausanne, Switzerland.;
[28] Skowron, A. and Rauszer, C. (1992). The discernibility matrices and functions in information systems, in R. Słowiński (Ed.), Intelligent Decision Support-Handbook of Applications and Advances of the Rough Sets Theory, Kluwer Academic Publishers, Dordrecht, pp. 331-362.;
[29] Stefanowski, J., Krawiec, K. andWrembel, R. (2017). Exploring complex and big data, International Journal of Applied Mathematics and Computer Science 27(4): 669-679, DOI: 10.1515/amcs-2017-0046.; · Zbl 1396.68104
[30] Steinbach, B. and Posthoff, C. (2012). Improvements of the construction of exact minimal covers of Boolean functions, in R. Moreno-Díaz et al. (Eds.), Computer Aided Systems Theory-EUROCAST 2011, Lecture Notes in Computer Science, Vol. 6928, Springer, Berlin/Heidelberg, pp. 272-279, DOI: 10.1007/978-3-642-27579-1_35.;
[31] Steinbach, B. and Posthoff, C. (2013). Fast calculation of exact minimal unate coverings on both the CPU and the GPU, in R. Moreno-Díaz et al. (Eds.), Computer Aided Systems Theory, Springer, Berlin/Heidelberg, pp. 234-241, DOI: 10.1007/978-3-642-53862-9_30.;
[32] Su, M.-Y., Yu, G.-J. and Lin, C.-Y. (2009). A real-time network intrusion detection system for large-scale attacks based on an incremental mining approach, Computers & Security 28(5): 301-309, DOI: 10.1016/j.cose.2008.12.001.;
[33] Sun, L., Xu, J. and Li, Y. (2014). A feature selection approach of inconsistent decision systems in rough set, Journal of Computers 9(6): 1333-1340, DOI: 10.4304/jcp.9.6.1333-1340.;
[34] Szemenyei, M. and Vajda, F. (2017). Dimension reduction for objects composed of vector sets, International Journal of Applied Mathematics and Computer Science 27(1): 169-180, DOI: 10.1515/amcs-2017-0012.; · Zbl 1368.68286
[35] Zhong, N. and Skowron, A. (2001). A rough set-based knowledge discovery process, International Journal of Applied Mathematics and Computer Science 11(3): 603-619.; · Zbl 0990.68139
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.