×

Structure learning of Bayesian networks using global optimization with applications in data classification. (English) Zbl 1327.90355

Summary: Bayesian Networks are increasingly popular methods of modeling uncertainty in artificial intelligence and machine learning. A Bayesian Network consists of a directed acyclic graph in which each node represents a variable and each arc represents probabilistic dependency between two variables. Constructing a Bayesian Network from data is a learning process that consists of two steps: learning structure and learning parameter. Learning a network structure from data is the most difficult task in this process. This paper presents a new algorithm for constructing an optimal structure for Bayesian Networks based on optimization. The algorithm has two major parts. First, we define an optimization model to find the better network graphs. Then, we apply an optimization approach for removing possible cycles from the directed graphs obtained in the first part which is the first of its kind in the literature. The main advantage of the proposed method is that the maximal number of parents for variables is not fixed a priory and it is defined during the optimization procedure. It also considers all networks including cyclic ones and then choose a best structure by applying a global optimization method. To show the efficiency of the algorithm, several closely related algorithms including unrestricted dependency Bayesian Network algorithm, as well as, benchmarks algorithms SVM and C4.5 are employed for comparison. We apply these algorithms on data classification; data sets are taken from the UCI machine learning repository and the LIBSVM.

MSC:

90C35 Programming involving graphs or networks
90B15 Stochastic network models in operations research
90C26 Nonconvex programming, global optimization

Software:

UCI-ml; LIBSVM; C4.5
PDFBibTeX XMLCite
Full Text: DOI Link

References:

[1] Asuncion, A., Newman, D.: UCI machine learning repository. School of Information and Computer Science, University of California. http://www.ics.uci.edu/mlearn/MLRepository.html (2007)
[2] Bender, M.A., Fineman, J.T. Gilbert, S.: A new approach to incremental topological ordering. In: Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms Society for Industrial and Applied Mathematics. Philadelphia (2009) · Zbl 1423.68327
[3] Campos, L., Fernandez-Luna, M., Gamez, A., Puerta, M.: Ant colony optimization for learning Bayesian networks. Int. J. Approx. Reason. 31(3), 291-311 (2002) · Zbl 1033.68091
[4] Castillo, E., Gutierrez, J.M., Hadi, A.S.: Expert Systems and Probabilistic Network Models. Springer, New York (1997) · Zbl 0867.68099 · doi:10.1007/978-1-4612-2270-5
[5] Chang, C., Lin, C.: LIBSVM: a library for support vector machines, 2001a. http://www.csie.ntu.edu.tw/cjlin/libsvm (2001)
[6] Chickering, D.M.: Learning Bayesian networks is NP-complete. In: Fisher, D., Lenz, H.-J. (eds.) Learning from Data: Artificial Intelligence and Statistics, pp. 121-130. Springer-Verlag (1996) · Zbl 1222.68169
[7] Daly, R., Shen, Q.: Learning Bayesian network equivalence classes with ant colony optimization. J. Artif. Intell. Res. 35, 391-447 (2009) · Zbl 1192.68670
[8] Domingos, P., Pazzani, M.: On the optimality of the simple Bayesian classier under zero-one loss. Mach. Learn. 29, 103-130 (1997) · Zbl 0892.68076 · doi:10.1023/A:1007413511361
[9] Fayyad, U.M., Irani, K.B.: On the handling of continuous-valued attributes in decision tree generation. Mach. Learn. 8, 87-102 (1993) · Zbl 0767.68084
[10] Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian network classifiers. Mach. Learn. 29, 131-163 (1997) · Zbl 0892.68077 · doi:10.1023/A:1007465528199
[11] Haeupler, B., Kavitha, T., Mathew, R., Sen, S., Tarjan, R.E.: Incremental cycle detection, topological ordering, and strong component maintenance. In: 35th International Colloquium on Automata, Languages, and Programming (ICALP). Reykjavik, Iceland (2008) · Zbl 1295.05234
[12] Heckerman, D., Geiger, D., Chickering, D.M.: Learning Bayesian Networks: the combination of knowledge and statistical data. Mach. Learn. 20, 197-243 (1995) · Zbl 0831.68096
[13] Heckerman, D., Chickering, D., Meek, C.: Large-sample learning of Bayesian Networks is NP-hard. Mach. Learn. 5, 1287-1330 (2004) · Zbl 1222.68169
[14] Janzura, M., Nielsen, J.: A simulated annealing-based method for learning Bayesian Networks from statistical data. Int. J. Intell. Syst. 21, 335-348 (2006) · Zbl 1101.62015 · doi:10.1002/int.20138
[15] Jensen, F.: An Introduction to Bayesian Networks. Springer, New York (1996)
[16] Ji, Z., Zhong, H., Hu, R., Liu, C.: A Bayesian Network learning algorithm based on independence test and ant colony optimization. Acta Autom. Sin. 35(3), 281-288 (2009) · Zbl 1212.68129
[17] Kabli, R., Herrmann, F., McCall, J.: A chain-model genetic algorithm for Bayesian Network structure learning. In: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation. ACM, New York (2007) · Zbl 0892.68076
[18] Kolda, T. G., Lewis, R. M., Torczon, V.: Optimization by direct search: new perspectives on some classical and modern methods. SIAM Rev. 45(3), 385-482 (2003) · Zbl 1059.90146
[19] Kouhbor, S., Ugon, J., Rubinov, A., Kruger, A., Mammadov, M.: Coverage in WLAN with minimum number of access points. In: Vehicular Technology Conference, VTC Spring, pp. 1166-1170 (2006) · Zbl 1192.68670
[20] Langley, P., Iba, W., Thompson, K.: An analysis of Bayesian classiers. In: 10th International Conference Artificial Intelligence, pp. 223-228. AAAI Press (1992)
[21] Larranaga, P., Murga, R., Poza, M., Kuijpers, C.: Structure Learning of Bayesian Networks by Hybrid Genetic Algorithms. Preliminary papers 5th international workshop artificial intelligence and statistics, 310-316 (1995)
[22] Larranaga, P., Poza, M., Yurramendi, Y., Murga, H., Kuijpers, C.: Structure learning of Bayesian networks by genetic algorithms: a performance analysis of control parameters. In: IEEE Transactions on Pattern Analysis and Machine Intelligence archive (1996)
[23] Larranaga, P., Sierra, B., Gallego, J., Michelena, J., Picaza. M.: Learning Bayesian Networks by genetic algorithms: A case study in the prediction of survival in malignant skin melanoma. Artif. Intell. Med. 261-272 (1997)
[24] Mammadov, M.A., Rubinov, A.M., Sniedovich, M.: A new global optimization algorithm based on dynamical systems approach. In: 6th International Conference on Optimization: Techniques and Applications. Ballarat, Australia (2004)
[25] Mammadov, M.A., Rubinov, A.M., Yearwood, J.: Dynamical systems described by relational elasticities with applications to global optimization. In: Jeyakumar, V., Rubinov, A. (eds.) Continuous Optimisation: Current Trends and Modern Applications, pp. 365-387. Springer (2005) · Zbl 1129.90044
[26] Mammadov, M.A., Orsi, R.: H_infinity systhesis via a nonsmooth, nonconvex optimization approach. Pac. J. Optim. 1(2), 405-420 (2005) · Zbl 1105.90064
[27] Marinescu, R., Dechter, R.: AND/OR branch-and-bound search for combinatorial optimization in graphical models. Artif. Intell. 173, 1457-1491 (2009) · Zbl 1185.68648 · doi:10.1016/j.artint.2009.07.003
[28] Maroosi, A., Amiri, B.: A new clustering algorithm based on hybrid global optimization based on a dynamical systems approach algorithm. Expert Syst. Appl. 37(8), 5645-5652 (2010)
[29] Park, H., Cho, S.: An Effcient Attribute Ordering Optimization in Bayesian Networks for Prognostic Modeling of the Metabolic Syndrome, pp. 381-391. Springer, Berlin (2006)
[30] Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo (1988) · Zbl 0746.68089
[31] Richter, F., Fettweis, G.: Base Station Placement Based on Force Fields. IEEE VTC-Spring, Yokohama (2012) · doi:10.1109/VETECS.2012.6240328
[32] Robinson, R.W.: Counting Unlabeled Acyclic Diagraphs, pp. 28-43. Springer, New York (1997)
[33] Sahami, M.: Learning limited dependence Bayesian classiers. In: The 2nd International Conference. Knowledge Discovery and Data mining (KKD), pp. 335-338 (1996)
[34] Sahin, F., Yavuz, M.C., Arnavut, Z., Uluyol, O.: Fault diagnosis for airplane engines using Bayesian networks and distributed particle swarm optimization. Parallel Comput. 33(2), 124-143 (2007)
[35] Schleip, C., Rais, A., Menzel, A.: Bayesian analysis of temperature sensitivity of plant phenology in Germany. Agric. For. Meteorol. 149, 1699-1708 (2009) · doi:10.1016/j.agrformet.2009.05.014
[36] Sedgewick, R: Algorithms. Addison-Wesley Publishing Company. (1983)
[37] Shafer, G., Pearl, J.: Readings in Uncertain Reasoning. Morgan Kaufmann, San Mateo (1990) · Zbl 0805.68121
[38] Sun, W., Yuan, Y.X.: Optimization Theory and Methods, Nonlinear Programming. Springer Optimization and its Applications. vol. 1 (2006) · Zbl 1129.90002
[39] Taheri, S., Mammadov, M.: Solving systems of nonlinear equations using a globally convergent optimization algorithm. Glob. J. Technol. Optim. vol. 3, 132-138 (2012) · Zbl 1185.68648
[40] Taheri, S., Mammadov, M. Seifollahi, S.: Globally convergent optimization methods for unconstrained problems. Optimization: A journal of mathematical programming and operations research. pp. 124-143 (2012)
[41] Taheri, S., Mammadov, M.: Structure learning of Bayesian networks using a new unrestricted dependency algorithm. In: Second International Conference on Social Eco-Informatics. Venice, Italy (2012)
[42] Tilakaratne, C.D., Mammadov, M., Morris, S.A.: Modied neural network algorithms for predicting trading signals of stock market indices. J. Appl. Math. Decis. Sci. (2009) · Zbl 1175.91014
[43] Tucker, A.: Covering Circuits and Graph Coloring, Applied Combinatorics, 5th edn. John Wiley and sons, Hoboken (2006)
[44] Yatsko, A., Bagirov, A., Stranieri, A.: On the discretization of continuous features for classication. In: The Proceedings of Ninth Australasian Data Mining Conference (AusDM 2011). Ballarat, Australia (2011) · Zbl 0892.68077
[45] Zhao, J., Sun, J., Xu, W., Zhou, D.: Structure learning of Bayesian networks based on discrete binary quantum-behaved particle swarm optimization algorithm. In: Proceedings of the Fifth International Conference on Natural Computation. IEEE Computer Society, Washington (2009)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.