Learning continuous time Bayesian network classifiers.

*(English)* Zbl 1433.68335

Summary: Streaming data are relevant to finance, computer science, and engineering, and are becoming increasingly important to medicine and biology. Continuous time Bayesian network classifiers are designed for analyzing multivariate streaming data in which the duration of events matters. Structural and parametric learning for the class of continuous time Bayesian network classifiers is considered in the case where complete data are available. A conditional log-likelihood score is developed for structural learning of continuous time Bayesian network classifiers. The performance of continuous time Bayesian network classifiers learned by combining conditional log-likelihood scoring with Bayesian parameter estimation is compared with that achieved by continuous time Bayesian network classifiers learned with marginal log-likelihood scoring, and with that achieved by dynamic Bayesian network classifiers. The classifiers are compared in terms of accuracy and computation time, based on numerical experiments with synthetic and real data. Results show that conditional log-likelihood scoring combined with Bayesian parameter estimation outperforms marginal log-likelihood scoring, and that conditional log-likelihood scoring becomes even more effective when the amount of available data is limited. Continuous time Bayesian network classifiers outperform dynamic Bayesian network classifiers in both computation time and accuracy on synthetic and real data sets.
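The classification scheme the summary describes can be illustrated with a minimal sketch. The snippet below is not the authors' implementation (their CTBNCToolkit is referenced in the paper); it is a simplified, single-node illustration under assumed names (`trajectory_loglik`, `classify`) and toy intensity matrices: a trajectory of states with dwell times is scored against class-conditional intensity matrices, and the class maximizing posterior log-probability is returned.

```python
import math

def trajectory_loglik(segments, Q):
    """Log-likelihood of a single node's trajectory under intensity matrix Q.

    segments: list of (state, dwell_time, next_state); next_state is None
    for the final (right-censored) segment.
    """
    ll = 0.0
    for state, dwell, nxt in segments:
        q = -Q[state][state]               # total exit rate of `state`
        ll += -q * dwell                   # exponential dwell-time term
        if nxt is not None:
            ll += math.log(Q[state][nxt])  # transition-rate term
    return ll

def classify(segments, class_prior, class_cims):
    """MAP classification: argmax_c  log P(c) + log P(trajectory | c)."""
    scores = {c: math.log(class_prior[c]) + trajectory_loglik(segments, class_cims[c])
              for c in class_prior}
    return max(scores, key=scores.get)

# Toy example (illustrative values): class "fast" switches state often,
# class "slow" rarely; the observed trajectory switches frequently.
cims = {
    "fast": [[-2.0, 2.0], [2.0, -2.0]],
    "slow": [[-0.1, 0.1], [0.1, -0.1]],
}
prior = {"fast": 0.5, "slow": 0.5}
traj = [(0, 0.3, 1), (1, 0.4, 0), (0, 0.2, None)]
print(classify(traj, prior, cims))  # → fast
```

Conditional log-likelihood scoring, as studied in the paper, replaces the joint trajectory likelihood used during structure search with the class-posterior term, which this per-class scoring makes easy to compute once the class-conditional likelihoods are available.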

Reviewer: Reviewer (Berlin)

##### MSC:

68T05 | Learning and adaptive systems in artificial intelligence |

62F15 | Bayesian inference |

62H30 | Classification and discrimination; cluster analysis (statistical aspects) |

##### Keywords:

streaming data; multivariate trajectory; continuous-time classification; continuous-time Bayesian networks

\textit{D. Codecasa} and \textit{F. Stella}, Int. J. Approx. Reasoning 55, No. 8, 1728--1746 (2014; Zbl 1433.68335)


##### References:

[1] | Barber, D.; Cemgil, A., Graphical models for time-series, IEEE Signal Process. Mag., 27, 18-28, (2010) |

[2] | Dacorogna, M., An introduction to high-frequency finance, (2001), AP |

[3] | Villa, S.; Stella, F., A continuous time Bayesian network classifier for intraday FX prediction, Quant. Finance, 1-14, (2014) · Zbl 1402.91818 |

[4] | Simma, A.; Jordan, M., Modeling events with cascades of Poisson processes, (Proc. of the 26th Conf. on UAI, (2010), AUAI), 546-555 |

[5] | Yilmaz, A.; Javed, O.; Shah, M., Object tracking: a survey, ACM Comput. Surv., 38, 1-45, (2006) |

[6] | Truccolo, W.; Eden, U.; Fellows, M.; Donoghue, J.; Brown, E., A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects, J. Neurophysiol., 93, 1074-1089, (2005) |

[7] | Acerbi, E.; Stella, F., Continuous time Bayesian networks for gene network reconstruction: a comparative study on time course data, (Proceedings of the 10th International Symposium on Bioinformatics Research and Applications, (2014)), in press |

[8] | Voit, E., A first course in systems biology, (2012), Garland Science NY |

[9] | Dean, T.; Kanazawa, K., A model for reasoning about persistence and causation, Comput. Intell., 5, 142-150, (1989) |

[10] | Rabiner, L., A tutorial on hidden Markov models and selected applications in speech recognition, Proc. IEEE, 77, 257-286, (1989) |

[11] | Gunawardana, A.; Meek, C.; Xu, P., A model for temporal dependencies in event streams, (Shawe-Taylor, J.; Zemel, R.; Bartlett, P.; Pereira, F.; Weinberger, K., Advances in Neural Information Processing Systems, (2011)), 1962-1970 |

[12] | Nodelman, U.; Koller, D.; Shelton, C., Expectation propagation for continuous time Bayesian networks, (Proc. of the 21st Conf. on UAI, Edinburgh, Scotland, UK, (2005)), 431-440 |

[13] | Simma, A.; Goldszmidt, M.; MacCormick, J.; Barham, P.; Black, R.; Isaacs, R.; Mortier, R., CT-nor: representing and reasoning about events in continuous time, (Proc. of the 24th Conf. on UAI, (2008), AUAI), 484-493 |

[14] | Rajaram, S.; Graepel, T.; Herbrich, R., Poisson-networks: a model for structured point processes, (Proc. of the 10th Int. Workshop on Artificial Intelligence and Statistics, (2005)), 277-284 |

[15] | Weiss, J. C.; Page, D., Forest-based point processes for event prediction from electronic health records, Mach. Learn. Knowl. Discov. Databases, 8190, 547-562, (2013) |

[16] | Zhong, S.; Langseth, H.; Nielsen, T. D., Bayesian networks for dynamic classification, (2012), Technical report |

[17] | Stella, F.; Amer, Y., Continuous time Bayesian network classifiers, J. Biomed. Inform., 45, 1108-1119, (2012) |

[18] | Kelner, R.; Lerner, B., Learning Bayesian network classifiers by risk minimization, Int. J. Approx. Reason., 53, 248-272, (2012) · Zbl 1242.68230 |

[19] | Friedman, N.; Geiger, D.; Goldszmidt, M., Bayesian network classifiers, Mach. Learn., 29, 131-163, (1997) · Zbl 0892.68077 |

[20] | Greiner, R.; Zhou, W., Structural extension to logistic regression: discriminative parameter learning of belief net classifiers, (Proc. of the National Conf. on AI, Menlo Park, CA, (2002)), 167-173 |

[21] | Grossman, D.; Domingos, P., Learning Bayesian network classifiers by maximizing conditional likelihood, (Proc. of the 21st Int. Conf. on Machine Learning, (2004), ACM), 361-368 |

[22] | Codecasa, D.; Stella, F., CTBNCToolkit: continuous time Bayesian network classifier toolkit, (2014)

[23] | Nodelman, U.; Shelton, C.; Koller, D., Continuous time Bayesian networks, (Proc. of the 18th Conf. on UAI, (2002), Morgan Kaufmann), 378-387 |

[24] | Xu, J.; Shelton, C., Continuous time Bayesian networks for host level network intrusion detection, Mach. Learn. Knowl. Discov. Databases, 613-627, (2008) |

[25] | Boudali, H.; Dugan, J., A continuous-time Bayesian network reliability modeling, and analysis framework, IEEE Trans. Reliab., 55, 86-97, (2006) |

[26] | Fan, Y.; Shelton, C., Learning continuous-time social network dynamics, (Proc. of the 25th Conf. on UAI, AUAI, (2009)), 161-168 |

[27] | Gatti, E.; Luciani, D.; Stella, F., A continuous time Bayesian network model for cardiogenic heart failure, Flex. Serv. Manuf. J., 1-20, (2011) |

[28] | Saria, S.; Nodelman, U.; Koller, D., Reasoning at the right time granularity, (UAI, (2007)), 326-334 |

[29] | El-Hay, T.; Cohn, I.; Friedman, N.; Kupferman, R., Continuous-time belief propagation, (Proceedings of the 27th International Conference on Machine Learning (ICML), (2010)), 343-350 |

[30] | Cohn, I.; El-Hay, T.; Friedman, N.; Kupferman, R., Mean field variational approximation for continuous-time Bayesian networks, J. Mach. Learn. Res., 11, 2745-2783, (2010) · Zbl 1242.68215 |

[31] | Fan, Y.; Shelton, C., Sampling for approximate inference in continuous time Bayesian networks, (10th Int. Symposium on Artificial Intelligence and Mathematics, (2008)) |

[32] | El-Hay, T.; Friedman, N.; Kupferman, R., Gibbs sampling in factorized continuous-time Markov processes, (McAllester, D. A.; Myllymäki, P., Proc. of the 24th Conf. on UAI, (2008), AUAI), 169-178

[33] | Rao, V.; Teh, Y. W., Fast MCMC sampling for Markov jump processes and continuous time Bayesian networks, (2012)

[34] | Nodelman, U.; Shelton, C.; Koller, D., Learning continuous time Bayesian networks, (Proc. of the 19th Conf. on UAI, (2003), Morgan Kaufmann), 451-458

[35] | Chickering, D. M.; Heckerman, D.; Meek, C., Large-sample learning of Bayesian networks is NP-hard, J. Mach. Learn. Res., 5, 1287-1330, (2004) · Zbl 1222.68169 |

[36] | Codecasa, D.; Stella, F., Conditional log-likelihood for continuous time Bayesian network classifiers, (International Workshop NFMCP Held at ECML-PKDD 2013, (2013)) |

[37] | Witten, I. H.; Frank, E., Data mining: practical machine learning tools and techniques, (2005), Morgan Kaufmann · Zbl 1076.68555 |

[38] | Murphy, K., The Bayes net toolbox for Matlab, Comput. Sci. Stat., 33, 1024-1034, (2001) |

[39] | Neapolitan, R. E., Learning Bayesian networks, (2004), Pearson Prentice Hall Upper Saddle River |

[40] | Perrier, E.; Imoto, S.; Miyano, S., Finding optimal Bayesian network given a super-structure, J. Mach. Learn. Res., 9, 2251-2286, (2008) · Zbl 1225.68206 |

[41] | Friedman, N.; Murphy, K.; Russell, S., Learning the structure of dynamic probabilistic networks, (Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, (1998), Morgan Kaufmann Publishers Inc.), 139-147 |

[42] | Liang, S.; Fuhrman, S.; Somogyi, R., REVEAL, a general reverse engineering algorithm for inference of genetic network architectures, (Pacific Symposium on Biocomputing, vol. 3, (1998)), 2

[43] | Tormene, P.; Giorgino, T.; Quaglini, S.; Stefanelli, M., Matching incomplete time series with dynamic time warping: an algorithm and an application to post-stroke rehabilitation, Artif. Intell. Med., 45, 11-34, (2009) |

[44] | Paolucci, S.; Antonucci, G.; Grasso, M. G.; Morelli, D.; Troisi, E.; Coiro, P.; Bragoni, M., Early versus delayed inpatient stroke rehabilitation: a matched comparison conducted in Italy, Arch. Phys. Med. Rehabil., 81, 695-700, (2000) |

[45] | Kwakkel, G.; van Peppen, R.; Wagenaar, R. C.; Dauphinee, S. W.; Richards, C.; Ashburn, A.; Miller, K.; Lincoln, N.; Partridge, C.; Wellwood, I., Effects of augmented exercise therapy time after stroke a meta-analysis, Stroke, 35, 2529-2539, (2004) |

[46] | Forster, A.; Young, J., The clinical and cost effectiveness of physiotherapy in the management of elderly people following a stroke, (2002), The Chartered Society of Physiotherapy London, UK |

[47] | Keogh, E.; Ratanamahatana, C. A., Exact indexing of dynamic time warping, Knowl. Inf. Syst., 7, 358-386, (2005) |

[48] | Koller, D.; Friedman, N., Probabilistic graphical models: principles and techniques, (2009), The MIT Press |
