
On the relative expressiveness of Bayesian and neural networks. (English) Zbl 1468.68097

Summary: A neural network computes a function. A central property of neural networks is that they are “universal approximators”: for a given continuous function, there exists a neural network that can approximate it arbitrarily well, given enough neurons (and some additional assumptions). In contrast, a Bayesian network is a model, but each of its queries can be viewed as computing a function. In this paper, we identify some key distinctions between the functions computed by neural networks and those computed by marginal Bayesian network queries, showing that the former are more expressive than the latter. Moreover, we propose a simple augmentation to Bayesian networks (a testing operator), which enables their marginal queries to become “universal approximators.”
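
The summary's point that each marginal query "can be viewed as computing a function" can be made concrete with a small sketch. The following Python snippet is not from the paper: the chain network A -> B -> C and its CPT numbers are made up for illustration. It evaluates the marginal query P(C = 1 | evidence) by brute-force enumeration, as a function of the weight of virtual (soft) evidence on A; the resulting function is a ratio of multilinear expressions in the evidence weights, which hints at the restricted functional form that the paper contrasts with neural networks.

# Hypothetical illustration: a marginal Bayesian-network query viewed as a function.
# Chain network A -> B -> C over binary variables; CPT numbers are made up.

P_A1 = 0.3                  # prior P(A = 1)

def p_b1_given_a(a):        # P(B = 1 | A = a)
    return 0.9 if a == 1 else 0.2

def p_c1_given_b(b):        # P(C = 1 | B = b)
    return 0.8 if b == 1 else 0.1

def query(lam_a1, lam_a0=None):
    """P(C = 1 | virtual evidence on A with likelihood weights lam_a1, lam_a0).

    Hard evidence A = 1 corresponds to weights (1, 0), A = 0 to (0, 1),
    and (1, 1) recovers the prior marginal P(C = 1).
    """
    if lam_a0 is None:
        lam_a0 = 1.0 - lam_a1
    num = den = 0.0
    for a, lam, pa in ((1, lam_a1, P_A1), (0, lam_a0, 1.0 - P_A1)):
        for b in (0, 1):
            pb = p_b1_given_a(a) if b == 1 else 1.0 - p_b1_given_a(a)
            for c in (0, 1):
                pc = p_c1_given_b(b) if c == 1 else 1.0 - p_c1_given_b(b)
                w = lam * pa * pb * pc     # weighted term of the joint distribution
                den += w
                if c == 1:
                    num += w
    return num / den        # a ratio of multilinear functions of the evidence weights

if __name__ == "__main__":
    for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"evidence weight on A=1: {lam:.2f} -> P(C=1 | evidence) = {query(lam):.3f}")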

MSC:

68Q06 Networks and circuits as models of computation; circuit complexity
62H22 Probabilistic graphical models
68T05 Learning and adaptive systems in artificial intelligence

Software:

darch
Full Text: DOI arXiv

References:

[1] Bengio, Y.; Lamblin, P.; Popovici, D.; Larochelle, H., Greedy layer-wise training of deep networks, (Advances in Neural Information Processing Systems 19, NIPS (2006)), 153-160
[2] Castillo, E. F.; Gutiérrez, J. M.; Hadi, A. S., Goal oriented symbolic propagation in Bayesian networks, (Proceedings of the Thirteenth National Conference on Artificial Intelligence, AAAI (1996)), 1263-1268
[3] Chan, H.; Darwiche, A., On the revision of probabilistic beliefs using uncertain evidence, Artif. Intell., 163, 67-90 (2005) · Zbl 1132.68767
[4] Chavira, M.; Darwiche, A., Compiling Bayesian networks using variable elimination, (Proceedings of the 20th International Joint Conference on Artificial Intelligence, IJCAI (2007)), 2443-2449
[5] Choi, A.; Darwiche, A., On relaxing determinism in arithmetic circuits, (Proceedings of the Thirty-Fourth International Conference on Machine Learning, ICML (2017)), 825-833
[6] Choi, A.; Darwiche, A., On the relative expressiveness of Bayesian and neural networks, (Proceedings of the 9th International Conference on Probabilistic Graphical Models, PGM (2018))
[7] Cybenko, G., Approximation by superpositions of a sigmoidal function, Math. Control Signals Syst., 2, 303-314 (1989) · Zbl 0679.94019
[8] Darwiche, A., A differential approach to inference in Bayesian networks, (Proceedings of the 16th Conference in Uncertainty in Artificial Intelligence, UAI (2000)), 123-132
[9] Darwiche, A., A differential approach to inference in Bayesian networks, J. ACM, 50, 280-305 (2003) · Zbl 1325.68226
[10] Darwiche, A., Modeling and Reasoning with Bayesian Networks (2009), Cambridge University Press · Zbl 1231.68003
[11] Darwiche, A., Human-level intelligence or animal-like abilities?, Commun. ACM, 61, 56-67 (2018)
[12] Dechter, R., Bucket elimination: a unifying framework for probabilistic inference, (Proceedings of the Twelfth Annual Conference on Uncertainty in Artificial Intelligence, UAI (1996)), 211-219
[13] Goodfellow, I.; Bengio, Y.; Courville, A., Deep Learning (2016), MIT Press · Zbl 1373.68009
[14] Halpern, J., An analysis of first-order logics of probability, Artif. Intell., 46, 311-350 (1990) · Zbl 0723.03007
[15] Hinton, G. E.; Osindero, S.; Teh, Y. W., A fast learning algorithm for deep belief nets, Neural Comput., 18, 1527-1554 (2006) · Zbl 1106.68094
[16] Hornik, K.; Stinchcombe, M. B.; White, H., Multilayer feedforward networks are universal approximators, Neural Netw., 2, 359-366 (1989) · Zbl 1383.92015
[17] Jensen, F. V., Gradient descent training of Bayesian networks, (Proceedings of the European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty, ECSQARU (1999)), 190-200
[18] Jones, L. K., Constructive approximations for neural networks by sigmoidal functions, Proc. IEEE, 78, 1586-1589 (1990)
[19] Kisa, D.; Van den Broeck, G.; Choi, A.; Darwiche, A., Probabilistic sentential decision diagrams, (Proceedings of the 14th International Conference on Principles of Knowledge Representation and Reasoning, KR (2014))
[20] Kjærulff, U.; van der Gaag, L. C., Making sensitivity analysis computationally efficient, (Proceedings of the 16th Conference in Uncertainty in Artificial Intelligence, UAI (2000)), 317-325
[21] Lapedes, A. S.; Farber, R. M., How neural nets work, (Neural Information Processing Systems, NIPS (1987)), 442-456
[22] Leshno, M.; Lin, V. Y.; Pinkus, A.; Schocken, S., Multilayer feedforward networks with a nonpolynomial activation function can approximate any function, Neural Netw., 6, 861-867 (1993)
[23] McCarthy, J., Programs with common sense, (Proceedings of the Teddington Conference on the Mechanization of Thought Processes (1959))
[24] McCulloch, W. S.; Pitts, W., A logical calculus of the ideas immanent in nervous activity, Bull. Math. Biophys., 5, 115-133 (1943) · Zbl 0063.03860
[25] Montúfar, G. F.; Pascanu, R.; Cho, K.; Bengio, Y., On the number of linear regions of deep neural networks, (Advances in Neural Information Processing Systems 27, NIPS (2014)), 2924-2932
[26] Neal, R. M., Connectionist learning of belief networks, Artif. Intell., 56, 71-113 (1992) · Zbl 0761.68081
[27] Nilsson, N., Probabilistic logic, Artif. Intell., 28, 71-87 (1986) · Zbl 0589.03007
[28] Pascanu, R.; Montúfar, G.; Bengio, Y., On the Number of Inference Regions of Deep Feed Forward Networks With Piece-Wise Linear Activations (2014)
[29] Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (1988), Morgan Kaufmann
[30] Poole, D., First-order probabilistic inference, (IJCAI (2003)), 985-991
[31] Raghu, M.; Poole, B.; Kleinberg, J. M.; Ganguli, S.; Sohl-Dickstein, J., On the expressive power of deep neural networks, (ICML (2017))
[32] Ranzato, M.; Poultney, C. S.; Chopra, S.; LeCun, Y., Efficient learning of sparse representations with an energy-based model, (Advances in Neural Information Processing Systems 19, NIPS (2006)), 1137-1144
[33] Rosenblatt, F., The perceptron: a probabilistic model for information storage and organization in the brain, Psychol. Rev., 65, 386-408 (1958)
[34] Saul, L. K.; Jaakkola, T. S.; Jordan, M. I., Mean field theory for sigmoid belief networks, J. Artif. Intell. Res., 4, 61-76 (1996) · Zbl 0900.68379
[35] Serra, T.; Tjandraatmadja, C.; Ramalingam, S., Bounding and counting linear regions of deep neural networks, (ICML (2018))
[36] Shen, Y.; Choi, A.; Darwiche, A., Conditional PSDDs: modeling and learning with modular knowledge, (Proceedings of the 32nd AAAI Conference on Artificial Intelligence, AAAI (2018))
[37] Shen, Y.; Huang, H.; Choi, A.; Darwiche, A., Conditional independence in testing Bayesian networks, (Proceedings of the Thirty-Sixth International Conference on Machine Learning, ICML (2019)), 5701-5709
[38] Vomlel, J., Noisy-or classifier, Int. J. Intell. Syst., 21, 381-398 (2006) · Zbl 1160.68584
[39] Zhang, N. L.; Poole, D., Exploiting causal independence in Bayesian network inference, J. Artif. Intell. Res., 5, 301-328 (1996) · Zbl 0900.68384
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases, these data have been complemented or enhanced with data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect match.