×

PRISM revisited: declarative implementation of a probabilistic programming language using multi-prompt delimited control. (English) Zbl 1448.68202

Summary: PRISM is a probabilistic programming language based on Prolog, augmented with primitives to represent probabilistic choice. It is implemented using a combination of low level support from a modified version of B-Prolog, source level program transformation, and libraries for inference and learning implemented in C. More recently, developers working with functional programming languages have taken the approach of embedding probabilistic primitives into an existing language, with little or no modification to the host language, often by using delimited continuations. Captured continuations represent pieces of the probabilistic program which can be manipulated to achieve a great variety of computational effects useful for inference. In this paper, I will describe an approach based on delimited control operators recently introduced into SWI Prolog. These are used to create a system of nested effect handlers which together implement a core functionality of PRISM – the building of explanation graphs – entirely in Prolog and using an order of magnitude less code. Other declarative programming tools, such as constraint logic programming, are used to implement tools for inference, such as the inside-outside and EM algorithms, lazy best-first explanation search, and MCMC samplers. By embedding the functionality of PRISM into SWI Prolog, users gain access to its rich libraries and development environment. By expressing the functionality of PRISM in a small amount of pure, high-level Prolog, this implementation facilitates further experimentation with the mechanisms of probabilistic logic programming, including new probabilistic modelling features and inference algorithms, such as variational inference in models with real-valued variables.

MSC:

68N15 Theory of programming languages
68N17 Logic programming
68T05 Learning and adaptive systems in artificial intelligence
PDF BibTeX XML Cite
Full Text: DOI

References:

[1] Abdallah, S., Automatic differentiation using Constraint Handling Rules in Prolog, (2017), arXiv preprint
[2] Abdallah, S., Memoisation: purely, left-recursively, and with (continuation passing) style, (2017), arXiv preprint
[3] Abdallah, S., More declarative tabling in Prolog using multi-prompt delimited control, (2017), arXiv preprint
[4] Baker, J. K., Trainable grammars for speech recognition, J. Acoust. Soc. Am., 65, S1, (1979), S132-S132
[5] Barndorff-Nielsen, O.; Jupp, P., Approximating exponential models, Ann. Inst. Stat. Math., 41, 2, 247-267, (1989) · Zbl 0721.62003
[6] Beal, M. J.; Ghahramani, Z., The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures, Bayesian Stat., 7, 453-464, (2003)
[7] Chen, W.; Warren, D. S., Towards effective evaluation of general logic programs, (Proceedings of the 12th PODS, (1993), Citeseer)
[8] Danvy, O.; Filinski, A., Abstracting control, (Proceedings of the 1990 ACM Conference on LISP and Functional Programming, (1990), ACM), 151-160
[9] Darwiche, A., A differential approach to inference in Bayesian networks, (Uncertainty in Artificial Intelligence, UAI 2000, (2000)), 123-132
[10] De Raedt, L.; Kimmig, A.; Toivonen, H., Problog: a probabilistic Prolog and its application in link discovery, (IJCAI, vol. 7, (2007)), 2462-2467
[11] Desouter, B.; Van Dooren, M.; Schrijvers, T., Tabling as a library with delimited control, Theory Pract. Log. Program., 15, 4-5, 419-433, (2015) · Zbl 1379.68054
[12] Eisner, J.; Goldlust, E.; Smith, N. A., Dyna: a declarative language for implementing dynamic programs, (Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, ACL 2004, (2004)), 32, Companion Volume
[13] Erwig, M.; Kollmansberger, S., Functional pearls: probabilistic functional programming in Haskell, J. Funct. Program., 16, 01, 21-34, (2006) · Zbl 1091.68023
[14] Felleisen, M., The theory and practice of first-class prompts, (Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of Programming Languages, (1988), ACM), 180-190
[15] Filinski, A., Representing layered monads, (Proceedings of the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, (1999), ACM), 175-188
[16] Frühwirth, T., Theory and practice of constraint handling rules, J. Funct. Logic Program., 37, 1, 95-138, (1998) · Zbl 0920.68029
[17] Ghahramani, Z.; Beal, M. J., Propagation algorithms for variational Bayesian learning, (Advances in Neural Information Processing Systems, (2001)), 507-513
[18] Gibbons, J., Calculating functional programs, (Algebraic and Coalgebraic Methods in the Mathematics of Program Construction, (2002), Springer), 151-203, includes fork and join for products and coproducts (sums)
[19] Goodman, J., Semiring parsing, Comput. Linguist., 25, 4, 573-605, (1999)
[20] Goodman, N.; Mansinghka, V.; Roy, D. M.; Bonawitz, K.; Tenenbaum, J., Church: a language for generative models, (Uncertainty in Artificial Intelligence, (2008))
[21] Gutmann, B.; Thon, I.; Kimmig, A.; Bruynooghe, M.; De Raedt, L., The magic of logical inference in probabilistic programming, Theory Pract. Log. Program., 11, 4-5, 663-680, (2011) · Zbl 1222.68060
[22] Hensman, J.; Rattray, M.; Lawrence, N. D., Fast variational inference in the conjugate exponential family, (Advances in Neural Information Processing Systems, (2012)), 2888-2896
[23] Hoffman, M. D.; Blei, D. M.; Wang, C.; Paisley, J., Stochastic variational inference, J. Mach. Learn. Res., 14, 1, 1303-1347, (2013) · Zbl 1317.68163
[24] Huang, L.; Chiang, D., Better k-best parsing, (Proceedings of the Ninth International Workshop on Parsing Technology. Association for Computational Linguistics, (2005)), 53-64, combines the k-best parses of each factor in an explanation to get the k-best parses of an explanation, then combines the k-best parses of alternative explanations to get the k-best parses of a goal
[25] Islam, M. A.; Ramakrishnan, C.; Ramakrishnan, I., Parameter learning in prism programs with continuous random variables, (2012), arXiv preprint
[26] Johnson, M., Memoization in top-down parsing, Comput. Linguist., 21, 3, 405-417, (1995)
[27] Kimmig, A.; Van den Broeck, G.; De Raedt, L., An algebraic Prolog for reasoning about possible worlds, (Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, (2011), AAAI Press), 209-214
[28] Kiselyov, O.; Shan, C. Chieh, Embedded probabilistic programming, (Abstract of Poster NIPS2008 Workshop on Probabilistic Programming: Universal Languages, Systems and Applications, (2008))
[29] Kiselyov, O.; Sabry, A.; Swords, C., Extensible effects: an alternative to monad transformers, (ACM SIGPLAN Notices, vol. 48, (2013), ACM), 59-70
[30] Klein, D.; Manning, C. D., Parsing and hypergraphs, (New Developments in Parsing Technology, (2004), Springer), 351-372
[31] Lauritzen, S. L., Graphical Models, (1996), Oxford University Press: Oxford University Press New York · Zbl 0907.62001
[32] Meijer, E.; Fokkinga, M.; Paterson, R., Functional programming with bananas, lenses, envelopes and barbed wire, (Functional Programming Languages and Computer Architecture, (1991), Springer), 124-144
[33] Muggleton, S., Stochastic logic programs, (Advances in Inductive Logic Programming, vol. 32, (1996)), 254-264
[34] Ng, R.; Subrahmanian, V. S., Probabilistic logic programming, Inf. Comput., 101, 2, 150-201, (1992) · Zbl 0781.68038
[35] Pearlmutter, B. A.; Siskind, J. M., Reverse-mode AD in a functional framework: lambda the ultimate backpropagator, ACM Trans. Program. Lang. Syst. (TOPLAS), 30, 2, 7, (2008)
[36] Pereira, F. C.; Warren, D. H., Definite clause grammars for language analysis—a survey of the formalism and a comparison with augmented transition networks, Artif. Intell., 13, 3, 231-278, (1980) · Zbl 0442.68088
[37] Pfeffer, A., IBAL: a probabilistic rational programming language, (IJCAI, (2001), Citeseer), 733-740
[38] Plotkin, G. D.; Pretnar, M., Handling algebraic effects, Log. Methods Comput. Sci., 9, (2013) · Zbl 1314.68191
[39] Poole, D., Representing Bayesian networks within probabilistic Horn abduction, (Proceedings of the Seventh Conference on Uncertainty in Artificial Intelligence, (1991), Morgan Kaufmann Publishers Inc.), 271-278
[40] Poole, D., Probabilistic Horn abduction and Bayesian networks, Artif. Intell., 64, 1, 81-129, (1993) · Zbl 0792.68176
[41] Porter, H. H., Earley Deduction, (1986), Oregon Graduate Center, Tech. rep.
[42] Rabiner, L. R., A tutorial on hidden Markov models and selection applications in speech recognition, Proc. IEEE, 77, 2, 257-286, (1989)
[43] Radul, A., Report on the probabilistic language scheme, (Proceedings of the 2007 Symposium on Dynamic Languages, (2007), ACM), 2-10
[44] Raiko, T.; Valpola, H.; Harva, M.; Karhunen, J., Building blocks for variational Bayesian learning of latent variable models, J. Mach. Learn. Res., 155-201, (Jan 2007)
[45] Ranganath, R.; Gerrish, S.; Blei, D., Black box variational inference, (Kaski, S.; Corander, J., Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, vol. 33, (22-25 April 2014), PMLR: PMLR Reykjavik, Iceland), 814-822
[46] Riguzzi, F.; Swift, T., Tabling and answer subsumption for reasoning on logic programs with annotated disjunctions, (LIPIcs-Leibniz International Proceedings in Informatics, vol. 7, (2010), Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik) · Zbl 1237.68049
[47] Saleh, A. H.; Schrijvers, T., Efficient algebraic effect handlers for Prolog, Theory Pract. Log. Program., 16, 5-6, 884-898, (2016) · Zbl 1379.68080
[48] Sato, M.-A., Online model selection based on the variational Bayes, Neural Comput., 13, 7, 1649-1681, (2001) · Zbl 1013.62087
[49] Sato, T., A statistical learning method for logic programs with distribution semantics, (Proceedings of the 12th International Conference on Logic Programming, ICLP’95, Tokyo, (1995)), 715-729
[50] Sato, T.; Kameya, Y., PRISM: a language for symbolic-statistical modeling, (Proc. 15th Intl. Joint Conf. on Artificial Intelligence, vol. 2, IJCAI, (1997)), 1330-1335, includes explanation search and EM learning, but explanation search is not tabled. The result of explanation search is more like Poole’s PHA - each explanation is a list of assignments to random variables, each of which has a unique identifier
[51] Sato, T.; Kameya, Y., Parameter learning of logic programs for symbolic-statistical modeling, J. Artif. Intell. Res., 15, 391-454, (2001), first journal paper describing tabling for efficient explanation search · Zbl 0994.68025
[52] Sato, T.; Kameya, Y.; Kurihara, K., Variational Bayes via propositionalized probability computation in PRISM, Ann. Math. Artif. Intell., 54, 1-3, 135-158, (2008) · Zbl 1178.68591
[53] Sato, T.; Kubota, K., Viterbi training in PRISM, (ICML Workshop on Statistical Relational Learning, SRL-2012, (2012)) · Zbl 1379.68272
[54] Schrijvers, T.; Costa, V. S.; Wielemaker, J.; Demoen, B., Towards typed Prolog, (International Conference on Logic Programming, (2008), Springer), 693-697 · Zbl 1185.68181
[55] Schrijvers, T.; Demoen, B.; Desouter, B.; Wielemaker, J., Delimited continuations for Prolog, Theory Pract. Log. Program., 13, 4-5, 533-546, (2013) · Zbl 1312.68037
[56] Stuhlmüller, A.; Goodman, N. D., A dynamic programming algorithm for inference in recursive probabilistic programs, (2012), arXiv preprint
[57] Sussman, G. J.; Steele, G. L., Scheme: A Interpreter for Extended Lambda Calculus, (1975), AI Memo 349
[58] Swift, T., Tabling for non-monotonic programming, Ann. Math. Artif. Intell., 25, 3-4, 201-240, (1999) · Zbl 0940.68025
[59] Tamaki, H.; Sato, T., OLD resolution with tabulation, (Third International Conference on Logic Programming, (1986), Springer), 84-98 · Zbl 0607.68072
[60] Tolpin, D.; van de Meent, J.-W.; Wood, F., Probabilistic programming in Anglican, (Joint European Conference on Machine Learning and Knowledge Discovery in Databases, (2015), Springer), 308-311
[61] Tolpin, D.; van de Meent, J. W.; Yang, H.; Wood, F., Design and implementation of probabilistic programming language Anglican, (2016), arXiv preprint
[62] Ueda, N.; Nakano, R., Deterministic annealing EM algorithm, Neural Netw., 11, 2, 271-282, (1998)
[63] Wadler, P., Comprehending monads, Math. Struct. Comput. Sci., 2, 04, 461-493, (1992) · Zbl 0798.68040
[64] D.H. Warren, 1975, Earley deduction. Unpublished note.
[65] Wingate, D.; Goodman, N.; Stuhlmueller, A.; Siskind, J. M., Nonstandard interpretations of probabilistic programs for efficient inference, (Advances in Neural Information Processing Systems, (2011)), 1152-1160
[66] Wingate, D.; Weber, T., Automated variational inference in probabilistic programming, (2013), arXiv preprint
[67] Winn, J.; Bishop, C. M., Variational message passing, J. Mach. Learn. Res., 6, 661-694, (Apr 2005)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.