Stratified exponential families: Graphical models and model selection. (English) Zbl 1012.62012
Summary: We describe a hierarchy of exponential families which is useful for distinguishing types of graphical models. Undirected graphical models with no hidden variables are linear exponential families (LEFs). Directed acyclic graphical (DAG) models and chain graphs with no hidden variables, including DAG models with several families of local distributions, are curved exponential families (CEFs). Graphical models with hidden variables are what we term stratified exponential families (SEFs). A SEF is a finite union of CEFs of various dimensions satisfying some regularity conditions.
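The three levels of the hierarchy can be written side by side. The block below is a sketch in standard textbook notation, not the paper's own definitions: the symbols $\eta$, $t$, $\psi$, $N$, $L$, $M$ and $M_i$ are assumptions introduced here, and the regularity conditions mentioned in the summary are not reproduced.

```latex
% Assumed notation: eta = natural parameter, t = sufficient statistic,
% psi = log-normalizer, N = natural parameter space.
\[
  p(x \mid \eta) = \exp\bigl(\langle \eta, t(x) \rangle - \psi(\eta)\bigr),
  \qquad \eta \in N \subseteq \mathbb{R}^k .
\]
\begin{align*}
  \text{LEF:}\quad & \eta \in N \cap L,
    && L \text{ an affine subspace of } \mathbb{R}^k, \\
  \text{CEF:}\quad & \eta \in M,
    && M \subseteq N \text{ a smooth manifold}, \\
  \text{SEF:}\quad & \eta \in \textstyle\bigcup_{i=1}^{m} M_i,
    && M_i \subseteq N \text{ smooth manifolds, possibly of different dimensions}.
\end{align*}
```

Read this way, every LEF is a CEF (an affine piece is a smooth manifold) and every CEF is a SEF with a single stratum, which is the nesting the summary describes.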
We also show that this hierarchy of exponential families is noncollapsing with respect to graphical models by providing a graphical model which is a CEF but not a LEF and a graphical model that is a SEF but not a CEF. Finally, we show how to compute the dimension of a stratified exponential family. These results are discussed in the context of model selection of graphical models.
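The summary does not say how the dimension is computed. One common approach for hidden-variable models is to take the rank of the Jacobian of the map from model parameters to the distribution over the observed variables. The sketch below applies that idea to a naive Bayes model with one hidden binary class and n observed binary features. The model, the evaluation point, and the function names `jacobian`, `exact_rank` and `model_dimension` are illustrative assumptions, not taken from the paper. The rank at a single point is only a lower bound on the generic rank, although the two agree at almost every point.

```python
from fractions import Fraction
from itertools import product


def jacobian(theta, a, b):
    """Jacobian of P(x) = (1-theta)*prod_i f(a_i, x_i) + theta*prod_i f(b_i, x_i),
    where f(p, 1) = p and f(p, 0) = 1 - p.
    Rows: observed states x in {0,1}^n. Columns: theta, a_1..a_n, b_1..b_n."""
    n = len(a)

    def prod_except(p, x, skip=None):
        # Product of f(p_i, x_i) over all i, optionally leaving out index `skip`.
        out = Fraction(1)
        for i, (pi, xi) in enumerate(zip(p, x)):
            if i != skip:
                out *= pi if xi == 1 else 1 - pi
        return out

    rows = []
    for x in product((0, 1), repeat=n):
        sign = [1 if xi == 1 else -1 for xi in x]  # d f(p, x_i) / d p
        row = [prod_except(b, x) - prod_except(a, x)]
        row += [(1 - theta) * sign[j] * prod_except(a, x, j) for j in range(n)]
        row += [theta * sign[j] * prod_except(b, x, j) for j in range(n)]
        rows.append(row)
    return rows


def exact_rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    m = [row[:] for row in matrix]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            if factor != 0:
                m[r] = [u - factor * v for u, v in zip(m[r], m[rank])]
        rank += 1
    return rank


def model_dimension(n):
    """Jacobian rank at one fixed rational point (assumed to be generic)."""
    theta = Fraction(3, 10)
    a = [Fraction(i + 1, 2 * n + 3) for i in range(n)]      # P(X_i=1 | H=0)
    b = [Fraction(n + i + 2, 2 * n + 3) for i in range(n)]  # P(X_i=1 | H=1)
    return exact_rank(jacobian(theta, a, b))


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, "features -> parameters:", 2 * n + 1,
              "dimension:", model_dimension(n))
```

With two features the model has five parameters but the rank is only 3, because the observed distribution lives in a three-dimensional simplex. Naive parameter counting can therefore overstate the dimension that enters a model-selection penalty. Exact rational arithmetic is used so that the rank decision does not depend on a floating-point tolerance.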

62E10 Characterization and structure theory of statistical distributions
62H05 Characterization and structure theory for multivariate probability distributions; copulas
62-09 Graphical methods in statistics (MSC2010)