×

Tree cumulants and the geometry of binary tree models. (English) Zbl 1235.62004

Summary: We investigate undirected discrete graphical tree models when all the variables in the system are binary, where leaves represent the observable variables and where all the inner nodes are unobserved. A novel approach based on the theory of partially ordered sets allows us to obtain a convenient parametrization of this model class. The construction of the proposed coordinate system mirrors the combinatorial definition of cumulants. A simple product-like form of the resulting parametrization gives insight into identifiability issues associated with this model class. In particular, we provide necessary and sufficient conditions for such a model to be identified up to the switching of labels of the inner nodes. When these conditions hold, we give explicit formulas for the parameters of the model. Whenever the model fails to be identified, we use the new parametrization to describe the geometry of the unidentified parameter space. We illustrate these results using a simple example.

MSC:

62A99 Foundational topics in statistics
62H99 Multivariate analysis
05C05 Trees
06A06 Partial orders, general
60J99 Markov processes
05C90 Applications of graph theory

References:

[1] Allman, E.S., Matias, C. and Rhodes, J.A. (2009). Identifiability of parameters in latent structure models with many observed variables. Ann. Statist. 37 3099-3132. · Zbl 1191.62003 · doi:10.1214/09-AOS689
[2] Auvray, V., Geurts, P. and Wehenkel, L. (2006). A semi-algebraic description of discrete naive Bayes models with two hidden classes. In Proc. Ninth International Symposium on Artificial Intelligence and Mathematics , Fort Lauderdale , Florida . Available at .
[3] Balakrishnan, N., Johnson, N.L. and Kotz, S. (1998). A note on relationships between moments, central moments and cumulants from multivariate distributions. Statist. Probab. Lett. 39 49-54. · Zbl 0903.62049 · doi:10.1016/S0167-7152(98)00027-3
[4] Chang, J.T. (1996). Full reconstruction of Markov models on evolutionary trees: Identifiability and consistency. Math. Biosci. 137 51-73. · Zbl 1059.92504 · doi:10.1016/S0025-5564(96)00075-2
[5] Feller, W. (1971). An Introduction to Probability Theory and Its Applications. Vol. II , 2nd ed. New York: Wiley. · Zbl 0219.60003
[6] Geiger, D., Heckerman, D., King, H. and Meek, C. (2001). Stratified exponential families: Graphical models and model selection. Ann. Statist. 29 505-529. · Zbl 1012.62012 · doi:10.1214/aos/1009210550
[7] Lauritzen, S.L. (1996). Graphical Models. Oxford Statistical Science Series 17 . Oxford: Clarendon Press. · Zbl 0907.62001
[8] McCullagh, P. (1987). Tensor Methods in Statistics . London: Chapman & Hall. · Zbl 0732.62003
[9] Moulton, V. and Steel, M. (2004). Peeling phylogenetic ‘oranges’. Adv. in Appl. Math. 33 710-727. · Zbl 1067.92045 · doi:10.1016/j.aam.2004.03.003
[10] Pearl, J. and Tarsi, M. (1986). Structuring causal trees. J. Complexity 2 60-77. · Zbl 0589.68060 · doi:10.1016/0885-064X(86)90023-3
[11] Rota, G.C. (1964). On the foundations of combinatorial theory. I. Theory of Möbius functions. Probab. Theory Related Fields 2 340-368. · Zbl 0121.02406 · doi:10.1007/BF00531932
[12] Rota, G.C. and Shen, J. (2000). On the combinatorics of cumulants. J. Combin. Theory Ser. A 91 283-304. · Zbl 0978.05006 · doi:10.1006/jcta.1999.3017
[13] Rusakov, D. and Geiger, D. (2005). Asymptotic model selection for naive Bayesian networks. J. Mach. Learn. Res. 6 1-35 (electronic). · Zbl 1222.68294
[14] Semple, C. and Steel, M. (2003). Phylogenetics. Oxford Lecture Series in Mathematics and Its Applications 24 . Oxford: Oxford Univ. Press. · Zbl 1043.92026
[15] Settimi, R. and Smith, J.Q. (1998). On the geometry of Bayesian graphical models with hidden variables. In UAI (G.F. Cooper and M. Serafín, eds.) 472-479. San Francisco: Morgan Kaufmann.
[16] Settimi, R. and Smith, J.Q. (2000). Geometry, moments and conditional independence trees with hidden variables. Ann. Statist. 28 1179-1205. · Zbl 1105.62321 · doi:10.1214/aos/1015956712
[17] Speed, T.P. (1983). Cumulants and partition lattices. Austral. J. Statist. 25 378-388. · Zbl 0538.60023 · doi:10.1111/j.1467-842X.1983.tb00391.x
[18] Speicher, R. (1994). Multiplicative functions on the lattice of noncrossing partitions and free convolution. Math. Ann. 298 611-628. · Zbl 0791.06010 · doi:10.1007/BF01459754
[19] Spiegelhalter, D.J., Dawid, A.P., Lauritzen, S.L. and Cowell, R.G. (1993). Bayesian analysis in expert systems. Statist. Sci. 8 219-283. With comments and a rejoinder by the authors. · Zbl 0955.62523 · doi:10.1214/ss/1177010888
[20] Stanley, R.P. (2002). Enumerative Combinatorics. Volume I. Cambridge Studies in Advanced Mathematics 49 . Cambridge: Cambridge Univ. Press.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.