Graphical models in applied multivariate statistics.

*(English)* Zbl 0732.62056
Wiley Series in Probability and Mathematical Statistics. Chichester etc.: John Wiley & Sons Ltd. xiv, 448 p. £39.95; $ 73.50 (1990).

After an introduction sketching some ideas that are developed in the course of the text, the book really begins with chapter 2, where independence and conditional independence are discussed. Three main results are presented: the factorization lemma, which allows a simple check of independence; the reduction criterion, which is useful for inferences about marginal distributions; and a more technical one.

Chapter 3 starts with a short summary of graph-theoretic background material. The independence graph of a set of random variables is then introduced: the pairwise conditional independences among the variables determine the edges of the graph. The Markov property of the set of random variables also follows from this graph; it plays a central role and is studied in detail.
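To make the role of the graph concrete, here is a minimal sketch of the separation reading of the Markov property, assuming its usual form: if a set of vertices separates two variables in the independence graph, those variables are conditionally independent given that set. The four-variable chain and the function name `separates` are illustrative assumptions, not taken from the book.

```python
from collections import deque

def separates(graph, a, b, separator):
    """Return True if every path from a to b passes through `separator`."""
    blocked = set(separator)
    seen = {a}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return False  # reached b without entering the separator
        for neighbour in graph[node]:
            if neighbour not in seen and neighbour not in blocked:
                seen.add(neighbour)
                queue.append(neighbour)
    return True

# Hypothetical independence graph on four variables: a chain X1 - X2 - X3 - X4.
graph = {
    "X1": {"X2"},
    "X2": {"X1", "X3"},
    "X3": {"X2", "X4"},
    "X4": {"X3"},
}

# X2 blocks the only path from X1 to X4, so under the Markov property
# X1 and X4 would be conditionally independent given X2.
x1_x4_given_x2 = separates(graph, "X1", "X4", {"X2"})
```

A breadth-first search that refuses to enter the separating set is enough here, because separation in an undirected graph is purely a reachability question.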

The Kullback-Leibler information divergence is used as the unifying criterion for estimating parameters and assessing the goodness of fit of graphical models to empirical data. It is considered in chapter 4; the next chapter deals with the inverse variance, the last part of the machinery necessary for an adequate treatment of the main theme.
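As a rough illustration of how the divergence can serve as a goodness-of-fit measure, the sketch below computes the Kullback-Leibler divergence between the observed proportions of a 2x2 table and the proportions fitted under independence. The counts are invented for illustration, and the function name `kl_divergence` is an assumption rather than the book's notation. Multiplying the divergence by twice the sample size gives the familiar likelihood-ratio (deviance) statistic.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence sum(p * log(p / q)), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical 2x2 table of counts (illustrative data only).
counts = [[30, 10], [10, 30]]
n = sum(sum(row) for row in counts)

# Observed cell proportions, flattened row by row.
observed = [c / n for row in counts for c in row]

# Fitted proportions under independence: products of the margins,
# flattened in the same row-by-row order.
row_marg = [sum(row) / n for row in counts]
col_marg = [sum(counts[i][j] for i in range(2)) / n for j in range(2)]
fitted = [r * c for r in row_marg for c in col_marg]

divergence = kl_divergence(observed, fitted)
deviance = 2 * n * divergence  # likelihood-ratio statistic for independence
```

The divergence is zero only when the fitted proportions reproduce the observed ones exactly, so larger values indicate a poorer fit of the independence model.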

Strictly speaking, the discussion of graphical models begins with chapter 6, which deals with Gaussian models. Such a model is defined by a multivariate Gaussian distribution with some restrictions on the covariance matrix. Since the likelihood function can be represented as an information divergence, the standard machinery of maximum likelihood estimation can be used to fit a graphical model to data.
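The link between the inverse variance and the graph can be sketched as follows, using the standard fact that for a Gaussian distribution a zero off-diagonal entry of the inverse covariance matrix corresponds to conditional independence of that pair given the remaining variables. The covariance matrix below is a made-up example and the helper `invert` is a hypothetical name; exact fractions are used so that the zero appears without rounding error.

```python
from fractions import Fraction

def invert(matrix):
    """Invert a small nonsingular matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the corresponding row of the identity matrix.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap a row with a nonzero pivot into place (assumes one exists).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical covariance matrix of (X1, X2, X3): every pair is
# marginally correlated, since no entry is zero.
sigma = [[Fraction(3, 4), Fraction(1, 2), Fraction(1, 4)],
         [Fraction(1, 2), Fraction(1), Fraction(1, 2)],
         [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]]

precision = invert(sigma)  # the inverse variance

# Keep an edge only where the inverse variance is nonzero.
edges = [(i + 1, j + 1) for i in range(3) for j in range(i + 1, 3)
         if precision[i][j] != 0]
```

In this example the (1, 3) entry of the inverse is zero although the covariance between X1 and X3 is not, so the resulting graph is the chain X1 - X2 - X3.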

Graphical log-linear models are investigated next. The set of all graphical models for categorical data is embedded within the set of all hierarchical log-linear models. The likelihood function is maximized by routine differentiation as well as by using properties of the divergence. The central question of model selection is pursued in chapter 8. As with other approaches, this is not an easy task; it is more complex than the model-building process in multiple regression. Some hints on suitable strategies for contingency tables are given. But the reader is left with the statement that closes the paragraph on graphical model search strategies: “Even if the outcome does not lead to a simple model the procedure may suggest further models that are not members of the original lattice”.

The following chapters are devoted to special problems: 9. methods for sparse tables; 10. regression and graphical chain models; 11. models for mixed variables; 12. decomposition and decomposability.

Pointers to existing software are given in an appendix.

Reviewer: R. Schlittgen (Hamburg)

##### MSC:

62H17 | Contingency tables |

62H99 | Multivariate analysis |

62-01 | Introductory exposition (textbooks, tutorial papers, etc.) pertaining to statistics |

05C90 | Applications of graph theory |