The statistical analysis of categorical data.

*(English)* Zbl 0724.62004
Berlin etc.: Springer-Verlag. ix, 523 p. DM 178.00 (1990).

This book of more than five hundred pages is an important one, as it offers a broad panorama of current knowledge on categorical data. It is written as a textbook and requires an ordinary level of mathematics and a classical background in methods of statistical inference. Nevertheless, it contains a presentation of recent studies on categorical data and will therefore be a reference book for the coming years. Moreover, it is very rich in examples, almost all of them drawn from the Danish Welfare Study and treated exhaustively. All the calculations for these examples, and for any data the reader may wish to analyze, can be performed on desk calculators or with a standard package such as SAS, BMDP, SPSS or GENSTAT. The more recent methods require the specialized package CATANA, developed for personal computers by the author and S. V. Andersen.

After two chapters reviewing the classical tools of statistical inference, the author turns to models for categorical data, such as log-linear models, and their relations with exponential families, multinomial distributions, logistic models and generalized linear models. The following four chapters are devoted to contingency tables: two-way, three-way and multi-way tables, with special attention to incomplete tables. In each case, interactions are carefully investigated and their interpretation is discussed. The different ways to represent or study interactions are discussed at greater length: log-linear parameterization, graphical models, decomposable models, association graphs. Each time, an example illustrates the discussion, and many exercises on real data are offered for personal training and reflection. The problem of detecting departures from the model is studied for all types of contingency tables.

The last five chapters are devoted to more or less recent developments on categorical data: logit models, logistic regression models, models for the interactions (symmetry models, marginal homogeneity, models for mobility tables, association models, row-column association models, log-linear association models), correspondence analysis and latent structure analysis. For each of these subjects, the author presents the complete theory, a few examples and many comments on their uses. Many of these models have been studied in Denmark, and the author took an active part in their development. Correspondence analysis was developed largely in France.

As said at the beginning, this reviewer thinks that the book will be an essential reference in the coming years for all statisticians who have to analyze categorical data or prepare lectures on the subject.

Reviewer: B. Van Cutsem (Grenoble)

##### MSC:

62-02 | Research exposition (monographs, survey articles) pertaining to statistics |

62H17 | Contingency tables |

62J99 | Linear inference, regression |