Categorical data analysis.
2nd ed.

*(English)* Zbl 1018.62002
Wiley Series in Probability and Mathematical Statistics. Applied Probability and Statistics. Chichester: Wiley. xvi, 710 p. (2002).

[For the review of the first edition from 1990 see Zbl 0716.62001]

Categorical data (CD) are observations from a random experiment whose response falls into one of several qualitatively or quantitatively distinct categories. One says that such observations are measured on a scale; the scales on which CD are measured are the nominal and the ordinal scale. Categorical data analysis (CDA) is as old as statistics and, one can say, even as old as probability theory. Bernoulli’s scheme with the responses “success” and “failure” is an eloquent example of this. Accordingly, the book begins in Chapter 1 with the distributions that occur in CDA, such as the binomial and the Poisson. Data analysis is understood here in a broad sense: not only the description of data but mainly inferential statistics.

A fundamental tool of CDA is the contingency table, which displays relationships between categorical variables. Contingency tables are described in Chapter 2; the associated estimators and tests follow in Chapter 3. The categorical counterparts of the linear models of statistics are certain generalized linear models. The loglinear models, the most important of these, and some generalizations are introduced in Chapter 4. CDA for binary data leads to the exceedingly successful logistic regression, treated in Chapters 5 and 6. Multicategory responses lead to logit models, which are the subject of Chapter 7. In Chapter 8 the author turns to loglinear models for contingency tables, and comparisons with logit models are made in Chapter 9. Dependent samples, i.e., matched pairs, are the subject of Chapter 10, and more elemental matched sets are that of Chapter 11. Mixed models, which contain fixed as well as random effects, are considered in Chapters 12 and 13. The final three chapters deal with asymptotics, nonstandard models, and a summary of the historical development of the discipline.

There are also two appendices: the first covers the use of software for CDA, with emphasis on SAS and with pointers to internet resources; the second is a small table of \(\chi^2\) quantiles.

The book is, simply put, grand. It exhaustively covers what is investigated in CDA. It is written in a highly scientific but vivid style, intelligible to all researchers in the field. In numerous places there are pointers to applications of CDA in the natural, medical, and social sciences. Besides numerous references to the literature and an author and a subject index, there is a longer special index of the examples treated. Every chapter ends with pointers to original papers and with advanced problems, both applied and on theory and methods. The present second edition is significantly enlarged relative to the first. The reviewer warmly recommends the book as a compendium for relevant research.

Reviewer: Peter Neumann (Dresden)

##### MSC:

62-07 | Data analysis (statistics) (MSC2010) |

62-02 | Research exposition (monographs, survey articles) pertaining to statistics |

62J12 | Generalized linear models (logistic models) |

62H17 | Contingency tables |

62P10 | Applications of statistics to biology and medical sciences; meta analysis |

62P25 | Applications of statistics to social sciences |