Model selection.

*(English)* Zbl 0665.62003
Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. New York etc.: John Wiley & Sons, Inc. XIV, 301 p.; $ 34.95 (1986).

Modern techniques of parametric model selection are described in relation to a wide variety of statistical problems. The objective is to choose from an appropriate approximating class a finite parameterization that best fits the data, according to some criterion. A great deal of research has been carried out on this topic in recent years, yet this appears to be the first wide-ranging, book-length account.

The volume begins with a discussion of two notions of a statistical model: the “operating model”, based on prior knowledge of the problem under investigation, and an “approximating family” of models, which need not include the operating model and may involve a smaller number of parameters, possibly geared to the amount of data available. Strategies for selecting an approximating family are discussed, and the idea of the “discrepancy” between two models is introduced. Some forms of discrepancy are described, and the role of the bootstrap and cross-validation in investigating discrepancy is discussed. These tools are developed in relation to a wide variety of statistical techniques: histograms, univariate probability modelling, simple linear regression with one or more replicates, multiple regression, the analysis of variance, the analysis of covariance, the analysis of proportions for binomial data, contingency tables, and time series analysis in the time and frequency domains.

The book is accessible to a fairly wide statistical audience: it is written in a readable style, and the relatively difficult technical material on asymptotic properties is relegated to an appendix. An attractive feature of the book is the presence of many illustrative examples, using a variety of data sets.

##### MSC:

62-01 | Introductory exposition (textbooks, tutorial papers, etc.) pertaining to statistics |

62J05 | Linear regression; mixed models |

62J10 | Analysis of variance and covariance (ANOVA) |

62J99 | Linear inference, regression |