Statistical inference based on ranks.

*(English)* Zbl 0592.62031
Wiley Series in Probability and Mathematical Statistics. Probability and Mathematical Statistics. New York etc.: John Wiley & Sons. XVII, 323 p. (1984).

The major goal of this book is to develop a coherent and unified set of statistical methods (based on ranks) for carrying out inferences in various experimental situations. The book begins with the simple one-sample location model and progresses through the two-sample location model, the one- and two-way layouts to the general linear model. A final chapter develops methods for the multivariate location models. In all cases, testing and estimation are developed together as an interconnected set of methods for each model.

The basic tools and results from mathematical statistics are introduced as they are needed. The tools fall into two groups: tools to assess the statistical properties of the procedures, and tools to assess the stability properties. In the former case, the major tools are asymptotic relative efficiency and asymptotic local power. In the latter case, the main tools are the influence curve and the tolerance (breakdown). The stability criteria are central to the modern theory of robust statistical methods. The statistical efficiency properties are described for all the methods introduced in the book. The robustness properties are developed extensively in the one-sample location model and discussed briefly for the simple regression model. The goal is to help the student develop a working knowledge of both efficiency and robustness.

The text is organized around statistical models because this is the context in which statistical inference and data analysis are carried out. By acquiring a firm understanding of the methods and their properties in the simple models, the reader will be prepared to deal with the methods in the general linear model. We provide a rigorous development of methods based on rank sums. These methods include the Wilcoxon signed rank statistic, the Mann-Whitney-Wilcoxon statistic, the Kruskal-Wallis statistic, the Friedman statistic, and rank tests based on residuals in the linear model. The more general sums of rank scores are discussed and integrated into the discussion with references to the sources of their rigorous development. We have concentrated on rank sums for two reasons: they are the most commonly used by researchers, and their properties can be explored with the least amount of mathematical sophistication.
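For readers who want to see what two of the rank sums named above compute, the following is a minimal sketch and is not taken from the book. The function names `midranks`, `signed_rank_statistic`, and `rank_sum_statistic` are illustrative assumptions. Averaging ranks over ties and dropping observations equal to the hypothesized center are common conventions assumed here, not necessarily the book's.

```python
def midranks(values):
    """Rank values 1..n, giving tied values the average of their ranks.

    The tie-handling rule (midranks) is an assumed convention.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end j of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_statistic(x, theta0=0.0):
    """Wilcoxon signed rank statistic: the sum of the ranks of |x_i - theta0|
    taken over the observations with x_i > theta0."""
    # Observations equal to theta0 are dropped (an assumed convention).
    d = [xi - theta0 for xi in x if xi != theta0]
    r = midranks([abs(di) for di in d])
    return sum(ri for di, ri in zip(d, r) if di > 0)


def rank_sum_statistic(x, y):
    """Mann-Whitney-Wilcoxon statistic in rank-sum form: the sum of the ranks
    of the y-sample within the combined x and y samples."""
    r = midranks(list(x) + list(y))
    return sum(r[len(x):])


sample = [1.5, -0.5, 2.0, 3.0, -1.0]
print(signed_rank_statistic(sample))                # 12.0
print(rank_sum_statistic([1, 3, 5], [2, 4, 6]))     # 12.0
```

In the first call the absolute values 1.5, 0.5, 2.0, 3.0, 1.0 receive ranks 3, 1, 4, 5, 2, and the positive observations carry ranks 3, 4, and 5, which sum to 12.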

The linear model, which includes multiple regression and analysis of variance designs, is not generally treated systematically in texts on nonparametric statistics, applied or theoretical. This is a serious omission since most data analysis is carried out in the context of the linear model. Part of the reason that serious researchers have not used nonparametric methods more extensively is the lack of their systematic development for the linear model. The present text provides a development of these methods. Furthermore, in the near future, there will be statistical software available to implement these methods. The Minitab statistical computing system, which already includes the major nonparametric methods for the simple designs, will include a rank regression command that will provide both rank tests and estimates. Hence, the procedures developed in the text will be fully operational and can be used by researchers for analysis in complex data sets.

The book contains many exercises and problems. Major results in exercises are explicitly presented. Thus, the equations are available to the reader who does not want to take the time to derive them. An appendix of important results (without proofs) from the main body of mathematical statistics is provided. All major procedures are illustrated on data sets.

The first three chapters cover the one- and two-sample location models. Finite sample and asymptotic distribution theory is developed. Tests, point estimates, and confidence intervals are derived. Their properties are explored through asymptotic efficiency, influence curves, and tolerance (breakdown). This material can be covered in a one-semester course at the first- or second-year graduate level. The prerequisites are an introductory course in mathematical statistics and a course in advanced calculus. The material in chapter 5 requires a deeper background in statistics. The reader should have prior knowledge of the linear model in matrix notation.
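The link between testing and estimation in the one-sample location model can be illustrated with a small sketch, which is offered here as an assumption-laden example and is not code from the book. The point estimate usually paired with the Wilcoxon signed rank test is the Hodges-Lehmann estimate, the median of the pairwise (Walsh) averages. When the data contain no ties or zeros, the signed rank statistic equals the number of positive Walsh averages. The function names below are illustrative.

```python
from statistics import median


def walsh_averages(x):
    """All pairwise averages (x_i + x_j) / 2 with i <= j, n(n+1)/2 in total."""
    n = len(x)
    return [(x[i] + x[j]) / 2 for i in range(n) for j in range(i, n)]


def hodges_lehmann(x):
    """Hodges-Lehmann location estimate: the median of the Walsh averages."""
    return median(walsh_averages(x))


sample = [1.5, -0.5, 2.0, 3.0, -1.0]
w = walsh_averages(sample)

print(len(w))                      # 15 Walsh averages for n = 5
print(sum(wi > 0 for wi in w))     # 12 positive averages
print(hodges_lehmann(sample))      # 1.0
```

For this sample the count of positive Walsh averages, 12, is the value of the signed rank statistic for testing a center of zero, which is one way to see how the test and the estimate come from the same statistic.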

##### MSC:

| Code | Description |
|---|---|
| 62Gxx | Nonparametric inference |
| 62G05 | Nonparametric estimation |
| 62-01 | Introductory exposition (textbooks, tutorial papers, etc.) pertaining to statistics |
| 62J05 | Linear regression; mixed models |
| 62G10 | Nonparametric hypothesis testing |
| 62H15 | Hypothesis testing in multivariate analysis |
| 62J99 | Linear inference, regression |
| 62G20 | Asymptotic properties of nonparametric inference |
| 62E20 | Asymptotic distribution theory in statistics |