##
**Robust nonparametric statistical methods.**
*(English)*
Zbl 0887.62056

Kendall’s Library of Statistics. 5. London: Arnold. New York, NY: Wiley. xiv, 467 p. (1998).

The authors of this textbook are well known for their many excellent contributions to robust and nonparametric inference. To a certain extent the book is based on research papers by the authors that appeared between 1978 and 1997. Although about thirty real data sets are presented and discussed, the book is certainly not an “applied” one; it requires a high level of mathematical knowledge. It is written for graduate students of mathematics and for researchers, not for practitioners.

The book covers a wide range of statistical procedures for estimation and testing in location models, regression models (including experimental designs) and multivariate models. The tests and estimates are generally derived from a geometric point of view: the Euclidean norm is replaced by a weighted \(L_1\) norm whose weights are chosen as functions of ranks. This approach yields rank-based methods that depend on the choice of weights. From the standpoint of modern robustness theory, and in contrast to least-squares estimates, these rank-based methods have bounded influence functions and positive breakdown points.
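
To make the idea of a rank-weighted \(L_1\) norm concrete, here is a minimal sketch. The review does not state the formula, so the form used below, \(\|v\| = \sum_i a(R(v_i))\, v_i\) with Wilcoxon scores \(a(i) = \sqrt{12}\,(i/(n+1) - 1/2)\), is an assumption taken from the general rank-based literature, and the function names are hypothetical. Ties are ignored for simplicity.

```python
import numpy as np

def wilcoxon_scores(n):
    # Assumed score function: a(i) = sqrt(12) * (i/(n+1) - 1/2), i = 1..n.
    # The scores are increasing in i and sum to zero.
    i = np.arange(1, n + 1)
    return np.sqrt(12.0) * (i / (n + 1) - 0.5)

def rank_pseudo_norm(v):
    # Weighted L1-type functional: each component is weighted by the
    # score of its rank instead of by its own size (as squaring would do).
    v = np.asarray(v, dtype=float)
    ranks = np.argsort(np.argsort(v))  # 0-based ranks; assumes no ties
    return float(np.sum(wilcoxon_scores(v.size)[ranks] * v))

# Illustrative data (made up): four moderate values and one gross outlier.
v = np.array([0.3, -1.2, 2.5, 0.9, 40.0])
print(rank_pseudo_norm(v))
print(rank_pseudo_norm(v + 7.0))  # unchanged by a constant shift
```

Because the scores sum to zero, adding a constant to every component leaves the value unchanged, so the functional is only a pseudo-norm. An outlier enters linearly with a bounded weight rather than quadratically, which is the intuition behind the robustness claims.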

The book consists of six chapters and an appendix with basic results for asymptotic theory. Chapter 1 deals with norm-based estimates, confidence intervals and tests such as the Wilcoxon signed rank test in the one-sample location model. Robustness properties described by the breakdown value and influence function are investigated. Asymptotic theory and efficiency results are presented, too. Chapter 2 is concerned with the two-sample problem for comparing two populations which differ in location or scale, including the Behrens-Fisher problem. As in Chapter 1, robustness properties and efficiency results of some estimates and tests are presented, e.g., for the Hodges-Lehmann estimator and the tests of Mann-Whitney-Wilcoxon, Mathisen, Mood and Savage. Here the procedures are based on special types of a so-called pseudo-norm (p. 71). Lehmann alternatives as a special class of lifetime models are also considered.
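
As an illustration of the kind of two-sample estimate mentioned here, the following sketch computes the Hodges-Lehmann shift estimate in its usual textbook form, the median of all pairwise differences \(y_j - x_i\). This is not taken from the book itself; the function name and the data are hypothetical.

```python
import numpy as np

def hodges_lehmann_shift(x, y):
    # Median of all pairwise differences y_j - x_i, the estimate
    # associated with the Mann-Whitney-Wilcoxon statistic.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.median(np.subtract.outer(y, x)))

# Made-up samples; the second contains one gross outlier.
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 60.0]
hl = hodges_lehmann_shift(x, y)
mean_diff = float(np.mean(y) - np.mean(x))
print(hl, mean_diff)
```

In this toy example the difference in means is dragged far away by the single outlier, while the Hodges-Lehmann estimate stays near the shift suggested by the remaining observations.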

In Chapter 3, the theory of rank-based analysis of a general linear model is discussed. The analysis comprises estimation, testing and diagnostic tools for checking the adequacy of the model fit and for detecting outliers. General rank scores are chosen in order to allow for symmetric and asymmetric error distributions. Special scores suitable for lifetime models are also discussed. The last section is concerned with the correlation model. Chapter 4 deals with rank-based inference for experimental designs, building on the theory developed in Chapter 3. The main focus is on factorial-type designs and on analysis of covariance. Estimation of effects, tests of linear hypotheses concerning effects and multiple comparison procedures are discussed. A comparison between the rank-based method and the rank-transform (RT) method is presented as well. It is shown that the RT often does not work well for testing hypotheses in factorial designs.

In Chapter 5, \(R\)-estimates with bounded influence functions and (high) positive breakdown values are presented, together with diagnostics that detect differences between fits. The final Chapter 6 is concerned with multivariate models, especially the bivariate location model. The first part of the chapter deals with one-sample estimates and tests based on vector signs and ranks. Both rotationally invariant and affine-invariant methods are developed, followed by robustness results for the resulting estimates. Furthermore, the bivariate two-sample location model and multi-sample location models are treated. This section is mainly concerned with componentwise methods.

This excellent book has many attractive features: first, the consistently applied norm-based approach for deriving estimates and tests; second, the large number of real data examples illustrating all the major methodology, with all data sets obtainable from the Web site. The reader can contact Minitab at info@minitab.com and request a technical report that describes the rank regression command RREG. Plenty of exercises (up to 48) in each chapter make the book well suited as a textbook.

From my point of view the book has only one major deficiency: it does not contain any adaptive procedures, neither adaptive estimates nor adaptive tests, which have been discussed in the literature over the last 25 years and have become powerful tools in robust and nonparametric inference. Two minor objections should also be made. Firstly, the two-sample location problem is already treated at the end of Chapter 1 (one-sample problems), although a separate chapter (Chapter 2) is devoted to two-sample analysis. Secondly, in the paper of M. A. Fligner and G. E. Policello, J. Am. Stat. Assoc. 76, No. 373, 162-168 (1981), the general rank test method for the Behrens-Fisher problem is applied to the Mann-Whitney-Wilcoxon statistic and not to Mood’s statistic as stated in the book (p. 130).

Nevertheless, this fine textbook is a pleasure to read and is a valuable stimulus for further research in robust nonparametric inference. It is highly recommended for all readers who possess the necessary mathematical skills. In the Preface the authors state that “This book is based on the premise that nonparametric or rank-based statistical methods are a superior choice in many data-analytic situations”. I think this book is the best support for this premise.

Reviewer: H. Büning (Berlin)

### MSC:

62G35 | Nonparametric robustness |

62-02 | Research exposition (monographs, survey articles) pertaining to statistics |

62G05 | Nonparametric estimation |

62G10 | Nonparametric hypothesis testing |