Robust statistics. The approach based on influence functions.

*(English)* Zbl 0593.62027
Wiley Series in Probability and Mathematical Statistics. Probability and Mathematical Statistics. New York etc.: John Wiley & Sons. XXI, 502 p. £38.45 (1986).

This book may be considered an important alternative to P. J. Huber's "Robust Statistics" (1981; Zbl 0536.62025). While the latter is concerned more or less exclusively with the theory of robust estimation, as in the case of location, scale and regression parameters (M-, L-, and R-estimators) and of covariance and correlation matrices, the present book strikes a balance between test and estimation procedures in statistical inference.

There exists a great variety of approaches to the robustness problem, e.g. general stability, characterizations, adaptivity, and Bayesian robustness. Huber developed two major theoretical approaches: the minimax approach to robust estimation and the capacities approach to robust testing and confidence intervals. The book under review presents another approach, based on influence functions, which has also been called the "infinitesimal approach", although it includes an important global robustness aspect as well, namely the "breakdown point". Whereas qualitative robustness is in principle related to the continuity of a functional, the influence function IF(x,T,F), as a quantitative measure of robustness, corresponds to the first derivative of a functional T at an underlying distribution function F. It describes the (approximate) effect on T of an additional observation at any point x.
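For orientation, the derivative interpretation can be written out explicitly. The review does not state the formula; the display below is the standard definition, with \(\Delta_x\) (the point mass at x) and the contamination fraction t introduced here as notation that does not appear in the review:

\[
\mathrm{IF}(x,T,F)=\lim_{t\downarrow 0}\frac{T\bigl((1-t)F+t\,\Delta_x\bigr)-T(F)}{t}
\]

at those points x where the limit exists.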

The concept of the influence function goes back to the first author [see "Contribution to the theory of robust estimation." Ph.D. Diss., Univ. Calif. Berkeley (1968)]. P. J. Rousseeuw and E. Ronchetti [J. Comput. Appl. Math. 7, 161-166 (1981; Zbl 0472.62046) and "The influence curve for tests", Res. Rep. 21, Fachgruppe Stat., ETH Zürich (1979)] have generalized it to tests by adapting the influence function to non-Fisher-consistent functionals in order to investigate the local robustness of test statistics.

Chapter 1 gives a very detailed introduction to and motivation for robust statistics (77 pages). It discusses the place and aims of robust statistics, the question "why robust statistics", the different approaches to a theory of robustness, and the problem of outliers in connection with numerous empirical studies from the literature.

Chapter 2 covers one-dimensional estimators: the classes of M-, L-, and R-estimators, and other types of estimators. The influence function IF(x,T,F) is introduced as a local concept and calculated for these estimators. Some robustness measures are derived from the influence function, e.g. the gross-error sensitivity \(\gamma^*=\sup_{x}|IF(x,T,F)|\). The breakdown point, also introduced here, measures the global reliability of an estimator and describes up to what distance from the model distribution the estimator still gives some relevant information. While the influence function describes the local robustness of the asymptotic value of an estimator, the change-of-variance function is related to the local robustness of its asymptotic variance. This function is also studied in this chapter.
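The contrast between a bounded and an unbounded influence function can be illustrated numerically. The following is a minimal sketch, not taken from the book: it uses the finite-sample sensitivity curve as a stand-in for IF(x,T,F), and the function name `sensitivity_curve`, the sample, and the grid are all hypothetical choices made for illustration.

```python
import statistics

def sensitivity_curve(estimator, sample, x):
    """Finite-sample analogue of the influence function:
    n * (T(sample with x added) - T(sample)), where n is the enlarged sample size."""
    n = len(sample) + 1
    return n * (estimator(sample + [x]) - estimator(sample))

# Hypothetical symmetric sample and a grid of added observations x.
sample = [-2.0, -1.0, 0.0, 1.0, 2.0]
grid = [-1000.0, -10.0, 0.0, 10.0, 1000.0]

mean_sc = [sensitivity_curve(statistics.mean, sample, x) for x in grid]
median_sc = [sensitivity_curve(statistics.median, sample, x) for x in grid]

# Empirical counterpart of the gross-error sensitivity, restricted to the grid.
mean_gamma = max(abs(v) for v in mean_sc)
median_gamma = max(abs(v) for v in median_sc)

print("mean:  ", mean_sc, "sup |SC| on grid =", mean_gamma)
print("median:", median_sc, "sup |SC| on grid =", median_gamma)
```

On this grid the curve for the mean grows in proportion to x, so its supremum keeps increasing as the grid is widened, whereas the curve for the median stays within ±3 however far the added observation is moved.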

In chapter 3 the influence function for one-dimensional tests is introduced and then investigated in the one- and two-sample cases. Moreover, the connections between this approach and those of D. Lambert [J. Am. Stat. Assoc. 76, 649-657 (1981; Zbl 0472.62047)] and W. J. R. Eplett [J. R. Stat. Soc., Ser. B 42, 64-70 (1980; Zbl 0421.62028)] are presented. Chapter 4 deals with multi-dimensional estimators, including the concepts of invariance and equivariance, and chapter 5 deals with the estimation of covariance matrices and multivariate location.

Chapters 6 and 7 present robust estimation and robust testing in linear models, respectively. Chapter 8 gives complements and an outlook, e.g. on serial correlation robustness in time series and on small-sample asymptotics; in addition, some frequent misunderstandings about robust statistics are discussed.

The book includes a large number of examples and problems for each chapter as well as an extensive list of references.

Reviewer: H.Büning

##### MSC:

62F35 | Robustness and adaptive procedures (parametric inference) |

62F03 | Parametric hypothesis testing |

62F10 | Point estimation |

62-02 | Research exposition (monographs, survey articles) pertaining to statistics |