##
**Multivariate estimation with high breakdown point.**
*(English)*
Zbl 0609.62054

Mathematical statistics and applications, Proc. 4th Pannonian Symp. Math. Stat., Bad Tatzmannsdorf/Austria 1983, Vol. B, 283-297 (1985).

[For the entire collection see Zbl 0583.00028.]

Suppose we have n data points in p dimensions, and we want to estimate their location using an estimator T that is affine equivariant, which means that \[ T(Ax_1+b,\dots,Ax_n+b)=AT(x_1,\dots,x_n)+b \] for all vectors b and nonsingular matrices A. The breakdown point of T is the smallest fraction of contaminated data that can carry T over all bounds. The breakdown point of least squares (the arithmetic mean) is 0, and for M-estimators it is at most \(1/(p+1)\). The "outlyingness-weighted mean" of W. Stahel [Breakdown of covariance estimators. Res. Rep. 31, Fachgruppe Stat., ETH Zürich (1981)] and D. Donoho [Breakdown properties of multivariate location estimators. Ph.D. qualifying paper, Harvard Univ. (1982)] is affine equivariant and its breakdown point equals 50%, the highest possible value.
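As an illustration of these two notions, the following minimal sketch (not taken from the paper) checks affine equivariance of the arithmetic mean on a toy sample and shows how a single contaminated point carries the mean arbitrarily far. The sample `clean`, the matrix `A`, the vector `b`, and all function names are hypothetical choices made for this example only.

```python
from statistics import mean

def mean_location(points):
    """Coordinatewise arithmetic mean of a list of (x, y) tuples."""
    xs, ys = zip(*points)
    return (mean(xs), mean(ys))

def affine(A, b, pt):
    """Apply the map x -> A x + b to a single 2-D point."""
    return (A[0][0] * pt[0] + A[0][1] * pt[1] + b[0],
            A[1][0] * pt[0] + A[1][1] * pt[1] + b[1])

# Hypothetical toy sample: five points in the unit square.
clean = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]

# Affine equivariance: T(Ax_1 + b, ..., Ax_n + b) = A T(x_1, ..., x_n) + b.
A = [[2.0, 1.0], [0.0, 1.0]]   # nonsingular (determinant 2)
b = (3.0, -1.0)
lhs = mean_location([affine(A, b, p) for p in clean])
rhs = affine(A, b, mean_location(clean))

def contaminated_mean(shift):
    """Replace one of the five points by an outlier at (shift, shift)."""
    data = clean[:-1] + [(shift, shift)]
    return mean_location(data)

# One bad point out of n is enough to move the mean without bound,
# which is why its breakdown point tends to 0 as n grows.
for shift in (1e2, 1e4, 1e6):
    print(shift, contaminated_mean(shift))
```

The mean passes the equivariance check but offers no protection against contamination, which is the gap the high-breakdown estimators are meant to close.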

The purpose of the present paper is to introduce another estimator with these properties, namely the center of the least-volume ellipsoid covering half of the data. A variant is to find that half of the data for which the empirical covariance matrix yields the smallest possible tolerance ellipsoids. These estimators automatically provide robust covariance estimators. They are inefficient at a Gaussian model, but this can easily be repaired by using a one-step reweighted least squares estimator afterwards.
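The variant based on the empirical covariance matrix can be sketched by brute force on a very small sample. The code below is an illustrative assumption, not the author's algorithm: it enumerates every subset of size h in two dimensions and keeps the one whose covariance matrix has the smallest determinant. The choice h = ⌊n/2⌋ + 1, the toy data, and the function names are all hypothetical, and exhaustive search is feasible only for tiny n.

```python
from itertools import combinations

def mean_cov(points):
    """Mean and sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def min_cov_det_subset(points, h=None):
    """Exhaustive search for the h-subset with the smallest covariance determinant.

    Returns (determinant, center, covariance). The default h = n // 2 + 1
    is an assumption made for this sketch.
    """
    n = len(points)
    if h is None:
        h = n // 2 + 1
    best = None
    for subset in combinations(points, h):
        center, (sxx, sxy, syy) = mean_cov(subset)
        det = sxx * syy - sxy ** 2
        if best is None or det < best[0]:
            best = (det, center, (sxx, sxy, syy))
    return best

# Hypothetical data: six "good" points near the origin, four gross outliers.
good = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.2), (0.3, 0.8)]
outliers = [(50.0, 50.0), (51.0, 50.0), (50.0, 52.0), (52.0, 51.0)]
data = good + outliers

det, robust_center, robust_cov = min_cov_det_subset(data)
classical_center, _ = mean_cov(data)
print("robust center:   ", robust_center)
print("classical center:", classical_center)
```

On this sample the selected subset consists of the six good points, so the robust center stays near them while the classical mean is pulled toward the outliers. The covariance returned here is unscaled; a reweighting step such as the one mentioned above would be applied afterwards to improve efficiency.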

More information regarding algorithms and applications of these high-breakdown estimators can be found in Chapter 7 of the author and A. Leroy, Robust regression and outlier detection. John Wiley, New York, to appear in September 1987. An important application is the identification of leverage points in regression analysis.

### MSC:

62F35 | Robustness and adaptive procedures (parametric inference) |

62H12 | Estimation in multivariate analysis |