Multivariate spacings based on data depth. I: Construction of nonparametric multivariate tolerance regions. (English) Zbl 1360.62253

Summary: This paper introduces and studies multivariate spacings. The spacings are developed using the order statistics derived from data depth. Specifically, the spacing between two consecutive order statistics is the region which bridges the two order statistics, in the sense that the region contains all the points whose depth values fall between the depth values of the two consecutive order statistics. These multivariate spacings can be viewed as a data-driven realization of the so-called “statistically equivalent blocks”. These spacings assume a form of center-outward layers of “shells” (“rings” in the two-dimensional case), where the shapes of the shells follow closely the underlying probabilistic geometry. The properties and applications of these spacings are studied. In particular, the spacings are used to construct tolerance regions. The construction of tolerance regions is nonparametric and completely data driven, and the resulting tolerance region reflects the true geometry of the underlying distribution. This is different from most existing approaches which require that the shape of the tolerance region be specified in advance. The proposed tolerance regions are shown to meet the prescribed specifications, in terms of \(\beta\)-content and \(\beta\)-expectation. They are also asymptotically minimal under elliptical distributions. Finally, a simulation and comparison study on the proposed tolerance regions is presented.
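The construction described in the summary can be illustrated with a short Python sketch. It is not the paper's implementation. It assumes Mahalanobis depth as the depth function (any other depth could be substituted) and assumes that the number of innermost spacings to keep for a \(\beta\)-expectation region is \(r = \lceil (n+1)\beta \rceil\), motivated by the classical result that the coverage of \(r\) statistically equivalent blocks out of \(n+1\) has mean \(r/(n+1)\). The function names `mahalanobis_depth` and `depth_tolerance_region` are hypothetical.

```python
import math
import numpy as np


def mahalanobis_depth(points, sample):
    """Mahalanobis depth of each row of `points` relative to `sample`.

    Assumption: the depth is 1 / (1 + squared Mahalanobis distance), with the
    centre and scatter estimated by the sample mean and sample covariance.
    """
    mu = sample.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(sample, rowvar=False))
    diff = points - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return 1.0 / (1.0 + d2)


def depth_tolerance_region(sample, beta):
    """Union of the r innermost depth-based spacings of `sample`.

    Returns the centre-outward ordering of the sample, the depth threshold
    that bounds the region, and a membership test for new points.
    """
    n = len(sample)
    depths = mahalanobis_depth(sample, sample)
    order = np.argsort(-depths)  # depth order statistics, deepest first
    # Assumed choice of r for a beta-expectation region (see lead-in).
    r = min(n, math.ceil((n + 1) * beta))
    threshold = depths[order[r - 1]]  # depth of the r-th order statistic

    def contains(points):
        # A point lies in the region if it is at least as deep as X_[r].
        return mahalanobis_depth(np.atleast_2d(points), sample) >= threshold

    return order, threshold, contains


rng = np.random.default_rng(0)
sample = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=500)
order, threshold, contains = depth_tolerance_region(sample, beta=0.9)
inside = contains(sample)
```

With Mahalanobis depth the region is always an ellipsoid, so this sketch does not show the shape adaptivity the summary emphasizes. Substituting a depth such as halfspace or simplicial depth in place of `mahalanobis_depth` would let the shells follow the geometry of the data, at a higher computational cost.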

MSC:

62H05 Characterization and structure theory for multivariate probability distributions; copulas
62G15 Nonparametric tolerance and confidence regions
62G30 Order statistics; empirical distribution functions
62G20 Asymptotic properties of nonparametric inference
62H12 Estimation in multivariate analysis

Software:

Qhull

References:

[1] Barber, C. B., Dobkin, D. P. and Huhdanpaa, H. (1996). The Quickhull algorithm for convex hulls. ACM Trans. Math. Software 22 469-483. · Zbl 0884.65145
[2] Beirlant, J., Dierckx, G., Guillou, A. and Stărică, C. (2002). On exponential representations of log-spacings of extreme order statistics. Extremes 5 157-180. · Zbl 1036.62040
[3] Chatterjee, S. K. and Patra, N. K. (1980). Asymptotically minimal multivariate tolerance sets. Calcutta Statist. Assoc. Bull. 29 73-93. · Zbl 0453.62028
[4] Cressie, N. (1979). An optimal statistic based on higher order gaps. Biometrika 66 619-627. · Zbl 0455.62036
[5] Darling, D. (1953). On a class of problems related to the random division of an interval. Ann. Math. Statist. 24 239-253. · Zbl 0053.09902
[6] Di Bucchianico, A., Einmahl, J. H. J. and Mushkudiani, N. A. (2001). Smallest nonparametric tolerance regions. Ann. Statist. 29 1320-1343. · Zbl 1043.62045
[7] Donoho, D. (1982). Breakdown properties of multivariate location estimators. Ph.D. qualifying paper, Harvard Univ.
[8] Donoho, D. and Gasko, M. (1992). Breakdown properties of location estimates based on half-space depth and projected outlyingness. Ann. Statist. 20 1803-1827. · Zbl 0776.62031
[9] Einmahl, J. H. J. and van Zuijlen, M. (1988). Strong bounds for weighted empirical distribution functions based on uniform spacings. Ann. Probab. 16 108-125. · Zbl 0652.60037
[10] Fraser, D. (1951). Sequentially determined statistically equivalent blocks. Ann. Math. Statist. 22 372-381. · Zbl 0043.34401
[11] Guttman, I. (1970). Statistical Tolerance Regions: Classical and Bayesian. Charles Griffin, London. · Zbl 0231.62052
[12] Hall, P. (1986). On powerful distributional tests based on sample spacings. J. Multivariate Anal. 19 201-224. · Zbl 0605.62038
[13] He, X. and Wang, G. (1997). Convergence of depth contours for multivariate datasets. Ann. Statist. 25 495-504. · Zbl 0873.62053
[14] Hodges, J. (1955). A bivariate sign test. Ann. Math. Statist. 26 523-527. · Zbl 0065.12401
[15] Howe, W. G. (1969). Two-sided tolerance limits for normal populations – some improvements. J. Amer. Statist. Assoc. 64 610-620. · Zbl 0181.45701
[16] Liu, R. (1990). On a notion of data depth based on random simplices. Ann. Statist. 18 405-414. · Zbl 0701.62063
[17] Liu, R., Parelius, J. and Singh, K. (1999). Multivariate analysis by data depth: Descriptive statistics, graphics and inference (with discussion). Ann. Statist. 27 783-858. · Zbl 0984.62037
[18] Liu, R. and Singh, K. (1992). Ordering directional data: Concepts of data depth on circles and spheres. Ann. Statist. 20 1468-1484. · Zbl 0766.62027
[19] Liu, R. and Singh, K. (1993). A quality index based on data depth and multivariate rank tests. J. Amer. Statist. Assoc. 88 252-260. · Zbl 0772.62031
[20] Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proc. Nat. Inst. Sci. India 12 49-55. · Zbl 0015.03302
[21] Moran, P. (1947). A random division of an interval. J. Roy. Statist. Soc. Ser. B Stat. Methodol. 9 92-98. · Zbl 0031.06004
[22] Pyke, R. (1965). Spacings. J. Roy. Statist. Soc. Ser. B Stat. Methodol. 27 395-449. · Zbl 0144.41704
[23] Stahel, W. (1981). Robuste Schätzungen: Infinitesimale Optimalität und Schätzungen von Kovarianzmatrizen. Ph.D. thesis, ETH Zurich. · Zbl 0531.62036
[24] Tukey, J. (1947). Nonparametric estimation. II. Statistically equivalent blocks and tolerance regions – the continuous case. Ann. Math. Statist. 18 529-539. · Zbl 0029.15502
[25] Tukey, J. (1975). Mathematics and picturing data. Proceedings of the 1975 International Congress of Mathematics 2 523-531. · Zbl 0347.62002
[26] Wald, A. (1943). An extension of Wilks’ method for setting tolerance limits. Ann. Math. Statist. 14 45-55. · Zbl 0060.30603
[27] Weiss, L. (1957). Asymptotic power of certain tests of fit based on sample spacings. Ann. Math. Statist. 28 783-786. · Zbl 0087.14801
[28] Wells, M., Jammalamadaka, S. R. and Tiwari, R. (1993). Large sample theory of spacings statistics for tests of fit for the composite hypothesis. J. Roy. Statist. Soc. Ser. B Stat. Methodol. 55 189-203. · Zbl 0782.62025
[29] Wilks, S. S. (1941). Determination of sample sizes for setting tolerance limits. Ann. Math. Statist. 12 91-96. · Zbl 0024.42703
[30] Zuo, Y. (2003). Projection based depth functions and associated medians. Ann. Statist. 31 1460-1490. · Zbl 1046.62056
[31] Zuo, Y. and Serfling, R. (2000). Structural properties and convergence results for contours of sample statistical depth functions. Ann. Statist. 28 483-499. · Zbl 1105.62343
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.