
DD-classifier: nonparametric classification procedure based on DD-plot. (English) Zbl 1261.62058

Summary: Using the DD-plot (depth vs. depth plot), we introduce a new nonparametric classification algorithm and call it the DD-classifier. The algorithm is completely nonparametric, and it requires no prior knowledge of the underlying distributions or of the form of the separating curve. Thus, it can be applied to a wide range of classification problems. The algorithm is completely data driven, and its classification outcome can be easily visualized in a two-dimensional plot regardless of the dimension of the data. Moreover, it has the advantage of bypassing the estimation of underlying parameters such as means and scales, which is often required by existing classification procedures. We study the asymptotic properties of the DD-classifier and its misclassification rate. Specifically, we show that the DD-classifier is asymptotically equivalent to the Bayes rule under suitable conditions, and that it can achieve the Bayes error for a family broader than elliptical distributions. The performance of the classifier is also examined using simulated and real data sets. Overall, the DD-classifier performs well across a broad range of settings and compares favorably with existing classifiers. It can also be robust against outliers or contamination.
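To make the idea of a DD-plot concrete, the following is a minimal sketch and not the authors' procedure. It maps each point to a pair of depths, one with respect to each training sample, and then classifies in that two-dimensional plane. Two simplifying assumptions are made here that the summary does not state: Mahalanobis depth is used only because it is short to compute (it does estimate a mean and covariance, so another depth such as halfspace or simplicial depth may be preferable in practice), and the separating curve is fixed to the diagonal line rather than chosen from the data. All function names are hypothetical.

```python
import numpy as np


def mahalanobis_depth(points, sample):
    """Mahalanobis depth of each row of `points` with respect to `sample`.

    Assumption: this depth is chosen for brevity only; any other depth
    function could be substituted without changing the rest of the sketch.
    """
    mu = sample.mean(axis=0)
    cov = np.cov(sample, rowvar=False)
    diff = points - mu
    # Squared Mahalanobis distance of each point from the sample mean.
    d2 = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return 1.0 / (1.0 + d2)


def dd_coordinates(points, sample_a, sample_b):
    """Return an (n, 2) array: depth w.r.t. sample A, depth w.r.t. sample B."""
    return np.column_stack(
        [mahalanobis_depth(points, sample_a), mahalanobis_depth(points, sample_b)]
    )


def classify_by_diagonal(points, sample_a, sample_b):
    """Assign label 0 (class A) or 1 (class B) using the diagonal of the DD-plot.

    Assumption: the diagonal is the simplest possible separating curve;
    a data-driven curve would replace this comparison.
    """
    dd = dd_coordinates(points, sample_a, sample_b)
    return np.where(dd[:, 0] >= dd[:, 1], 0, 1)


# Synthetic five-dimensional data, used purely for illustration.
rng = np.random.default_rng(0)
sample_a = rng.normal(loc=0.0, size=(200, 5))
sample_b = rng.normal(loc=3.0, size=(200, 5))

new_points = np.array([[0.0] * 5, [3.0] * 5])
dd = dd_coordinates(new_points, sample_a, sample_b)
labels = classify_by_diagonal(new_points, sample_a, sample_b)
```

Whatever the dimension of the original data, `dd` always has two columns, which is why the classification outcome can be shown in a single two-dimensional plot.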

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62G20 Asymptotic properties of nonparametric inference
62A09 Graphical methods in statistics
