A boundary method for outlier detection based on support vector domain description. (English) Zbl 1159.68528

Summary: The Support Vector Domain Description (SVDD) is a popular kernel method for outlier detection. It fits a sphere around one class of data and uses a few target objects to support its decision boundary. Even with a flexible Gaussian kernel function, however, the SVDD sometimes produces a decision boundary so loose that its discrimination ability becomes poor. A computationally intensive procedure called kernel whitening is therefore often required to improve performance. In this paper, we propose a simple post-processing method that modifies the SVDD boundary to achieve a tight data description without kernel whitening. Using a derivation of the distance between an object and its nearest boundary point in input space, the proposed method efficiently constructs a new decision boundary from the SVDD boundary. The improvement is demonstrated on synthetic and real-world datasets. The results show that the proposed decision boundary closely fits the shape of the synthetic data distribution and achieves better or comparable classification performance on real-world datasets.
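To make the sphere-fitting idea in the summary concrete, here is a minimal sketch of a plain SVDD with a Gaussian kernel. It is not the boundary post-processing method proposed in the paper, whose details the summary does not give. Several choices are assumptions made only for illustration: the hard-margin form (no slack variables), the kernel convention exp(-||a - b||^2 / sigma^2), a simple Frank-Wolfe loop in place of a proper QP solver such as LIBSVM, and all function names (`fit_svdd`, `sq_dist_to_center`, `is_outlier`).

```python
import math


def gaussian_kernel(a, b, sigma):
    """Gaussian kernel exp(-||a - b||^2 / sigma^2) (assumed convention)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / sigma ** 2)


def sq_dist_to_center(model, z):
    """Squared kernel-space distance from z to the sphere center.

    With a Gaussian kernel K(z, z) = 1, so
    d^2(z) = 1 - 2 * sum_i alpha_i K(z, x_i) + sum_ij alpha_i alpha_j K(x_i, x_j).
    """
    cross = sum(
        a * gaussian_kernel(z, x, model["sigma"])
        for a, x in zip(model["alpha"], model["points"])
    )
    return 1.0 - 2.0 * cross + model["const"]


def fit_svdd(points, sigma, iters=500):
    """Hard-margin SVDD: minimize alpha^T K alpha over the probability simplex.

    Solved approximately with Frank-Wolfe steps (an illustrative choice).
    """
    n = len(points)
    K = [[gaussian_kernel(p, q, sigma) for q in points] for p in points]
    alpha = [1.0 / n] * n
    for t in range(iters):
        # Gradient of alpha^T K alpha is 2 K alpha; the factor 2 does not
        # change which coordinate is smallest.
        grad = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        i_min = min(range(n), key=grad.__getitem__)
        gamma = 2.0 / (t + 2.0)
        # Move toward the simplex vertex with the smallest gradient entry.
        alpha = [(1.0 - gamma) * a for a in alpha]
        alpha[i_min] += gamma
    const = sum(alpha[i] * alpha[j] * K[i][j] for i in range(n) for j in range(n))
    model = {"points": points, "sigma": sigma, "alpha": alpha, "const": const}
    # Radius chosen so that every training object lies inside the sphere.
    model["r2"] = max(sq_dist_to_center(model, p) for p in points)
    return model


def is_outlier(model, z):
    """An object is rejected when it falls outside the fitted sphere."""
    return sq_dist_to_center(model, z) > model["r2"]


train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
model = fit_svdd(train, sigma=1.0)
print(is_outlier(model, (5.0, 5.0)))
print(is_outlier(model, (0.5, 0.5)))
```

In this sketch the kernel width `sigma` controls how tightly the boundary follows the data. The loose boundaries the summary describes can still occur for a given width, which is the situation the proposed post-processing is meant to address.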

MSC:

68T05 Learning and adaptive systems in artificial intelligence

Software:

UCI-ml; LIBSVM