## Estimating the support of a high-dimensional distribution. (English) Zbl 1009.62029

Summary: Suppose you are given some data set drawn from an underlying probability distribution $$P$$ and you want to estimate a “simple” subset $$S$$ of input space such that the probability that a test point drawn from $$P$$ lies outside of $$S$$ equals some a priori specified value between 0 and 1.
We propose a method to approach this problem by trying to estimate a function $$f$$ that is positive on $$S$$ and negative on the complement. The functional form of $$f$$ is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.
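The summary does not spell out the quadratic program, so the following is only a minimal sketch under stated assumptions. It assumes the commonly cited one-class support vector dual: minimise $$\tfrac12\sum_{i,j}\alpha_i\alpha_j k(x_i,x_j)$$ subject to $$0\le\alpha_i\le 1/(\nu\ell)$$ and $$\sum_i\alpha_i=1$$, with decision function $$f(x)=\sum_i\alpha_i k(x_i,x)-\rho$$. It also assumes a Gaussian kernel and a simple "most violating pair" selection rule, which is not necessarily the pair-selection heuristic used by the authors. The names `rbf_kernel`, `fit_one_class`, and `decision`, and the parameters `nu`, `gamma`, `tol`, and `max_iter`, are hypothetical and chosen for illustration.

```python
import math
import random


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2) (an assumed choice)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def fit_one_class(points, nu=0.2, gamma=0.5, tol=1e-5, max_iter=20_000):
    """Approximately minimise 0.5 * a^T K a subject to
    0 <= a_i <= 1/(nu * n) and sum(a) = 1, re-optimising one pair at a time.
    Returns the expansion coefficients and the offset rho."""
    assert 0.0 < nu < 1.0
    n = len(points)
    cap = 1.0 / (nu * n)          # upper bound on each coefficient
    eps = 1e-12                   # tolerance for "at a bound"
    K = [[rbf_kernel(p, q, gamma) for q in points] for p in points]
    alpha = [1.0 / n] * n         # feasible start: sums to 1, below cap
    # out[i] = sum_k alpha_k * k(x_i, x_k), the gradient of the objective
    out = [sum(alpha[k] * K[i][k] for k in range(n)) for i in range(n)]

    for _ in range(max_iter):
        # i: largest output among coefficients that can still decrease
        i = max((k for k in range(n) if alpha[k] > eps), key=out.__getitem__)
        # j: smallest output among coefficients that can still increase
        j = min((k for k in range(n) if alpha[k] < cap - eps), key=out.__getitem__)
        if out[i] - out[j] < tol:
            break                 # optimality conditions hold up to tol
        eta = max(K[i][i] + K[j][j] - 2.0 * K[i][j], 1e-12)
        # Unconstrained optimum along the pair direction, clipped to the box
        step = min((out[i] - out[j]) / eta, alpha[i], cap - alpha[j])
        alpha[i] -= step          # the sum of coefficients is preserved
        alpha[j] += step
        for k in range(n):
            out[k] += step * (K[k][j] - K[k][i])

    # rho lies between the two extreme outputs used in the stopping test
    hi = max(out[k] for k in range(n) if alpha[k] > eps)
    lo = min(out[k] for k in range(n) if alpha[k] < cap - eps)
    return alpha, 0.5 * (hi + lo)


def decision(x, points, alpha, rho, gamma=0.5):
    """f(x): positive inside the estimated region S, negative outside."""
    return sum(a * rbf_kernel(p, x, gamma)
               for a, p in zip(alpha, points) if a > 0.0) - rho


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(60)]
    alpha, rho = fit_one_class(data, nu=0.2)
    print("non-zero coefficients:", sum(a > 1e-12 for a in alpha))
    print("f at a far-away point:", decision((10.0, 10.0), data, alpha, rho))
```

In this sketch the parameter `nu` plays the role of the a priori specified value: because each coefficient is capped at `1/(nu * n)` and the coefficients sum to one, at most a fraction `nu` of the training points can sit at the cap, and only those can end up clearly outside the estimated region. Points with zero coefficient drop out of the expansion, which is how the "potentially small subset of the training data" arises.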

### MSC:

 62G07 Density estimation; 90C90 Applications of mathematical programming