## Distance metric learning for large margin nearest neighbor classification. (English) Zbl 1235.68204

Summary: The accuracy of $$k$$-nearest neighbor ($$k$$NN) classification depends significantly on the metric used to compute distances between different examples. In this paper, we show how to learn a Mahalanobis distance metric for $$k$$NN classification from labeled examples. The Mahalanobis metric can equivalently be viewed as a global linear transformation of the input space that precedes $$k$$NN classification using Euclidean distances. In our approach, the metric is trained with the goal that the $$k$$-nearest neighbors always belong to the same class while examples from different classes are separated by a large margin. As in support vector machines (SVMs), the margin criterion leads to a convex optimization based on the hinge loss. Unlike learning in SVMs, however, our approach requires no modification or extension for problems in multiway (as opposed to binary) classification. In our framework, the Mahalanobis distance metric is obtained as the solution to a semidefinite program. On several data sets of varying size and difficulty, we find that metrics trained in this way lead to significant improvements in $$k$$NN classification. Sometimes these results can be further improved by clustering the training examples and learning an individual metric within each cluster. We show how to learn and combine these local metrics in a globally integrated manner.
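The equivalence noted in the summary — a Mahalanobis metric parameterized by a positive semidefinite matrix $$M = L^\top L$$ is the same as Euclidean distance after the global linear map $$x \mapsto Lx$$ — can be checked numerically. The sketch below is illustrative only; the function and variable names are assumptions, not taken from the paper or its released code.

```python
import numpy as np

def mahalanobis_dist(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) under a PSD matrix M."""
    d = x - y
    return np.sqrt(d @ M @ d)

rng = np.random.default_rng(0)
L = rng.standard_normal((3, 3))  # an arbitrary linear transformation
M = L.T @ L                      # the induced PSD Mahalanobis matrix
x = rng.standard_normal(3)
y = rng.standard_normal(3)

# The metric view and the transformed-Euclidean view agree:
d_metric = mahalanobis_dist(x, y, M)
d_euclid = np.linalg.norm(L @ x - L @ y)
assert np.isclose(d_metric, d_euclid)
```

Because $$M$$ and $$L$$ are interchangeable in this way, learning $$M$$ as a semidefinite program (as the paper does) and learning a linear transformation $$L$$ are two views of the same metric.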

### MSC:

- 68T05 Learning and adaptive systems in artificial intelligence
- 90C25 Convex programming

### Software:

LMNN; Bow