Fast exact \(k\) nearest neighbors search using an orthogonal search tree.

*(English)*Zbl 1192.68587Summary: The problem of \(k\) nearest neighbors \((kNN)\) is to find the nearest \(k\) neighbors for a query point from a given data set. In this paper, a novel fast \(kNN\) search method using an orthogonal search tree is proposed. The proposed method creates an orthogonal search tree for a data set using an orthonormal basis evaluated from the data set. To find the \(kNN\) for a query point from the data set, projection values of the query point onto orthogonal vectors in the orthonormal basis and a node elimination inequality are applied for pruning unlikely nodes. For a node, which cannot be deleted, a point elimination inequality is further used to reject impossible data points. Experimental results show that the proposed method has good performance on finding \(kNN\) for query points and always requires less computation time than available \(kNN\) search algorithms, especially for a data set with a big number of data points or a large standard deviation.

##### MSC:

68T10 | Pattern recognition, speech recognition |

PDF
BibTeX
XML
Cite

\textit{Y.-C. Liaw} et al., Pattern Recognition 43, No. 6, 2351--2358 (2010; Zbl 1192.68587)

Full Text:
DOI

##### References:

[1] | Theodoridis, S.; Koutroumbas, K., Pattern recognition, (2003), Academic Press |

[2] | Murase, H.; Nayar, S.K., Visual learning and recognition of 3D objects from appearance, International journal of computer vision, 14, 1, 5-24, (1995) |

[3] | Roth, V.; Laub, J.; Kawanabe, M.; Buhmann, J.M., Optimal cluster preserving embedding of nonmetric proximity data, IEEE transactions on pattern analysis and machine intelligence, 25, 12, 1540-1551, (2003) |

[4] | Liaw, Y.C., Improvement of the fast exact pairwise-nearest-neighbor algorithm, Pattern recognition, 42, 5, 867-870, (2009) · Zbl 1178.68502 |

[5] | Chen, C.Y.; Chang, C.C.; Lee, R.C.T., A near pattern-matching scheme based on principal component analysis, Pattern recognition letters, 16, 339-345, (1995) |

[6] | Liaw, Y.C.; Lai, J.Z.C.; Lo, W., Image restoration of compressed image using classified vector quantization, Pattern recognition, 35, 2, 329-340, (2002) · Zbl 0993.68135 |

[7] | Gersho, A.; Gray, R.M., Vector quantization and signal compression, (1991), Kluwer Academic Publishers Boston, MA |

[8] | Cover, T.M.; Hart, P.E., Nearest neighbor pattern classification, IEEE transactions on information theory, 13, 1, 21-27, (1967) · Zbl 0154.44505 |

[9] | García-Pedrajas, N.; Ortiz-Boyer, D., Boosting k-nearest neighbor classifier by means of input space projection, Expert systems with applications, 36, 7, 10570-10582, (2009) |

[10] | Fukunaga, K.; Narendra, P.M., A branch and bound algorithm for computing k-nearest neighbors, IEEE transactions on computers, C-24, 7, 750-753, (1975) · Zbl 0307.68069 |

[11] | Liu, T.; Moore, A.W.; Gray, A., New algorithms for efficient high-dimensional nonparametric classification, Journal of machine learning research, 7, 1135-1158, (2006) · Zbl 1222.62077 |

[12] | S.M. Omohundro, Five balltree construction algorithms, Technical Report 89-063, (1989). |

[13] | Friedman, J.H.; Bentley, J.L.; Finkel, R.A., An algorithm for finding best matches in logarithmic expected time, ACM transactions on mathematical software, 3, 3, 209-226, (1977) · Zbl 0364.68037 |

[14] | Sproull, R.F., Refinements to nearest-neighbor searching in k-dimensional trees, Algorithmica, 6, 579-589, (1991) · Zbl 0726.68023 |

[15] | Kim, B.S.; Park, S.B., A fast k nearest neighboring finding algorithm based on the ordered partition, IEEE transactions on pattern analysis and machine intelligence, 8, 6, 761-766, (1986) · Zbl 0621.68052 |

[16] | Mico, L.; Oncina, J.; Carrasco, R.C., A fast branch and bound nearest neighbor classifier in metric spaces, Pattern recognition letters, 17, 7, 731-739, (1996) |

[17] | McNames, J., A fast nearest-neighbor algorithm based on a principal axis search tree, IEEE transactions on pattern analysis and machine intelligence, 23, 9, 964-976, (2001) |

[18] | Wang, B.; Gan, J.Q., Integration of projected clusters and principal axis trees for high-dimensional data indexing and query, Lecture notes in computer science, 3177, 191-196, (2004) |

[19] | Chen, Y.-S.; Hung, Y.-P.; Ten, T.-F.; Fuh, C.-S., Fast and versatile algorithm for nearest neighbor search based on a lower bound tree, Pattern recognition, 40, 2, 360-375, (2007) · Zbl 1118.68053 |

[20] | Cheng, D.-Y.; Gersho, A.; Ramamurthi, B.; Shoham, Y., Fast search algorithms for vector quantization and pattern matching, IEEE international conference on acoustics, speech, and signal processing, 1, (1984), 9.11.1-9.11.4 |

[21] | Bei, C.D.; Gray, R.M., An improvement of the minimum distortion encoding algorithm for vector quantization, IEEE transactions on communications, 33, 10, 1132-1133, (1985) |

[22] | Ra, S.W.; Kim, J.K., Fast Mean-distance-ordered partial codebook search algorithm for image vector quantization, IEEE transactions on circuits and systems II: analog and digital signal processing, 40, 9, 576-579, (1993) |

[23] | Tai, S.C.; Lai, C.C.; Lin, Y.C., Two fast nearest neighbor searching algorithms for image vector quantization, IEEE transactions on communications, 44, 12, 1623-1628, (1996) |

[24] | Nene, S.A.; Nayar, S.K., A simple algorithm for nearest neighbor search in high dimensions, IEEE transactions on pattern analysis and machine intelligence, 19, 9, 989-1003, (1997) |

[25] | Lu, Z.-M.; Chu, S.-C.; Huang, K.-C., Equal-average equal-variance equal-norm nearest neighbor codeword search algorithm based on ordered Hadamard transform, International journal of innovative computing, information and control, 1, 1, 35-41, (2005) |

[26] | Lai, J.Z.C.; Liaw, Y.C.; Liu, J., Fast k-nearest-neighbor search based on projection and triangular inequality, Pattern recognition, 40, 1, 351-359, (2007) · Zbl 1118.68055 |

[27] | Kanungo, T.; Mount, D.; Netanyahu, N.; Piatko, C.; Silverman, R.; Wu, A., An efficient k-means clustering algorithm: analysis and implementation, IEEE transactions on pattern analysis and machine intelligence, 24, 7, 881-892, (2002) |

[28] | Jolliffe, I.T., Principal component analysis, (2002), Springer NY · Zbl 1011.62064 |

[29] | Cooley, W.H.; Lohnes, P.R., Multivariate data analysis, (1971), Wiley New York · Zbl 0252.62002 |

[30] | Statlog (Landsat Satellite) data set, 〈http://archive.ics.uci.edu/ml〉. |

[31] | Linde, Y.; Buzo, A.; Gray, R.M., An algorithm for vector quantizer design, IEEE transactions on communications, 28, 1, 84-95, (1980) |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.