## Parallel algorithms for nearest neighbor search problems in high dimensions
*(English)*
Zbl 1349.68231

### MSC:

| MSC code | Description |
| --- | --- |
| 68T20 | Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.) |
| 68T05 | Learning and adaptive systems in artificial intelligence |
| 68W10 | Parallel algorithms in computer science |
| 68W15 | Distributed algorithms |

### Keywords:

nearest neighbor algorithms; computational statistics; tree codes; data analysis; parallel algorithms; machine learning

*B. Xiao* and *G. Biros*, SIAM J. Sci. Comput. 38, No. 5, S667–S699 (2016; Zbl 1349.68231)
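For readers new to the topic of the paper, a minimal brute-force k-nearest-neighbor sketch follows. This is a hypothetical illustration of the problem being solved, not the authors' parallel tree-based algorithm; the function name `knn` and the toy data are assumptions for the example.

```python
import heapq
import math

def knn(points, query, k):
    """Return the k points nearest to `query` under Euclidean distance.

    Brute-force O(n * d) scan over all n points in d dimensions; the
    tree codes and parallel algorithms surveyed in the paper exist
    precisely to avoid this cost at scale.
    """
    def dist(p):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, query)))
    return heapq.nsmallest(k, points, key=dist)

# Toy data in 2-D; high-dimensional data works the same way.
data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.2)]
print(knn(data, (0.0, 0.0), 2))  # the two points closest to the origin
```

The exact brute-force scan is the baseline against which approximate methods (randomized trees, locality-sensitive hashing) in the reference list below are measured.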


### References:

[1] | P. Kegelmeyer, R. Calderbank, T. Critchlow, L. Jameson, C. Kamath, J. Meza, N. Samatova, and A. Wilson, Mathematics of Analysis of Petascale Data, Technical report, DOE Office of Science, 2008. |

[2] | D. Aiger, E. Kokiopoulou, and E. Rivlin, Random grids: Fast approximate nearest neighbors and range searching for image search, in Proceedings of the 2013 IEEE International Conference on Computer Vision (ICCV), IEEE Press, Piscataway, NJ, 2013, pp. 3471–3478. |

[3] | I. Al-Furajh, S. Aluru, S. Goil, and S. Ranka, Parallel construction of multidimensional binary search trees, IEEE Trans. Parallel Distributed Syst., 11 (2000), pp. 136–148. |

[4] | A. Andoni and P. Indyk, E2LSH 0.1 User Manual, 2005, http://www.mit.edu/~andoni/LSH/manual.pdf. |

[5] | A. Andoni and P. Indyk, Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions, Comm. ACM, 51 (2008), p. 117. |

[6] | A. Beygelzimer, S. Kakade, and J. Langford, Cover trees for nearest neighbor, in Proceedings of the Twenty-Third International Conference on Machine Learning (ICML’06), ACM Int. Conf. Proc. Ser. 148, ACM, New York, 2006, pp. 97–104. |

[7] | P. Ciaccia, M. Patella, and P. Zezula, M-tree: An efficient access method for similarity search in metric spaces, in Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB’97), Morgan Kaufmann, New York, 1997, pp. 426–435. |

[8] | R. R. Curtin, J. R. Cline, N. P. Slagle, W. B. March, P. Ram, N. A. Mehta, and A. G. Gray, MLPACK: A scalable C++ machine learning library, J. Machine Learning Res., 14 (2013), pp. 801–805. · Zbl 1307.68066 |

[9] | S. Dasgupta and Y. Freund, Random projection trees and low dimensional manifolds, in Proceedings of the 40th Annual ACM Symposium on Theory of Computing, ACM, New York, 2008, pp. 537–546. · Zbl 1231.68114 |

[10] | K. D. Devine, E. G. Boman, and G. Karypis, Partitioning and load balancing for emerging parallel applications and architectures, in Parallel Processing for Scientific Computing, M. A. Heroux, P. Raghavan, and H. D. Simon, eds., Software Environ. Tools 20, SIAM, Philadelphia, 2006, pp. 99–126. |

[11] | I. Dhillon and D. Modha, A data-clustering algorithm on distributed memory multiprocessors, in Large-Scale Parallel Data Mining, Proceedings of the Workshop on Large-Scale Parallel KDD Systems (SIGKDD), 2000, pp. 802–802. |

[12] | J. H. Freidman, J. L. Bentley, and R. A. Finkel, An algorithm for finding best matches in logarithmic expected time, ACM Trans. Math. Softw., 3 (1977), pp. 209–226. · Zbl 0364.68037 |

[13] | J. Friedman, T. Hastie, and R. Tibshirani, The Elements of Statistical Learning, Springer Ser. Stat. 1, Springer, New York, 2001. · Zbl 0973.62007 |

[14] | K. Fukunaga and P. M. Narendra, A branch and bound algorithm for computing k-nearest neighbors, IEEE Trans. Comput., 24 (1975), pp. 750–753. · Zbl 0307.68069 |

[15] | B. Ganapathysubramanian and N. Zabaras, A non-linear dimension reduction methodology for generating data-driven stochastic input models, J. Comput. Phys., 227 (2008), pp. 6612–6637. · Zbl 1141.82316 |

[16] | V. Garcia, E. Debreuve, and M. Barlaud, Fast k Nearest Neighbor Search Using GPU, preprint, arXiv:0804.1448, 2008. |

[17] | F. Gieseke, J. Heinermann, C. Oancea, and C. Igel, Buffer k-d trees: Processing massive nearest neighbor queries on GPUs, in Proceedings of the 31st International Conference on Machine Learning (ICML), J. Mach. Learning Res., 32 (2014), pp. 172–180. |

[18] | A. G. Gray and A. W. Moore, ‘N-body’ problems in statistical learning, Adv. Neural Inform. Process. Syst., 13 (2001), pp. 521–527. |

[19] | P. W. Jones, A. Osipov, and V. Rokhlin, Randomized approximate nearest neighbors algorithm, Proc. Natl. Acad. Sci. USA, 108 (2011), pp. 15679–15686. |

[20] | D. Karger and M. Ruhl, Finding nearest neighbors in growth-restricted metrics, in Proceedings of the Thirty-Fourth Annual ACM Symposium on Theory of Computing (STOC), ACM, New York, 2002, pp. 741–750. · Zbl 1192.68750 |

[21] | R. Krauthgamer and J. R. Lee, Navigating nets: Simple algorithms for proximity search, in Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), SIAM, Philadelphia, 2004, pp. 791–801. · Zbl 1318.68071 |

[22] | I. Lashuk, A. Chandramowlishwaran, T.-A. Nguyen, H. Langston, R. Sampath, A. Shringarpure, R. Vuduc, D. Zorin, L. Ying, and G. Biros, A massively parallel adaptive fast-multipole method on heterogeneous architectures, in Proceedings of the 2009 ACM/IEEE Conference on Supercomputing (SC ’09), IEEE Press, Piscataway, NJ, 2009, pp. 1–12. |

[23] | M. Lichman, UCI Machine Learning Repository, 2013. |

[24] | W. B. March, B. Xiao, and G. Biros, ASKIT: Approximate skeletonization kernel-independent treecode in high dimensions, SIAM J. Sci. Comput., 37 (2015), pp. 1089–1110. · Zbl 1320.65196 |

[25] | W. B. March, B. Xiao, C. Yu, and G. Biros, An algebraic parallel treecode in arbitrary dimensions, in Proceedings of the 29th IEEE International Parallel and Distributed Computing Symposium (IPDPS), IEEE Press, Piscataway, NJ, 2015, pp. 571–580. |

[26] | G. L. Miller, S. H. Teng, W. Thurston, and S. A. Vavasis, Separators for sphere-packings and nearest neighbor graphs, J. ACM, 44 (1997), pp. 1–29. · Zbl 0883.68100 |

[27] | D. M. Mount and S. Arya, ANN: A library for approximate nearest neighbor searching, v. 1.1.2, 2010. |

[28] | M. Muja and D. G. Lowe, Fast approximate nearest neighbors with automatic algorithm configuration, in Proceedings of the International Conference on Computer Vision Theory and Applications (VISSAPP’09), INSTICC Press, Setubal, Portugal, 2009, pp. 331–340; FLANN library available online. |

[29] | M. Muja and D. G. Lowe, Fast approximate nearest neighbors with automatic algorithm configuration, in Proceedings of the International Conference on Computer Vision Theory and Applications (VISSAPP’09), INSTICC Press, Setubal, Portugal, 2009, pp. 331–340. |

[30] | M. Muja and D. G. Lowe, Scalable nearest neighbor algorithms for high dimensional data, IEEE Trans. Pattern Anal. Machine Intell., 36 (2014), pp. 2227–2240. |

[31] | S. M. Omohundro, Efficient algorithms with neural network behavior, Complex Systems, 1 (1987), pp. 273–347. · Zbl 0659.68101 |

[32] | R. Ostrovsky, Y. Rabani, L. J. Schulman, and C. Swamy, The effectiveness of Lloyd-type methods for the k-means problem, in Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06), IEEE Press, Piscataway, NJ, 2006, pp. 165–176. · Zbl 1281.68229 |

[33] | J. Pan, C. Lauterbach, and D. Manocha, Efficient nearest-neighbor computation for GPU-based motion planning, in Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE Press, Piscataway, NJ, 2010, pp. 2243–2248. |

[34] | J. Pan and D. Manocha, Fast GPU-based locality sensitive hashing for k-nearest neighbor computation, in Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL GIS), ACM, New York, 2011, pp. 211–220. |

[35] | A. Rahimian, I. Lashuk, S. K. Veerapaneni, A. Chandramowlishwaran, D. Malhotra, L. Moon, R. Sampath, A. Shringarpure, J. Vetter, R. Vuduc, D. Zorin, and G. Biros, Petascale direct numerical simulation of blood flow on 200k cores and heterogeneous architectures, in Proceedings of the 2010 ACM/IEEE Conference on Supercomputing (SC ’10), IEEE Press, Piscataway, NJ, 2010, pp. 1–12. |

[36] | P. Ram, D. Lee, W. March, and A. Gray, Linear-time algorithms for pairwise statistical problems, Adv. Neural Inform. Process. Syst., 23 (2009), pp. 1527–1535. |

[37] | C. E. Rasmussen and C. Williams, Gaussian Processes for Machine Learning, MIT Press, Cambridge, MA, 2006. · Zbl 1177.68165 |

[38] | S. T. Roweis and L. K. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science, 290 (2000), pp. 2323–2326. |

[39] | H. Samet, Foundations of Multidimensional and Metric Data Structures, Morgan Kaufmann, New York, 2006. · Zbl 1139.68022 |

[40] | B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, Cambridge, MA, 2002. |

[41] | N. Sismanis, N. Pitsianis, and X. Sun, Parallel search of k-nearest neighbors with synchronous operations, in Proceedings of the IEEE Conference on High Performance Extreme Computing, IEEE Press, Piscataway, NJ, 2012, pp. 1–6. |

[42] | H. Sundar, D. Malhotra, and G. Biros, HykSort: A new variant of hypercube quicksort on distributed memory architectures, in Proceedings of the International Conference on Supercomputing (ICS’13), ACM, New York, 2013, pp. 293–302. |

[43] | J. Tenenbaum, V. Silva, and J. Langford, A global geometric framework for nonlinear dimensionality reduction, Science, 290 (2000), pp. 2319–2323. |

[44] | A. Torralba, R. Fergus, and W. T. Freeman, 80 million tiny images: A large data set for nonparametric object and scene recognition, IEEE Trans. Pattern Anal. Machine Intell., 30 (2008), pp. 1958–1970. |

[45] | J. K. Uhlmann, Satisfying general proximity/similarity queries with metric trees, Inform. Process. Lett., 40 (1991), pp. 175–179. · Zbl 0748.68088 |

[46] | R. Weber, H. J. Schek, and S. Blott, A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces, in Proceedings of the 24th International Conference on Very Large Data Bases (VLDB’98), Morgan Kaufmann, San Francisco, 1998, pp. 194–205. |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.