Accelerating \(t\)-SNE using tree-based algorithms. (English) Zbl 1319.62134

Summary: The paper investigates the acceleration of \(t\)-SNE, an embedding technique that is commonly used for the visualization of high-dimensional data in scatter plots, using two tree-based algorithms. In particular, the paper develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning \(t\)-SNE embeddings in \(\mathcal{O}(N \log N)\) time. Our experiments show that the resulting algorithms substantially accelerate \(t\)-SNE, and that they make it possible to learn embeddings of data sets with millions of objects. Somewhat counterintuitively, the Barnes-Hut variant of \(t\)-SNE appears to outperform the dual-tree variant.
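The Barnes-Hut idea named in the summary can be illustrated with a rough sketch. It is not the paper's implementation: the names `Cell`, `build_tree`, `repulsion`, and `repulsion_exact` are hypothetical, the embedding is assumed to be two-dimensional, and the acceptance rule (cell width divided by distance below a threshold `theta`) is the usual Barnes-Hut criterion, assumed here. The sketch approximates the unnormalized repulsive term of the gradient for one point under the Student-t kernel \((1 + d^2)^{-1}\), replacing distant groups of points by their center of mass.

```python
import random


class Cell:
    """Square quadtree cell that tracks its point count and center of mass."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half  # cell center and half-width
        self.n = 0                # number of points stored in this cell
        self.mx = self.my = 0.0   # center of mass of those points
        self.point = None         # the stored point while the cell is a leaf
        self.children = None      # four sub-cells once the cell has been split

    def _child(self, p):
        # Children are ordered by (left/right) then (bottom/top).
        return self.children[2 * (p[0] >= self.cx) + (p[1] >= self.cy)]

    def insert(self, p):
        # Update the running center of mass, then push the point down the tree.
        self.mx = (self.mx * self.n + p[0]) / (self.n + 1)
        self.my = (self.my * self.n + p[1]) / (self.n + 1)
        self.n += 1
        if self.children is None:
            if self.point is None:
                self.point = p        # empty leaf: store the point here
                return
            if self.half < 1e-9:
                return                # (near-)duplicates stay merged in one leaf
            h = self.half / 2
            self.children = [Cell(self.cx + sx * h, self.cy + sy * h, h)
                             for sx in (-1, 1) for sy in (-1, 1)]
            old, self.point = self.point, None
            self._child(old).insert(old)
        self._child(p).insert(p)


def build_tree(points):
    """Build a quadtree whose root cell covers all points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    half = max(max(xs) - min(xs), max(ys) - min(ys)) / 2 + 1e-9
    root = Cell((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2, half)
    for p in points:
        root.insert(p)
    return root


def repulsion(cell, p, theta):
    """Approximate (fx, fy, z): unnormalized repulsion on p and its kernel sum."""
    if cell.n == 0 or (cell.children is None and cell.point is p):
        return 0.0, 0.0, 0.0          # empty cell, or the query point itself
    dx, dy = p[0] - cell.mx, p[1] - cell.my
    d2 = dx * dx + dy * dy
    # Accept the cell as a summary if it is a leaf or width / distance < theta.
    if cell.children is None or (2 * cell.half) ** 2 < theta * theta * d2:
        w = 1.0 / (1.0 + d2)          # Student-t kernel at the center of mass
        return cell.n * w * w * dx, cell.n * w * w * dy, cell.n * w
    fx = fy = z = 0.0
    for child in cell.children:
        a, b, s = repulsion(child, p, theta)
        fx, fy, z = fx + a, fy + b, z + s
    return fx, fy, z


def repulsion_exact(points, p):
    """Brute-force reference: sum the same quantities over every other point."""
    fx = fy = z = 0.0
    for q in points:
        if q is p:
            continue
        dx, dy = p[0] - q[0], p[1] - q[1]
        w = 1.0 / (1.0 + dx * dx + dy * dy)
        fx, fy, z = fx + w * w * dx, fy + w * w * dy, z + w
    return fx, fy, z


if __name__ == "__main__":
    random.seed(0)
    pts = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(300)]
    tree = build_tree(pts)
    print(repulsion(tree, pts[0], 0.5))
    print(repulsion_exact(pts, pts[0]))
```

With `theta = 0` the traversal descends to every leaf and matches the brute-force sum up to rounding; larger values of `theta` accept coarser summaries, trading accuracy for speed. Repeating the traversal for each of the \(N\) points is what leads to the \(\mathcal{O}(N \log N)\) cost mentioned in the summary, assuming a reasonably balanced tree.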


62H30 Classification and discrimination; cluster analysis (statistical aspects)
62A09 Graphical methods in statistics