LargeVis
swMATH ID: 34905
Software Authors: Tang, Jian; Liu, Jingzhou; Zhang, Ming; Mei, Qiaozhu
Description: LargeVis: Visualizing Large-scale and High-dimensional Data. We study the problem of visualizing large-scale and high-dimensional data in a low-dimensional (typically 2D or 3D) space. Much success has been reported recently by techniques that first compute a similarity structure of the data points and then project them into a low-dimensional space with the structure preserved. Both steps incur considerable computational costs, which prevents state-of-the-art methods such as t-SNE from scaling to large-scale and high-dimensional data (e.g., millions of data points and hundreds of dimensions). We propose LargeVis, a technique that first constructs an accurately approximated K-nearest neighbor graph from the data and then lays out the graph in the low-dimensional space. Compared with t-SNE, LargeVis significantly reduces the computational cost of the graph construction step and employs a principled probabilistic model for the visualization step, the objective of which can be effectively optimized through asynchronous stochastic gradient descent with linear time complexity. The whole procedure thus easily scales to millions of high-dimensional data points. Experimental results on real-world data sets demonstrate that LargeVis outperforms the state-of-the-art methods in both efficiency and effectiveness. The hyper-parameters of LargeVis are also much more stable across different data sets.
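The two-step procedure in the description (build a K-nearest neighbor graph, then lay it out by stochastic gradient descent) can be illustrated with a minimal sketch. It is not the authors' implementation and it simplifies heavily. The neighbor search is exact brute force rather than the approximate construction the description refers to, all edges carry equal weight, and updates are sequential rather than asynchronous. The function names knn_graph and layout, the edge probability 1 / (1 + d^2), the negative-sampling weight gamma, and the clipping and damping constants are all assumptions made for illustration.

```python
import numpy as np


def knn_graph(X, k):
    """Exact k-nearest-neighbor lists by brute force.

    O(n^2) stand-in for the approximate graph construction step.
    """
    X = np.asarray(X, dtype=float)
    sq = np.sum(X * X, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def layout(neighbors, dim=2, n_epochs=50, n_neg=5, gamma=7.0, lr=1.0, seed=0):
    """Lay out the graph with SGD and negative sampling.

    Assumed model: an edge (i, j) is observed with probability
    p = 1 / (1 + ||y_i - y_j||^2). Edges are pulled together by ascending
    log p; randomly drawn non-neighbors are pushed apart by ascending
    gamma * log(1 - p).
    """
    rng = np.random.default_rng(seed)
    n, k = neighbors.shape
    Y = rng.normal(scale=1e-2, size=(n, dim))
    edges = np.column_stack([np.repeat(np.arange(n), k), neighbors.ravel()])
    for epoch in range(n_epochs):
        step = lr * (1.0 - epoch / n_epochs)  # linearly decaying step size
        for i, j in edges[rng.permutation(len(edges))]:
            # Attractive update: gradient of log p with respect to y_i.
            diff = Y[i] - Y[j]
            g = -2.0 * diff / (1.0 + diff @ diff)
            Y[i] += step * g
            Y[j] -= step * g
            # Repulsive updates against uniformly sampled vertices.
            for m in rng.integers(0, n, size=n_neg):
                if m == i:
                    continue
                diff = Y[i] - Y[m]
                d2 = diff @ diff
                # Gradient of gamma * log(1 - p); the 0.1 damping and the
                # clipping are stability assumptions, not part of the model.
                g = 2.0 * gamma * diff / ((d2 + 0.1) * (1.0 + d2))
                Y[i] += step * np.clip(g, -5.0, 5.0)
    return Y
```

A call such as layout(knn_graph(X, 10)) returns one low-dimensional coordinate row per input point. Each epoch performs a fixed number of updates per edge, so the layout cost grows linearly with the number of edges, which is the property the description highlights; the brute-force neighbor search here does not share it.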
Homepage: https://arxiv.org/abs/1602.00370
Source Code: https://github.com/lferry007/LargeVis
Related Software: UMAP; largeVis; t-SNE; Scikit; MNIST; node2vec; DeepWalk; UCI-ml; GitHub; ForceAtlas2; COIL-100; Gephi; Numba; darch; word2vec; DeepView; openTSNE; TriMap; FIt-SNE; PTE
Cited in: 9 Documents
Cited by 29 Authors: 2 Huang, Hayang; 2 Rudin, Cynthia; 1 Chen, Chaofan; 1 Chen, Yuwei; 1 Chen, Zhi; 1 Corander, Jukka; 1 De Bie, Tijl; 1 Dyballa, Luciano; 1 García-García, Darío; 1 Gentner, Timothy Q.; 1 Jullum, Martin; 1 Kang, Bo; 1 Kaski, Samuel; 1 Kefato, Zekarias; 1 Lijffijt, Jefrey; 1 Løland, Anders; 1 McInnes, Leland; 1 Montresor, Alberto; 1 Sainburg, Tim; 1 Santos-Rodríguez, Raúl; 1 Sedov, Denis; 1 Semenova, Lesia; 1 Shaposhnik, Yaron; 1 Sheikh, Nasrullah; 1 Tjøstheim, Dag B.; 1 Wang, Yingfan; 1 Yang, Zhirong; 1 Zhong, Chudi; 1 Zucker, Steven Warren
Cited in 8 Serials: 2 Neural Computation; 1 Computing; 1 Statistical Science; 1 Machine Learning; 1 Cybernetics and Systems Analysis; 1 Journal of Machine Learning Research (JMLR); 1 Statistics Surveys; 1 Statistics and Computing
Cited in 2 Fields: 7 Computer science (68-XX); 3 Statistics (62-XX)