Wu, Hau-Tieng; Wu, Nan
Think globally, fit locally under the manifold setup: asymptotic analysis of locally linear embedding. (English) Zbl 1405.62058
Ann. Stat. 46, No. 6B, 3805-3837 (2018).

Summary: Since its introduction in 2000, locally linear embedding (LLE) has been widely applied in data science. We provide an asymptotic analysis of LLE under the manifold setup. We show that, for a general manifold, asymptotically we may not obtain the Laplace–Beltrami operator, and that the result may depend on the nonuniform sampling unless a correct regularization is chosen. We also derive the corresponding kernel function, which indicates that LLE is not a Markov process. A comparison with other commonly applied nonlinear algorithms, in particular the diffusion map, is provided, and the relationship of LLE with locally linear regression is also discussed.

An illustrative sketch of the standard LLE weight construction referred to here is given after the reference list.

Cited in 10 documents

MSC:
62H12 Estimation in multivariate analysis
58J99 Partial differential equations on manifolds; differential operators
62G05 Nonparametric estimation
62J05 Linear regression; mixed models
68T05 Learning and adaptive systems in artificial intelligence

Keywords: locally linear embedding; diffusion maps; dimension reduction; locally linear regression; measurement error

Software: t-SNE

Full Text: DOI arXiv Euclid

References:
[1] Belkin, M. and Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15 1373–1396. · Zbl 1085.68119 · doi:10.1162/089976603321780317
[2] Belkin, M. and Niyogi, P. (2005). Towards a theoretical foundation for Laplacian-based manifold methods. In Learning Theory. Lecture Notes in Computer Science 3559 486–500. Springer, Berlin. · Zbl 1137.68521
[3] Belkin, M. and Niyogi, P. (2007). Convergence of Laplacian eigenmaps. In Advances in Neural Information Processing Systems 19 (NIPS 2006) 129–136. MIT Press, Cambridge, MA.
[4] Bérard, P., Besson, G. and Gallot, S. (1994). Embedding Riemannian manifolds by their heat kernel. Geom. Funct. Anal. 4 373–398. · Zbl 0806.53044 · doi:10.1007/BF01896401
[5] Bérard, P. H. (1986). Spectral Geometry: Direct and Inverse Problems. Lecture Notes in Math. 1207. Springer, Berlin. · Zbl 0608.58001
[6] Cheeger, J., Gromov, M. and Taylor, M. (1982). Finite propagation speed, kernel estimates for functions of the Laplace operator, and the geometry of complete Riemannian manifolds. J. Differential Geom. 17 15–53. · Zbl 0493.53035 · doi:10.4310/jdg/1214436699
[7] Cheng, M.-Y. and Wu, H.-T. (2013). Local linear regression on manifolds and its geometric interpretation. J. Amer. Statist. Assoc. 108 1421–1434. · Zbl 1426.62402 · doi:10.1080/01621459.2013.827984
[8] Coifman, R. R. and Lafon, S. (2006). Diffusion maps. Appl. Comput. Harmon. Anal. 21 5–30. · Zbl 1095.68094 · doi:10.1016/j.acha.2006.04.006
[9] Devroye, L. P. and Wagner, T. J. (1977). The strong uniform consistency of nearest neighbor density estimates. Ann. Statist. 5 536–540. · Zbl 0367.62061 · doi:10.1214/aos/1176343851
[10] do Carmo, M. P. and Flaherty, F. (1992). Riemannian Geometry. Birkhäuser, Boston, MA. · Zbl 0752.53001
[11] Donoho, D. L., Gavish, M. and Johnstone, I. M. (2013). Optimal shrinkage of eigenvalues in the spiked covariance model. Available at arXiv:1311.0851. · Zbl 1403.62099 · doi:10.1214/17-AOS1601
[12] Donoho, D. L. and Grimes, C. (2003). Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proc. Natl. Acad. Sci. USA 100 5591–5596. · Zbl 1130.62337 · doi:10.1073/pnas.1031596100
[13] El Karoui, N. (2010). On information plus noise kernel random matrices. Ann. Statist. 38 3191–3216. · Zbl 1200.62056 · doi:10.1214/10-AOS801
[14] El Karoui, N. and Wu, H.-T. (2016). Graph connection Laplacian methods can be made robust to noise. Ann. Statist. 44 346–372. · Zbl 1350.60036 · doi:10.1214/14-AOS1275
[15] Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman & Hall/CRC, Boca Raton, FL. · Zbl 0873.62037
[16] Gao, T. (2016). The diffusion geometry of fibre bundles. Available at arXiv:1602.02330.
[17] Garcia Trillos, N. and Slepcev, D. (2018). A variational approach to the consistency of spectral clustering. Appl. Comput. Harmon. Anal. To appear. · Zbl 1396.49013 · doi:10.1016/j.acha.2016.09.003
[18] Giné, E. and Koltchinskii, V. (2006). Empirical graph Laplacian approximation of Laplace–Beltrami operators: Large sample results. In High Dimensional Probability. Institute of Mathematical Statistics Lecture Notes—Monograph Series 51 238–259. IMS, Beachwood, OH. · Zbl 1124.60030
[19] Hein, M., Audibert, J.-Y. and von Luxburg, U. (2005). From graphs to manifolds—Weak and strong pointwise consistency of graph Laplacians. In Learning Theory. Lecture Notes in Computer Science 3559 470–485. Springer, Berlin. · Zbl 1095.68097
[20] Johnstone, I. M. (2006). High dimensional statistical inference and random matrices. Available at arXiv:math/0611589v1.
[21] Moore, D. S. and Yackel, J. W. (1977). Consistency properties of nearest neighbor density function estimators. Ann. Statist. 5 143–154. · Zbl 0358.60053 · doi:10.1214/aos/1176343747
[22] Roweis, S. T. and Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science 290 2323–2326.
[23] Singer, A. (2006). From graph to manifold Laplacian: The convergence rate. Appl. Comput. Harmon. Anal. 21 128–134. · Zbl 1095.68102 · doi:10.1016/j.acha.2006.03.004
[24] Singer, A. and Wu, H.-T. (2012). Vector diffusion maps and the connection Laplacian. Comm. Pure Appl. Math. 65 1067–1144. · Zbl 1320.68146 · doi:10.1002/cpa.21395
[25] Singer, A. and Wu, H.-T. (2013). 2-D tomography from noisy projections taken at unknown random directions. SIAM J. Imaging Sci. 6 136–175. · Zbl 1279.68339 · doi:10.1137/090764657
[26] Singer, A. and Wu, H.-T. (2017). Spectral convergence of the connection Laplacian from random samples. Inf. Inference 6 58–123. · Zbl 1386.94055
[27] Smolyanov, O., Weizsacker, H. v. and Wittich, O. (2007). Chernoff’s theorem and discrete time approximations of Brownian motion on manifolds. Potential Anal. 26 1–29. · Zbl 1107.58014 · doi:10.1007/s11118-006-9019-z
[28] Stein, E. M. and Weiss, G. (2016). Introduction to Fourier Analysis on Euclidean Spaces. PMS 32. Princeton Univ. Press, Princeton, NJ. · Zbl 0232.42007
[29] Tenenbaum, J. B., de Silva, V. and Langford, J. C. (2000). A global geometric framework for nonlinear dimensionality reduction. Science 290 2319–2323.
[30] van der Maaten, L. and Hinton, G. (2008). Visualizing data using t-SNE. J. Mach. Learn. Res. 9 2579–2605. · Zbl 1225.68219
[31] von Luxburg, U., Belkin, M. and Bousquet, O. (2008). Consistency of spectral clustering. Ann. Statist. 36 555–586. · Zbl 1133.62045 · doi:10.1214/009053607000000640
[32] Wang, X. (2015). Spectral convergence rate of graph Laplacian. Available at arXiv:1510.08110.
[33] Weinberger, K. Q. and Saul, L. K. (2006). An introduction to nonlinear dimensionality reduction by maximum variance unfolding. In AAAI 1683–1686.
[34] Wu, H.-T. and Wu, N. (2018). Supplement to “Think globally, fit locally under the manifold setup: Asymptotic analysis of locally linear embedding.” DOI:10.1214/17-AOS1676SUPP.
[35] Zhang, Z. and Wang, J. (2006). MLLE: Modified locally linear embedding using multiple weights. In Advances in Neural Information Processing Systems 19 (NIPS 2006) 1593–1600. MIT Press, Cambridge, MA.
[36] Zhang, Z. and Zha, H. (2004). Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM J. Sci. Comput. 26 313–338. · Zbl 1077.65042 · doi:10.1137/S1064827502419154
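Illustrative sketch. The following is a minimal sketch, assuming Python with NumPy and SciPy, of the standard LLE barycentric-weight step of Roweis and Saul [22]. It is included only to make explicit the local regularization whose choice, according to the summary, determines the asymptotic behaviour of LLE; it is not the paper's analysis. The function name lle_weights, the parameter names n_neighbors and reg, and all default values are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the standard LLE weight construction (Roweis and Saul [22]).
# `reg` regularizes the local Gram matrix; the paper's point is that the
# asymptotic operator obtained from LLE depends on how this regularization
# is chosen relative to the neighborhood size.
import numpy as np
from scipy.spatial import cKDTree

def lle_weights(X, n_neighbors=10, reg=1e-3):
    """Return the n x n barycentric weight matrix W used by LLE."""
    n = X.shape[0]
    tree = cKDTree(X)
    W = np.zeros((n, n))
    for i in range(n):
        # Query k+1 neighbors because the point is returned as its own neighbor.
        _, idx = tree.query(X[i], k=n_neighbors + 1)
        nbrs = idx[1:]
        Z = X[nbrs] - X[i]                 # neighbors centered at x_i
        G = Z @ Z.T                        # local Gram matrix (k x k)
        trace = np.trace(G)
        # Regularize the (possibly singular) Gram matrix before solving.
        G = G + reg * (trace if trace > 0 else 1.0) * np.eye(n_neighbors)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, nbrs] = w / w.sum()           # affine (sum-to-one) weights
    return W
```

The embedding coordinates are then obtained from the bottom nontrivial eigenvectors of (I - W)^T (I - W); the summary's remark about choosing a "correct regularization" refers to how a quantity playing the role of reg scales in the asymptotic regime.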