Learning gradients on manifolds. (English) Zbl 1200.62070

Summary: A common belief in high-dimensional data analysis is that data are concentrated on a low-dimensional manifold. This motivates simultaneous dimension reduction and regression on manifolds. We provide an algorithm that learns gradients on manifolds and uses them for dimension reduction of high-dimensional data with few observations. We obtain generalization error bounds for the gradient estimates and show that the convergence rate depends on the intrinsic dimension of the manifold and not on the dimension of the ambient space. We illustrate the efficacy of this approach empirically on simulated and real data and compare the method to other dimension reduction procedures.
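The idea behind gradient-based dimension reduction can be sketched numerically: estimate the gradient of the regression function at each sample, then take the top eigenvectors of the empirical gradient outer product matrix as the dimension-reduction directions. The sketch below is illustrative only; it uses a simple local linear fit as a stand-in for the paper's kernel (RKHS) gradient estimator, and the function names (`local_linear_gradients`, `edr_directions`) are my own, not from the paper.

```python
import numpy as np

def local_linear_gradients(X, y, k=10):
    """Estimate the gradient of y = f(x) at each sample by a local
    linear regression over its k nearest neighbours.
    (A simple surrogate for the paper's RKHS gradient estimator.)"""
    n, p = X.shape
    grads = np.zeros((n, p))
    for i in range(n):
        dist = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(dist)[1:k + 1]      # skip the point itself
        A = X[nbrs] - X[i]                    # local coordinates
        b = y[nbrs] - y[i]
        # least-squares slope of the local plane = gradient estimate
        grads[i] = np.linalg.lstsq(A, b, rcond=None)[0]
    return grads

def edr_directions(grads, d=1):
    """Top-d eigenvectors of the empirical gradient outer product
    matrix span the (estimated) dimension-reduction subspace."""
    G = grads.T @ grads / len(grads)
    _, V = np.linalg.eigh(G)                  # eigenvalues ascending
    return V[:, ::-1][:, :d]                  # keep the top d
```

For a single-index model y = f(x·β), every gradient points along β, so the leading eigenvector of the gradient outer product recovers β up to sign.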


62H99 Multivariate analysis
62H30 Classification and discrimination; cluster analysis (statistical aspects)
53B99 Local differential geometry
65C60 Computational problems in statistics (MSC2010)




[1] Aronszajn, N. (1950). Theory of reproducing kernels. Trans. Amer. Math. Soc. 68 337-404. · Zbl 0037.20701
[2] Belkin, M. and Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation 15 1373-1396. · Zbl 1085.68119
[3] Bickel, P. and Li, B. (2007). Local polynomial regression on unknown manifolds. In Complex Datasets and Inverse Problems: Tomography, Networks and Beyond (R. Liu, W. Strawderman and C.-H. Zhang, eds.). IMS Lecture Notes-Monograph Series 54 177-186. Beachwood, OH: Inst. Math. Statist.
[4] Chen, S., Donoho, D. and Saunders, M. (1999). Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20 33-61. · Zbl 0919.94002
[5] Cook, R. and Li, B. (2002). Dimension reduction for conditional mean in regression. Ann. Statist. 30 455-474. · Zbl 1012.62035
[6] Cook, R. and Weisberg, S. (1991). Discussion of “Sliced inverse regression for dimension reduction”. J. Amer. Statist. Assoc. 86 328-332. · Zbl 0742.62044
[7] do Carmo, M.P. (1992). Riemannian Geometry . Boston, MA: Birkhäuser. · Zbl 0752.53001
[8] Donoho, D. and Grimes, C. (2003). Hessian eigenmaps: New locally linear embedding techniques for high-dimensional data. Proc. Natl. Acad. Sci. 100 5591-5596. · Zbl 1130.62337
[9] Giné, E. and Koltchinskii, V. (2005). Empirical graph Laplacian approximation of Laplace-Beltrami operators: Large sample results. In High Dimensional Probability IV (E. Giné, V. Koltchinskii, W. Li and J. Zinn, eds.). Beachwood, OH: Birkhäuser. · Zbl 1124.60030
[10] Golub, G. and Van Loan, C. (1983). Matrix Computations. Baltimore, MD: Johns Hopkins Univ. Press. · Zbl 0559.65011
[11] Golub, T., Slonim, D., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J., Coller, H., Loh, M., Downing, J., Caligiuri, M., Bloomfield, C. and Lander, E. (1999). Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science 286 531-537.
[12] Guyon, I., Weston, J., Barnhill, S. and Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Mach. Learn. 46 389-422. · Zbl 0998.68111
[13] Li, K. (1991). Sliced inverse regression for dimension reduction. J. Amer. Statist. Assoc. 86 316-342. · Zbl 0742.62044
[14] Liang, F., Mukherjee, S. and West, M. (2007). Understanding the use of unlabelled data in predictive modeling. Statist. Sci. 22 189-205. · Zbl 1246.62157
[15] Mukherjee, S. and Wu, Q. (2006). Estimation of gradients and coordinate covariation in classification. J. Mach. Learn. Res. 7 2481-2514. · Zbl 1222.62078
[16] Mukherjee, S. and Zhou, D. (2006). Learning coordinate covariances via gradients. J. Mach. Learn. Res. 7 519-549. · Zbl 1222.68270
[17] Wu, Q., Maggioni, M., Guinney, J. and Mukherjee, S. (2008). Learning gradients: Predictive models that infer geometry and dependence. Technical report. · Zbl 1242.62064
[18] Roweis, S. and Saul, L. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science 290 2323-2326.
[19] Singh, D., Febbo, P.G., Ross, K., Jackson, D.G., Manola, J., Ladd, C., Tamayo, P., Renshaw, A.A., D’Amico, A.V., Richie, J.P., Lander, E.S., Loda, M., Kantoff, P.W., Golub, T.R. and Sellers, W.R. (2002). Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 1 203-209.
[20] Tenenbaum, J., de Silva, V. and Langford, J. (2000). A global geometric framework for nonlinear dimensionality reduction. Science 290 2319-2323.
[21] Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. Roy. Statist. Soc. Ser. B 58 267-288. · Zbl 0850.62538
[22] West, M. (2003). Bayesian factor regression models in the “large p, small n” paradigm. In Bayesian Statistics 7 (J. Bernardo, M.J. Bayarri and A.P. Dawid, eds.) 723-732. New York: Oxford Univ. Press.
[23] Xia, Y., Tong, H., Li, W. and Zhu, L.-X. (2002). An adaptive estimation of dimension reduction space. J. Roy. Statist. Soc. Ser. B 64 363-410. · Zbl 1091.62028
[24] Ye, G. and Zhou, D. (2008). Learning and approximation by Gaussians on Riemannian manifolds. Adv. Comput. Math. 29 291-310. · Zbl 1156.68045