
A scale-based approach to finding effective dimensionality in manifold learning. (English) Zbl 1320.62115

Summary: The discovery of low-dimensional manifolds in high-dimensional data is one of the main goals of manifold learning. We propose a new approach to identifying the effective (intrinsic) dimension of low-dimensional manifolds. The scale-space viewpoint is the key to our approach, enabling us to meet the challenge of noisy data. Our approach finds the effective dimensionality of the data over all scales without any prior knowledge. It performs better than other methods, especially in the presence of relatively large noise, and is computationally efficient.
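To make the scale-dependent idea of effective dimensionality concrete, the sketch below estimates a local dimension over a range of neighborhood radii (scales) using plain local PCA. This is only a generic illustration of how an estimate can vary with scale, not the authors' scale-space method; the function names, the variance threshold, and the toy data are all assumptions.

```python
# Illustrative sketch only: a generic local-PCA estimate of effective dimension
# across scales, not the scale-space method of the paper.
import numpy as np

def local_pca_dimension(X, center, radius, var_threshold=0.95):
    """Number of principal components needed to explain var_threshold of the
    variance within the ball of the given radius around `center`."""
    nbrs = X[np.linalg.norm(X - center, axis=1) <= radius]
    if len(nbrs) < 2:
        return 0
    cov = np.cov(nbrs - nbrs.mean(axis=0), rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
    explained = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(explained, var_threshold) + 1)

def dimension_over_scales(X, radii, n_centers=50, seed=0):
    """Median local-PCA dimension at each scale (radius)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=min(n_centers, len(X)), replace=False)]
    return [int(np.median([local_pca_dimension(X, c, r) for c in centers]))
            for r in radii]

if __name__ == "__main__":
    # Toy example: a noisy circle (a 1-D manifold) embedded in 3-D.
    rng = np.random.default_rng(1)
    t = rng.uniform(0.0, 2.0 * np.pi, 1000)
    X = np.column_stack([np.cos(t), np.sin(t), np.zeros_like(t)])
    X += 0.05 * rng.standard_normal(X.shape)
    radii = np.linspace(0.1, 1.0, 10)
    print(list(zip(radii.round(2), dimension_over_scales(X, radii))))
```

At very small radii the noise dominates and the estimate tends toward the ambient dimension, while at moderate radii the underlying one-dimensional structure shows through; examining the estimate over all scales, rather than at a single bandwidth, is the point the summary emphasizes.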

MSC:

62H05 Characterization and structure theory for multivariate probability distributions; copulas
62-07 Data analysis (statistics) (MSC2010)

Software:

fda (R); SiZer
