Deconvolution for the Wasserstein metric and geometric inference. (English) Zbl 1274.62363

Summary: Recently, Chazal, Cohen-Steiner and Mérigot have defined a distance function to measures to answer geometric inference problems in a probabilistic setting. According to their result, the topological properties of a shape can be recovered by using the distance to a known measure \(\nu\), if \(\nu\) is close enough to a measure \(\mu\) concentrated on this shape. Here, close enough means that the Wasserstein distance \(W_{2}\) between \(\mu\) and \(\nu\) is sufficiently small. Given a point cloud, a natural candidate for \(\nu\) is the empirical measure \(\mu _{n}\). Nevertheless, in many situations the data points are not located on the geometric shape but in the neighborhood of it, and \(\mu _{n}\) can be too far from \(\mu\). In a deconvolution framework, we consider a slight modification of the classical kernel deconvolution estimator, and we give a consistency result and rates of convergence for this estimator. Some simulated experiments illustrate the deconvolution method and its application to geometric inference on various shapes and with various noise distributions.
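The distance to a measure mentioned in the summary has a simple closed form when the measure is an empirical one. The following is a minimal sketch, not the authors' code. It assumes the standard definition with a mass parameter \(m\in(0,1]\): for a sample of size \(n\) with \(k=mn\) an integer, the squared distance at \(x\) is the mean of the squared distances from \(x\) to its \(k\) nearest sample points. The function name `distance_to_measure` and the rounding of \(mn\) up to an integer are illustrative assumptions.

```python
import math


def distance_to_measure(x, points, m):
    """Empirical distance to measure at query point x.

    Assumption: m * len(points) is (close to) an integer k, so the value is
    the root mean squared distance from x to its k nearest sample points.
    For non-integer m * n, rounding up is only an approximation.
    """
    n = len(points)
    k = max(1, math.ceil(m * n))  # number of nearest neighbours carrying mass m
    # Squared Euclidean distances from x to every sample point, ascending.
    sq_dists = sorted(sum((a - b) ** 2 for a, b in zip(x, p)) for p in points)
    return math.sqrt(sum(sq_dists[:k]) / k)


# Three points near the origin plus one outlier.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
near = distance_to_measure((0.0, 0.0), sample, 0.5)    # averages over 2 neighbours
far = distance_to_measure((10.0, 10.0), sample, 0.5)   # the outlier itself plus 1 neighbour
```

In this toy sample the value at the outlier is much larger than the value near the cluster, which is the robustness property that makes the function usable on noisy point clouds. The deconvolution step discussed in the summary is not shown here.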

MSC:

62H12 Estimation in multivariate analysis
60B10 Convergence of probability measures
28A33 Spaces of measures, convergence of measures

References:

[1] Bergström, H. (1952). On some expansions of stable distribution functions. Ark. Mat. 2 375-378. · Zbl 0048.36001 · doi:10.1007/BF02591503
[2] Biau, G., Cadre, B. and Pelletier, B. (2008). Exact rates in density support estimation. J. Multivariate Anal. 99 2185-2207. · Zbl 1151.62027 · doi:10.1016/j.jmva.2008.02.021
[3] Butucea, C. and Matias, C. (2005). Minimax estimation of the noise level and of the deconvolution density in a semiparametric convolution model. Bernoulli 11 309-340. · Zbl 1063.62044 · doi:10.3150/bj/1116340297
[4] Carroll, R. J. and Hall, P. (1988). Optimal rates of convergence for deconvolving a density. J. Amer. Statist. Assoc. 83 1184-1186. · Zbl 0673.62033 · doi:10.2307/2290153
[5] Chazal, F., Cohen-Steiner, D. and Lieutier, A. (2009). A Sampling Theory for Compact Sets in Euclidean Spaces. Discrete Comput. Geom. 41 461-479. · Zbl 1165.68061 · doi:10.1007/s00454-009-9144-8
[6] Chazal, F., Cohen-Steiner, D. and Mérigot, Q. Geometric inference for probability measures. J. Foundations of Computational Mathematics. · Zbl 1230.62074 · doi:10.1007/s10208-011-9098-0
[7] Chazal, F. and Lieutier, A. (2008). Smooth Manifold Reconstruction from Noisy and Non Uniform Approximation with Guarantees. Comp. Geom: Theory and Applications 40 156-170. · Zbl 1153.65316 · doi:10.1016/j.comgeo.2007.07.001
[8] Comte, F. and Lacour, C. (2011). Data driven density estimation in presence of unknown convolution operator. J. Royal Stat. Soc., Ser B 73 601-627. · Zbl 1226.62034 · doi:10.1111/j.1467-9868.2011.00775.x
[9] Cuevas, A., Febrero, M. and Fraiman, R. (2000). Estimating the number of clusters. Canad. J. Statist. 28 367-382. · Zbl 0981.62054 · doi:10.2307/3315985
[10] Cuevas, A., Fraiman, R. and Rodríguez-Casal, A. (2007). A nonparametric approach to the estimation of lengths and surface areas. Ann. Statist. 35 1031-1051. · Zbl 1124.62017 · doi:10.1214/009053606000001532
[11] Cuevas, A. and Fraiman, R. (2010). Set estimation. In New perspectives in stochastic geometry 374-397. Oxford Univ. Press, Oxford. · Zbl 1192.62164
[12] Delaigle, A. and Gijbels, I. (2006). Estimation of boundary and discontinuity points in deconvolution problems. Statist. Sinica 16 773-788. · Zbl 1107.62029
[13] Delaigle, A. and Hall, P. (2006). On optimal kernel choice for deconvolution. Statist. Probab. Lett. 76 1594-1602. · Zbl 1099.62035 · doi:10.1016/j.spl.2006.04.016
[14] Devroye, L. (1989). Consistent deconvolution in density estimation. Canad. J. Statist. 17 235-239. · Zbl 0679.62029 · doi:10.2307/3314852
[15] Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2009). On the path density of a gradient field. Ann. Statist. 37 3236-3271. · Zbl 1191.62062 · doi:10.1214/08-AOS671
[16] Genovese, C. R., Perone-Pacifico, M., Verdinelli, I. and Wasserman, L. (2010). The Geometry of Nonparametric Filament Estimation. · Zbl 1261.62030
[17] Hall, P. and Simar, L. (2002). Estimating a changepoint, boundary, or frontier in the presence of observation error. J. Amer. Statist. Assoc. 97 523-534. · Zbl 1073.62521 · doi:10.1198/016214502760047050
[18] Hartigan, J. A. (1975). Clustering algorithms. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons, New York-London-Sydney. · Zbl 0372.62040
[19] Hastie, T. and Stuetzle, W. (1989). Principal curves. J. Amer. Statist. Assoc. 84 502-516. · Zbl 0679.62048 · doi:10.2307/2289936
[20] Horowitz, J. and Karandikar, R. L. (1994). Mean rates of convergence of empirical measures in the Wasserstein metric. J. Comput. Appl. Math. 55 261-273. · Zbl 0819.60031 · doi:10.1016/0377-0427(94)90033-7
[21] Koldobsky, A. (2005). Fourier analysis in convex geometry. Mathematical Surveys and Monographs 116. American Mathematical Society, Providence, RI. · Zbl 1082.52002
[22] Koltchinskii, V. I. (2000). Empirical geometry of multivariate data: a deconvolution approach. Ann. Statist. 28 591-629. · Zbl 1105.62345 · doi:10.1214/aos/1016218232
[23] Li, T. and Vuong, Q. (1998). Nonparametric estimation of the measurement error model using multiple indicators. J. Multivariate Anal. 65 139-165. · Zbl 1127.62323 · doi:10.1006/jmva.1998.1741
[24] Meister, A. (2004). On the effect of misspecifying the error density in a deconvolution problem. Canad. J. Statist. 32 439-449. · Zbl 1059.62034 · doi:10.2307/3316026
[25] Meister, A. (2006a). Support estimation via moment estimation in presence of noise. Statistics 40 259-275. · Zbl 1098.62038 · doi:10.1080/02331880600723101
[26] Meister, A. (2006b). Estimating the support of multivariate densities under measurement error. J. Multivariate Anal. 97 1702-1717. · Zbl 1099.62051 · doi:10.1016/j.jmva.2005.04.004
[27] Meister, A. (2007). Deconvolving compactly supported densities. Math. Methods Statist. 16 63-76. · Zbl 1283.62078 · doi:10.3103/S106653070701005X
[28] Meister, A. (2009). Deconvolution problems in nonparametric statistics. Lecture Notes in Statistics 193. Springer-Verlag. · Zbl 1178.62028 · doi:10.1007/978-3-540-87557-4
[29] Neumann, M. H. (1997). On the effect of estimating the error density in nonparametric deconvolution. J. Nonparametr. Statist. 7 307-330. · Zbl 1003.62514 · doi:10.1080/10485259708832708
[30] Niyogi, P., Smale, S. and Weinberger, S. (2011). A Topological View of Unsupervised Learning from Noisy Data. SIAM Journal on Computing 40 646-663. · Zbl 1230.62085 · doi:10.1137/090762932
[31] Petrunin, A. (2007). Semiconcave functions in Alexandrov’s geometry. In Surveys in differential geometry. Vol. XI 137-201. Int. Press, Somerville, MA. · Zbl 1166.53001
[32] Rachev, S. T. (1991). Probability metrics and the stability of stochastic models. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons Ltd., Chichester. · Zbl 0744.60004
[33] Rachev, S. T. and Rüschendorf, L. (1998). Mass transportation problems. Vol. II. Probability and its Applications. Springer-Verlag. · Zbl 0990.60500
[34] Schwarz, M. and Van Bellegem, S. (2010). Consistent density deconvolution under partially known error distribution. Statist. Probab. Lett. 80 236-241. · Zbl 1180.62052 · doi:10.1016/j.spl.2009.10.012
[35] Stefanski, L. and Carroll, R. J. (1990). Deconvoluting kernel density estimators. Statistics 21 169-184. · Zbl 0697.62035 · doi:10.1080/02331889008802238
[36] Villani, C. (2008). Optimal Transport: Old and New. Grundlehren Der Mathematischen Wissenschaften. Springer-Verlag. · Zbl 1156.53003
[37] Zolotarev, V. M. (1978). Pseudomoments. Teor. Verojatnost. i Primenen. 23 284-294. · Zbl 0421.60006