×

Rate of convergence for geometric inference based on the empirical Christoffel function. (English) Zbl 1490.62095

Summary: We consider the problem of estimating the support of a measure from a finite, independent, sample. The estimators which are considered are constructed based on the empirical Christoffel function. Such estimators have been proposed for the problem of set estimation with heuristic justifications. We carry out a detailed finite sample analysis, that allows us to select the threshold and degree parameters as a function of the sample size. We provide a convergence rate analysis of the resulting support estimation procedure. Our analysis establishes that we may obtain finite sample bounds which are comparable to existing rates for different set estimation procedures. Our results rely on concentration inequalities for the empirical Christoffel function and on estimates of the supremum of the Christoffel-Darboux kernel on sets with smooth boundaries, that can be considered of independent interest.

MSC:

62G07 Density estimation
42C05 Orthogonal functions and polynomials, general theory of nontrigonometric harmonic analysis
62G20 Asymptotic properties of nonparametric inference

Software:

UCI-ml
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] C. Aaron and O. Bodart, Local convex hull support and boundary estimation. J. Multivar. Anal. 147 (2016) 82-101. · Zbl 1334.62032
[2] C.C. Aggarwal and S. Sathe, Theoretical foundations and algorithms for outlier ensembles. ACM Sigkdd Explor. Newsl. 17 (2015) 24-47.
[3] N. Aronszajn, Theory of reproducing kernels. Trans. Am. Math. Soc. 68 (1950) 337-404. · Zbl 0037.20701
[4] M. Berger and B. Gostiaux, Differential Geometry: Manifolds, Curves, and Surfaces: Manifolds, Curves, and Surfaces, Vol. 115, Springer Science & Business Media (2012). · Zbl 0629.53001
[5] A. Berlinet and C. Thomas-Agnan, Reproducing kernel Hilbert spaces in probability and statistics. Springer Science & Business Media (2011). · Zbl 1145.62002
[6] R. Berman, S. Boucksom and D.W. Nyström, Fekete points and convergence towards equilibrium measures on complex manifolds. Acta Math. 207 (2011) 1-27. · Zbl 1241.32030
[7] G. Biau, B. Cadre and B. Pelletier, Exact rates in density support estimation. J. Multivar. Anal. 99 (2008) 2185-2207. · Zbl 1151.62027
[8] L. Bos, Asymptotics for the Christoffel function for Jacobi like weights on a ball in ℝ^m. New Zealand J. Math. 23 (1994) 109. · Zbl 0828.42014
[9] L. Bos, B. Della Vecchia and G. Mastroianni, On the asymptotics of Christoffel functions for centrally symmetric weight functions on the ball in ℝ^d. Rend. Cir. Mat. 52 (1998) 277-290. · Zbl 0910.33008
[10] J. Chevalier, Estimation du support et du contour du support d’une loi de probabilité. Ann. Inst. Henri Poincaré Stat. 12 (1976) 339-364. · Zbl 0372.62036
[11] A. Cholaquidis, A. Cuevas and R. Fraiman, On Poincaré cone property. Ann. Stat. 42 (2014) 255-284. · Zbl 1327.62201
[12] A. Cuevas and R. Fraiman, A plug-in approach to support estimation. Ann. Stat. 25 (1997) 2300-2312. · Zbl 0897.62034
[13] A. Cuevas, R. Fraiman and B. Pateiro-López, On statistical properties of sets fulfilling rolling-type conditions. Adv. Appl. Prob. 44 (2012) 311-329. · Zbl 1252.47089
[14] A. Cuevas, W. González-Manteiga and A. Rodríguez-Casal, Plug-in estimation of general level sets. Aust. New Zealand J. Stat. 48 (2006) 7-19. · Zbl 1108.62036
[15] A. Cuevas and A. Rodríguez-Casal, On boundary estimation. Adv. Appl. Prob. 36 (2004) 340-354. · Zbl 1045.62019
[16] L. Devroye and G.L. Wise, Detection of abnormal behavior via nonparametric estimation of the support. SIAM J. Appl. Math. 38 (1980) 480-488. · Zbl 0479.62028
[17] D. Dua and C. Graff, UCI Machine Learning Repository (2017).
[18] C.F. Dunkl and Y. Xu, Vol. 155 of Orthogonal polynomials of several variables. Cambridge University Press (2014). · Zbl 1317.33001
[19] H. Edelsbrunner, D. Kirkpatrick and R. Seidel, On the shape of a set of points in the plane. IEEE Trans. Inf. Theory 29 (1983) 551-559. · Zbl 0512.52001
[20] J. Geffroy, Sur un probleme d’estimation géométrique. Publ. Inst. Statist. Univ. Paris 13 (1964) 191-210. · Zbl 0129.32301
[21] F. Keller, E. Muller and K. Bohm, HiCS: High contrast subspaces for density-based outlier ranking, in 2012 IEEE 28th international conference on data engineering, IEEE (2012) 1037-1048.
[22] A. Kroó and D. Lubinsky, Christoffel functions and universality in the bulk for multivariate orthogonal polynomials. Can. J. Math. 65 (2013) 600-620. · Zbl 1267.42029
[23] A. Kroó and D. Lubinsky, Christoffel functions and universality on the boundary of the ball. Acta Math. Hung. 140 (2013) 117-133. · Zbl 1340.42064
[24] J.B. Lasserre and E. Pauwels, The empirical Christoffel function with applications in data analysis. Adv. Comput. Math. 45 (2019) 1439-1468. · Zbl 1425.62079
[25] E. Mammen and A.B. Tsybakov, Asymptotical minimax recovery of sets with smooth boundaries. Ann. Stat. 23 (1995) 502-524. · Zbl 0834.62038
[26] S. Marx, E. Pauwels, T. Weisser, D. Henrion and J. Lasserre, Tractable semi-algebraic approximation using Christoffel-Darboux kernel. Preprint arXiv:1904.01833 (2019). · Zbl 1497.41016
[27] I.S. Molchanov, A limit theorem for solutions of inequalities. Scand. J. Stat. 25 (1998) 235-242. · Zbl 0915.60040
[28] T. Patschkowski and A. Rohde, Adaptation to lowest density regions with application to support recovery. Ann. Stat. 44 (2016) 255-287. · Zbl 1331.62222
[29] E. Pauwels and J.B. Lasserre, Sorting out typicality with the inverse moment matrix SOS polynomial, in Advances in Neural Information Processing Systems (2016) 190-198.
[30] E. Pauwels, M. Putinar and J.-B. Lasserre, Data analysis from empirical moments and the Christoffel function. Found. Comput. Math. 21 (2021) 243-273. · Zbl 1461.62234
[31] F. Piazzon, Bernstein Markov properties and applications, Ph.D. thesis, Dipartimento di Matematica, Università degli Studi di Padova (2016).
[32] W. Polonik, Measuring mass concentrations and estimating density contour clusters-an excess mass approach. Ann. Stat. 23 (1995) 855-881. · Zbl 0841.62045
[33] A. Rényi and R. Sulanke, Über die konvexe Hülle von n zufällig gewählten Punkten. Prob. Theory Related Fields 2 (1963) 75-84. · Zbl 0118.13701
[34] P. Rigollet and R. Vert, Optimal rates for plug-in estimators of density level sets. Bernoulli 15 (2009) 1154-1178. · Zbl 1200.62034
[35] A. Rodríguez Casal Set estimation under convexity type assumptions. Ann. Inst. Henri Poincaré, Prob. Stat. 43 (2007) 763-774. · Zbl 1169.62317
[36] H.L. Royden and P. Fitzpatrick, Real analysis, vol. 32. Macmillan New York (1988). · Zbl 1191.26002
[37] A. Singh, C. Scott and R. Nowak, Adaptive Hausdorff estimation of density level sets. Ann. Stat. 37 (2009) 2760-2782. · Zbl 1173.62019
[38] G. Szegö, Vol. 23 of Orthogonal polynomials. American Mathematical Soc. (1939). · JFM 65.0278.03
[39] V. Totik, Asymptotics for Christoffel functions for general measures on the real line. J. d’Anal. Math. 81 (2000) 283-303. · Zbl 0966.42017
[40] A.B. Tsybakov, On nonparametric estimation of density level sets. Ann. Stat. 25 (1997) 948-969. · Zbl 0881.62039
[41] R. Vershynin, Introduction to the non-asymptotic analysis of random matrices. Preprint arXiv:1011.3027 (2010).
[42] G. Walther, Granulometric smoothing. Ann. Stat. 25 (1997) 2273-2299. · Zbl 0919.62026
[43] G. Walther, On a generalization of Blaschke’s rolling theorem and the smoothing of surfaces. Math. Methods Appl. Sci. 22 (1999) 301-316. · Zbl 0933.52003
[44] Y. Xu, Asymptotics of the Christoffel functions on a simplex in ℝ^d. J. Approx. Theory 99 (1999) 122-133. · Zbl 0943.42012
[45] Y. Xu, Summability of Fourier orthogonal series for Jacobi weight on a ball in ℝ^d. Trans. Am. Math. Soc. 351 (1999) 2439-2458. · Zbl 0929.42016
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.