Fast adaptive estimation of log-additive exponential models in Kullback-Leibler divergence. (English) Zbl 1473.62117

Summary: We study the problem of nonparametric estimation of probability density functions (pdfs) with a product form on the domain \(\triangle =\{(x_{1},\ldots,x_{d})\in{\mathbb{R}}^{d},\ 0\leq x_{1}\leq \cdots\leq x_{d}\leq 1\}\). Such pdfs appear in the random truncation model as the joint pdf of the observations. They are also obtained as maximum entropy distributions of order statistics with given marginals. We propose an estimation method based on the approximation of the logarithm of the density by a carefully chosen family of basis functions. We show that the method achieves a fast convergence rate in probability with respect to the Kullback-Leibler divergence for pdfs whose logarithms belong to a Sobolev function class with known regularity. In the case when the regularity is unknown, we propose an estimation procedure using convex aggregation of the log-densities to obtain adaptability. The performance of this method is illustrated in a simulation study.
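
To make the idea of convex aggregation of log-densities concrete, here is a minimal numerical sketch for \(d=2\). It is an illustration only, not the procedure from the paper: the grid discretization, the function names (`triangle_grid`, `normalize_log_density`, `aggregate_log_densities`, `kl_divergence`), the two candidate log-densities and the fixed weights are all assumptions made for this example, and the summary does not describe how the aggregation weights are selected. The sketch forms a convex combination of candidate log-densities on \(\triangle\), renormalizes it, and evaluates a Kullback-Leibler divergence by a Riemann sum.

```python
import numpy as np


def triangle_grid(n=200):
    """Midpoint grid on [0,1]^2 restricted to the triangle 0 <= x1 <= x2 <= 1.

    Returns the coordinates of the retained cell centers and the cell area.
    Cells on the diagonal are kept whole, which is a crude approximation.
    """
    t = (np.arange(n) + 0.5) / n
    x1, x2 = np.meshgrid(t, t, indexing="ij")
    mask = x1 <= x2
    cell = 1.0 / n**2
    return x1[mask], x2[mask], cell


def normalize_log_density(log_g, cell):
    """Shift log_g so that exp(log_g) sums to 1 against the grid measure."""
    m = log_g.max()  # subtract the maximum for numerical stability
    log_z = m + np.log(np.sum(np.exp(log_g - m)) * cell)
    return log_g - log_z


def aggregate_log_densities(log_fs, weights, cell):
    """Convex combination of log-densities, renormalized to a density."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    mix = np.tensordot(weights, np.asarray(log_fs), axes=1)
    return normalize_log_density(mix, cell)


def kl_divergence(log_f, log_g, cell):
    """Riemann-sum approximation of the integral of f * log(f / g)."""
    return float(np.sum(np.exp(log_f) * (log_f - log_g)) * cell)


# Two hypothetical candidates whose logarithms are additive in (x1, x2).
x1, x2, cell = triangle_grid(50)
log_f1 = normalize_log_density(2.0 * x1 + 1.0 * x2, cell)
log_f2 = normalize_log_density(-1.0 * x1 + 3.0 * x2, cell)

# Fixed illustrative weights; a data-driven choice is not shown here.
log_agg = aggregate_log_densities([log_f1, log_f2], [0.3, 0.7], cell)
print(kl_divergence(log_f1, log_agg, cell))
```

Because a convex combination of additive functions is again additive, the aggregate in this sketch keeps the product form of the candidates, which is one reason to aggregate on the log scale rather than mixing the densities directly.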

MSC:

62G07 Density estimation
62G05 Nonparametric estimation
62G20 Asymptotic properties of nonparametric inference
