zbMATH — the first resource for mathematics

Mixing strategies for density estimation. (English) Zbl 1106.62322
Summary: General results on adaptive density estimation are obtained with respect to any countable collection of estimation strategies under Kullback-Leibler and squared \(L_2\) losses. It is shown that without knowing which strategy works best for the underlying density, a single strategy can be constructed by mixing the proposed ones to be adaptive in terms of statistical risks. A consequence is that under some mild conditions, an asymptotically minimax-rate adaptive estimator exists for a given countable collection of density classes; that is, a single estimator can be constructed to be simultaneously minimax-rate optimal for all the function classes being considered. A demonstration is given for high-dimensional density estimation on \([0,1]^d\) where the constructed estimator adapts to smoothness and interaction-order over some piecewise Besov classes and is consistent for all the densities with finite entropy.
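
The idea of mixing strategies can be illustrated with a minimal sketch. It is an assumption-laden illustration, not necessarily the paper's exact construction: the candidate strategies are taken to be add-alpha smoothed histograms on [0, 1] with different bin counts, the prior over strategies is uniform, and the mixture is a progressive one, in which each strategy is weighted by its cumulative predictive likelihood and the resulting predictive mixtures are averaged over sample sizes. The names `histogram_predictive` and `mixed_density` are hypothetical.

```python
import math
import random


def histogram_predictive(data, bins, x, alpha=1.0):
    """Add-alpha smoothed histogram density on [0, 1], evaluated at x.

    Smoothing keeps the density strictly positive, so log-likelihoods
    (and hence Kullback-Leibler comparisons) stay finite.
    """
    counts = [0] * bins
    for v in data:
        counts[min(int(v * bins), bins - 1)] += 1
    k = min(int(x * bins), bins - 1)
    # Smoothed bin probability divided by the bin width 1 / bins.
    return bins * (counts[k] + alpha) / (len(data) + alpha * bins)


def mixed_density(sample, bin_choices, prior=None):
    """Mix the candidate strategies without knowing which one is best.

    Returns (estimate, final_weights), where estimate(x) averages the
    weighted predictive mixtures built from the first i observations,
    i = 0, ..., n, and final_weights are the weights after all n points.
    """
    m = len(bin_choices)
    prior = prior or [1.0 / m] * m  # assumed uniform prior over strategies
    log_w = [math.log(p) for p in prior]
    weight_history = []
    for i in range(len(sample) + 1):
        # Normalize the weights in log space to avoid underflow.
        top = max(log_w)
        w = [math.exp(l - top) for l in log_w]
        total = sum(w)
        weight_history.append([v / total for v in w])
        if i < len(sample):
            # Reward each strategy by how well it predicted the next point.
            for j, b in enumerate(bin_choices):
                log_w[j] += math.log(
                    histogram_predictive(sample[:i], b, sample[i])
                )

    def estimate(x):
        acc = 0.0
        for i, w in enumerate(weight_history):
            acc += sum(
                wj * histogram_predictive(sample[:i], b, x)
                for wj, b in zip(w, bin_choices)
            )
        return acc / len(weight_history)

    return estimate, weight_history[-1]


if __name__ == "__main__":
    random.seed(0)
    demo = [random.betavariate(2, 5) for _ in range(100)]
    density, weights = mixed_density(demo, [1, 2, 4, 8])
    print("final weights:", weights)
    print("estimate at 0.25:", density(0.25))
```

Because every candidate is itself a density and the weights are nonnegative and sum to one, the mixed estimate is again a density. The weights concentrate on strategies that predicted the data well, which is the mechanism behind adaptivity in this sketch; the quadratic cost in the sample size is accepted here for clarity.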

62G07 Density estimation
62G20 Asymptotic properties of nonparametric inference
62C20 Minimax procedures in statistical decision theory
62B10 Statistical aspects of information-theoretic topics