zbMATH — the first resource for mathematics

Information-theoretic determination of minimax rates of convergence. (English) Zbl 0978.62008
From the introduction: The metric entropy structure of a density class determines the minimax rate of convergence of density estimators. Here we prove such results using new direct metric entropy bounds on the mutual information that arises when Fano’s information inequality is applied to develop lower bounds characterizing the optimal rate. No special construction is required for each density class. We study global measures of loss such as integrated squared error, squared Hellinger distance, or Kullback-Leibler (K-L) divergence in nonparametric curve estimation problems.
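To indicate how Fano's inequality enters such lower bounds, here is the generic reduction in sketch form (textbook-style notation, not the authors' exact statement; the symbols are illustrative):

```latex
% Sketch of the standard Fano reduction for minimax lower bounds
% (illustrative notation, not the paper's exact statement).
% Let T be uniform on a 2\varepsilon-separated subset \{f_1,\dots,f_N\}
% of the density class (N determined by its metric entropy), and let
% X denote the data. Fano's inequality gives, for any estimator \hat T,
\[
  \Pr\bigl(\hat T \neq T\bigr) \;\ge\; 1 - \frac{I(T;X) + \log 2}{\log N},
\]
% so that for any density estimator \hat f and loss d,
\[
  \inf_{\hat f}\, \sup_{f}\; \mathbb{E}\, d^2(\hat f, f)
  \;\ge\; \varepsilon^2 \left(1 - \frac{I(T;X) + \log 2}{\log N}\right).
\]
% The paper's device is to bound the mutual information I(T;X)
% directly via metric entropy, avoiding a special construction
% for each density class.
```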

MSC:
62B10 Statistical information theory
62G07 Density estimation
62C20 Statistical minimax procedures
94A17 Measures of information, entropy
References:
[1] Ball, K. and Pajor, A. (1990). The entropy of convex bodies with "few" extreme points. In Geometry of Banach Spaces 26-32. Cambridge Univ. Press. · Zbl 0746.60005
[2] Barron, A. R. (1987). Are Bayes rules consistent in information? In Open Problems in Communication and Computation (T. M. Cover and B. Gopinath, eds.) 85-91. Springer, New York.
[3] Barron, A. R. (1991). Neural net approximation. In Proceedings of the Yale Workshop on Adaptive Learning Systems (K. Narendra, ed.) Yale University. · Zbl 0739.62001
[4] Barron, A. R. (1993). Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans. Inform. Theory 39 930-945. · Zbl 0818.68126 · doi:10.1109/18.256500
[5] Barron, A. R. (1994). Approximation and estimation bounds for artificial neural networks. Machine Learning 14 115-133. · Zbl 0818.68127
[6] Barron, A. R., Birgé, L. and Massart, P. (1999). Risk bounds for model selection via penalization. Probab. Theory Related Fields 113 301-413. · Zbl 0946.62036 · doi:10.1007/s004400050210
[7] Barron, A. R. and Cover, T. M. (1991). Minimum complexity density estimation. IEEE Trans. Inform. Theory 37 1034-1054. · Zbl 0743.62003 · doi:10.1109/18.86996
[8] Barron, A. R. and Hengartner, N. (1998). Information theory and superefficiency. Ann. Statist. 26 1800-1825. · Zbl 0932.62005 · doi:10.1214/aos/1024691358
[9] Barron, A. R. and Sheu, C.-H. (1991). Approximation of density functions by sequences of exponential families. Ann. Statist. 19 1347-1369. · Zbl 0739.62027 · doi:10.1214/aos/1176348252
[10] Bickel, P. J. and Ritov, Y. (1988). Estimating integrated squared density derivatives: sharp best order of convergence estimates. Sankhyā Ser. A 50 381-393. · Zbl 0676.62037
[11] Birgé, L. (1983). Approximation dans les espaces métriques et théorie de l’estimation. Z. Wahrsch. Verw. Gebiete 65 181-237. · Zbl 0506.62026 · doi:10.1007/BF00532480
[12] Birgé, L. (1986). On estimating a density using Hellinger distance and some other strange facts. Probab. Theory Related Fields 71 271-291. · Zbl 0561.62029 · doi:10.1007/BF00332312
[13] Birgé, L. and Massart, P. (1993). Rates of convergence for minimum contrast estimators. Probab. Theory Related Fields 97 113-150. · Zbl 0805.62037 · doi:10.1007/BF01199316
[14] Birgé, L. and Massart, P. (1994). Minimum contrast estimators on sieves. Technical report, Univ. Paris-Sud. · Zbl 0954.62033
[15] Birgé, L. and Massart, P. (1995). Estimation of integral functionals of a density. Ann. Statist. 23 11-29. · Zbl 0848.62022 · doi:10.1214/aos/1176324452
[16] Birgé, L. and Massart, P. (1996). From model selection to adaptive estimation. In Research Papers in Probability and Statistics: Festschrift in Honor of Lucien Le Cam (D. Pollard, E. Torgersen and G. Yang, eds.) 55-87. Springer, New York. · Zbl 0920.62042
[17] Birman, M. S. and Solomjak, M. (1974). Quantitative analysis in Sobolev embedding theorems and application to spectral theory. Tenth Math. School Kiev 5-189. · Zbl 0426.46019
[18] Bretagnolle, J. and Huber, C. (1979). Estimation des densités: risque minimax. Z. Wahrsch. Verw. Gebiete 47 119-137. · Zbl 0413.62024 · doi:10.1007/BF00535278
[19] Carl, B. (1981). Entropy numbers of embedding maps between Besov spaces with an application to eigenvalue problems. Proc. Roy. Soc. Edinburgh 90A 63-70. · Zbl 0508.47041 · doi:10.1017/S0308210500015341
[20] Cencov, N. N. (1972). Statistical Decision Rules and Optimal Inference. Nauka, Moscow; English translation in Amer. Math. Soc. Transl. 53 (1982).
[22] Clarke, B. and Barron, A. R. (1990). Information-theoretic asymptotics of Bayes methods. IEEE Trans. Inform. Theory 36 453-471. · Zbl 0709.62008 · doi:10.1109/18.54897
[23] Clarke, B. and Barron, A. R. (1994). Jeffrey’s prior is asymptotically least favorable under entropy risk. J. Statist. Plann. Inference 41 37-60. · Zbl 0820.62006 · doi:10.1016/0378-3758(94)90153-8
[24] Cover, T. M. and Thomas, J. A. (1991). Elements of Information Theory. Wiley, New York. · Zbl 0762.94001
[25] Cox, D. D. (1988). Approximation of least squares regression on nested subspaces. Ann. Statist. 16 713-732. · Zbl 0669.62047 · doi:10.1214/aos/1176350830
[26] Davisson, L. (1973). Universal noiseless coding. IEEE Trans. Inform. Theory 19 783-795. · Zbl 0283.94005 · doi:10.1109/TIT.1973.1055092
[27] Davisson, L. and Leon-Garcia, A. (1980). A source matching approach to finding minimax codes. IEEE Trans. Inform. Theory 26 166-174. · Zbl 0431.94026 · doi:10.1109/TIT.1980.1056167
[28] DeVore, R. A. and Lorentz, G. G. (1993). Constructive Approximation. Springer, New York. · Zbl 0797.41016
[29] Devroye, L. (1987). A Course in Density Estimation. Birkhäuser, Boston. · Zbl 0617.62043
[30] Donoho, D. L. (1993). Unconditional bases are optimal bases for data compression and for statistical estimation. Appl. Comput. Harmon. Anal. 1 100-115. · Zbl 0796.62083 · doi:10.1006/acha.1993.1008
[31] Donoho, D. L. (1996). Unconditional bases and bit-level compression. Technical report 498, Dept. Statistics, Stanford Univ. · Zbl 0936.62004 · doi:10.1006/acha.1996.0032
[32] Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. and Picard, D. (1996). Density estimation by wavelet thresholding. Ann. Statist. 24 508-539. · Zbl 0860.62032 · doi:10.1214/aos/1032894451
[33] Donoho, D. L. and Liu, R. C. (1991). Geometrizing rates of convergence II. Ann. Statist. 19 633-667. · Zbl 0754.62028 · doi:10.1214/aos/1176348114
[34] Dudley, R. M. (1987). Universal Donsker classes and metric entropy. Ann. Probab. 15 1306-1326. · Zbl 0631.60004 · doi:10.1214/aop/1176991978
[35] Edmunds, D. E. and Triebel, H. (1987). Entropy numbers and approximation numbers in function spaces. Proc. London Math. Soc. 58 137-152. · Zbl 0629.46034 · doi:10.1112/plms/s3-58.1.137
[36] Efroimovich, S. Yu. and Pinsker, M. S. (1982). Estimation of square-integrable probability density of a random variable. Problemy Peredachi Informatsii 18 19-38. · Zbl 0514.62045
[37] Fano, R. M. (1961). Transmission of Information: A Statistical Theory of Communication. MIT Press. · Zbl 0151.24402
[38] Farrell, R. (1972). On the best obtainable asymptotic rates of convergence in estimation of a density function at a point. Ann. Math. Statist. 43 170-180. · Zbl 0238.62049 · doi:10.1214/aoms/1177692711
[39] Hasminskii, R. Z. (1978). A lower bound on the risks of nonparametric estimates of densities in the uniform metric. Theory Probab. Appl. 23 794-796. · Zbl 0449.62032 · doi:10.1137/1123095
[40] Hasminskii, R. Z. and Ibragimov, I. A. (1990). On density estimation in the view of Kolmogorov’s ideas in approximation theory. Ann. Statist. 18 999-1010. · Zbl 0705.62039 · doi:10.1214/aos/1176347736
[41] Haussler, D. (1997). A general minimax result for relative entropy. IEEE Trans. Inform. Theory 40 1276-1280. · Zbl 0878.94038 · doi:10.1109/18.605594
[42] Haussler, D. and Opper, M. (1997). Mutual information, metric entropy and cumulative relative entropy risk. Ann. Statist. 25 2451-2492. · Zbl 0920.62007 · doi:10.1214/aos/1030741081
[43] Ibragimov, I. A. and Hasminskii, R. Z. (1977). On the estimation of an infinite-dimensional parameter in Gaussian white noise. Soviet Math. Dokl. 18 1307-1309. · Zbl 0389.62023
[44] Ibragimov, I. A. and Hasminskii, R. Z. (1978). On the capacity in communication by smooth signals. Soviet Math. Dokl. 19 1043-1047. · Zbl 0432.94009
[45] Jones, L. K. (1992). A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training. Ann. Statist. 20 608-613. · Zbl 0746.62060 · doi:10.1214/aos/1176348546
[46] Kolmogorov, A. N. and Tihomirov, V. M. (1959). ε-entropy and ε-capacity of sets in function spaces. Uspekhi Mat. Nauk 14 3-86. · Zbl 0090.33503
[47] Koo, J. Y. and Kim, W. C. (1996). Wavelet density estimation by approximation of log-densities. Statist. Probab. Lett. 26 271-278. · Zbl 0843.62040 · doi:10.1016/0167-7152(95)00020-8
[48] Le Cam, L. M. (1973). Convergence of estimates under dimensionality restrictions. Ann. Statist. 1, 38-53. · Zbl 0255.62006 · doi:10.1214/aos/1193342380
[49] Le Cam, L. M. (1986). Asymptotic Methods in Statistical Decision Theory. Springer, New York. · Zbl 0605.62002
[50] Lorentz, G. G. (1966). Metric entropy and approximation. Bull. Amer. Math. Soc. 72 903-937. · Zbl 0158.13603 · doi:10.1090/S0002-9904-1966-11586-0
[51] Lorentz, G. G., Golitschek, M. v. and Makovoz, Y. (1996). Constructive Approximation: Advanced Problems. Springer, New York. · Zbl 0910.41001
[52] Makovoz, Y. (1996). Random approximants and neural networks. J. Approx. Theory 85 98-109. · Zbl 0857.41024 · doi:10.1006/jath.1996.0031
[53] Mitjagin, B. S. (1961). The approximation dimension and bases in nuclear spaces. Uspekhi Mat. Nauk 16 63-132. · Zbl 0104.08601 · doi:10.1070/rm1961v016n04ABEH004109
[54] Nemirovskii, A. (1985). Nonparametric estimation of smooth regression functions. J. Comput. System. Sci. 23 1-11. · Zbl 0604.62033
[55] Nemirovskii, A., Polyak, B. T. and Tsybakov, A. B. (1985). Rates of convergence of nonparametric estimates of maximum-likelihood type. Probl. Peredachi Inf. 21 17-33. · Zbl 0616.62048
[56] Pollard, D. (1993). Hypercubes and minimax rates of convergence.
[57] Rissanen, J. (1984). Universal coding, information, prediction, and estimation. IEEE Trans. Inform. Theory 30 629-636. · Zbl 0574.62003 · doi:10.1109/TIT.1984.1056936
[58] Rissanen, J., Speed, T. and Yu, B. (1992). Density estimation by stochastic complexity. IEEE Trans. Inform. Theory 38 315-323. · Zbl 0743.62004 · doi:10.1109/18.119689
[59] Smoljak, S. A. (1960). The ε-entropy of some classes E_k^s(B) and W^s(B) in the L2 metric. Dokl. Akad. Nauk SSSR 131 30-33.
[60] Stone, C. J. (1982). Optimal global rates of convergence for nonparametric regression. Ann. Statist. 10 1040-1053. · Zbl 0511.62048 · doi:10.1214/aos/1176345969
[61] Temlyakov, V. N. (1989). Estimation of the asymptotic characteristics of classes of functions with bounded mixed derivative or difference. Trudy Mat. Inst. Steklov 189 162-197. · Zbl 0719.46021
[62] Triebel, H. (1975). Interpolation properties of ε-entropy and diameters. Geometric characteristics of embedding for function spaces of Sobolev-Besov type. Mat. Sb. 98 27-41. · Zbl 0312.46043
[63] Van de Geer, S. (1993). Hellinger consistency of certain nonparametric maximum likelihood estimates. Ann. Statist. 21 14-44. · Zbl 0779.62033 · doi:10.1214/aos/1176349013
[64] Wong, W. H. and Shen, X. (1995). Probability inequalities for likelihood ratios and convergence rates of sieve MLEs. Ann. Statist. 23 339-362. · Zbl 0829.62002 · doi:10.1214/aos/1176324524
[65] Yang, Y. (1999). Model selection for nonparametric regression. Statist. Sinica 9 475-500. · Zbl 0921.62051
[66] Yang, Y. (1999). Minimax nonparametric classification I: rates of convergence. IEEE Trans. Inform. Theory 45 2271-2284. · Zbl 0962.62026 · doi:10.1109/18.796368
[67] Yang, Y. and Barron, A. R. (1997). Information-theoretic determination of minimax rates of convergence. Technical Report 28, Dept. Statistics, Iowa State Univ. · Zbl 0978.62008
[68] Yang, Y. and Barron, A. R. (1998). An asymptotic property of model selection criteria. IEEE Trans. Inform. Theory 44 95-116. · Zbl 0949.62041 · doi:10.1109/18.650993
[69] Yatracos, Y. G. (1985). Rates of convergence of minimum distance estimators and Kolmogorov’s entropy. Ann. Statist. 13 768-774. · Zbl 0576.62057 · doi:10.1214/aos/1176349553
[70] Yatracos, Y. G. (1988). A lower bound on the error in nonparametric regression type problems. Ann. Statist. 16 1180-1187. · Zbl 0651.62028 · doi:10.1214/aos/1176350954
[71] Yu, B. (1996). Assouad, Fano, and Le Cam. In Research Papers in Probability and Statistics: Festschrift in Honor of Lucien Le Cam (D. Pollard, E. Torgersen and G. Yang, eds.) 423-435. Springer, New York. · Zbl 0896.62032