Entropies and rates of convergence for maximum likelihood and Bayes estimation for mixtures of normal densities. (English) Zbl 1043.62025

Summary: We study the rates of convergence of the maximum likelihood estimator (MLE) and posterior distribution in density estimation problems, where the densities are location or location-scale mixtures of normal distributions with the scale parameter lying between two positive numbers. The true density is also assumed to lie in this class with the true mixing distribution either compactly supported or having sub-Gaussian tails. We obtain bounds for Hellinger bracketing entropies for this class, and from these bounds, we deduce the convergence rates of (sieve) MLEs in the Hellinger distance. The rate turns out to be \((\log n)^\kappa /\sqrt{n}\), where \(\kappa \geq 1\) is a constant that depends on the type of mixtures and the choice of the sieve. Next, we consider a Dirichlet mixture of normals as a prior on the unknown density. We estimate the prior probability of a certain Kullback-Leibler type neighborhood and then invoke a general theorem that computes the posterior convergence rate in terms of the growth rate of the Hellinger entropy and the concentration rate of the prior. The posterior distribution is also seen to converge at the rate \((\log n)^\kappa /\sqrt{n}\), where \(\kappa\) now depends on the tail behavior of the base measure of the Dirichlet process.
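The objects named in the summary can be illustrated with a minimal Python sketch. It is not taken from the paper. It assumes a finite location mixture as a stand-in for a general mixing distribution, takes the Hellinger distance as \(h(f,g) = (\int (\sqrt{f}-\sqrt{g})^2)^{1/2}\) (one common convention, assumed here), and approximates the integral by a midpoint rule on a fixed interval. All function names and parameters are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def location_mixture_pdf(x, atoms, weights, sd):
    """Finite location mixture of normals: sum_j w_j * N(x; z_j, sd^2).

    Assumption: a finite mixing distribution with support points `atoms`
    and probabilities `weights` (illustrative only).
    """
    return sum(w * normal_pdf(x, z, sd) for z, w in zip(atoms, weights))


def hellinger(f, g, lo=-20.0, hi=20.0, steps=20000):
    """Approximate h(f, g) = sqrt( integral (sqrt f - sqrt g)^2 dx ).

    Assumption: both densities have negligible mass outside [lo, hi];
    the integral is computed with a midpoint rule.
    """
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        d = math.sqrt(f(x)) - math.sqrt(g(x))
        total += d * d
    return math.sqrt(total * dx)


def rate(n, kappa):
    """The rate (log n)^kappa / sqrt(n) stated in the summary."""
    return math.log(n) ** kappa / math.sqrt(n)
```

For example, `hellinger(lambda x: location_mixture_pdf(x, [-1.0, 1.0], [0.5, 0.5], 1.0), lambda x: normal_pdf(x, 0.0, 1.0))` gives the distance between a two-component mixture and a standard normal, and `rate(n, kappa)` shows how slowly the logarithmic factor inflates the parametric rate \(1/\sqrt{n}\).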

MSC:

62G07 Density estimation
62F12 Asymptotic properties of parametric estimators
62F15 Bayesian inference
62B10 Statistical aspects of information-theoretic topics
62G20 Asymptotic properties of nonparametric inference
