zbMATH — the first resource for mathematics

Consistency of a recursive estimate of mixing distributions. (English) Zbl 1173.62020
Summary: Mixture models have received considerable attention recently, and M. A. Newton [Sankhyā, Ser. A 64, 306–322 (2002)] proposed a fast recursive algorithm for estimating a mixing distribution. We prove almost sure consistency of this recursive estimate in the weak topology under mild conditions on the family of densities being mixed. Because the recursive estimate depends on the ordering of the data, a permutation-invariant modification is proposed: the average of the original estimate over permutations of the data sequence. A Rao-Blackwell argument is used to prove consistency in probability of this alternative estimate. Several simulations compare the finite-sample performance of the recursive estimate and of a Monte Carlo approximation to the permutation-invariant alternative with that of the nonparametric maximum likelihood estimate and a nonparametric Bayes estimate.
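
The summary does not state the recursion itself. As an illustration only, the sketch below assumes the commonly cited form of Newton's update, f_i(θ) = (1 − w_i) f_{i−1}(θ) + w_i p(x_i | θ) f_{i−1}(θ) / ∫ p(x_i | θ') f_{i−1}(θ') dθ', evaluated on a fixed grid, together with a Monte Carlo average over random permutations of the data. The function names, the weight choice w_i = 1/(i + 1), the uniform initial guess, the trapezoid quadrature, and the normal kernel are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def _trapezoid_weights(grid):
    """Quadrature weights so that sum(f * w) approximates the integral of f."""
    w = np.zeros_like(grid)
    d = np.diff(grid)
    w[:-1] += d / 2.0
    w[1:] += d / 2.0
    return w


def recursive_estimate(x, grid, kernel, f0=None, weights=None):
    """One pass of the assumed recursive update over the data, in the given order.

    x       : 1-D array of observations
    grid    : 1-D increasing array of theta values (hypothetical discretisation)
    kernel  : callable kernel(x_i, grid) returning p(x_i | theta) on the grid
    f0      : initial guess on the grid (uniform if None)
    weights : sequence w_1..w_n in (0, 1); defaults to 1 / (i + 1)
    """
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    q = _trapezoid_weights(grid)
    f = np.ones_like(grid) if f0 is None else np.asarray(f0, dtype=float).copy()
    f = f / np.sum(f * q)  # normalise the initial guess
    if weights is None:
        weights = 1.0 / (np.arange(1, x.size + 1) + 1.0)
    for xi, wi in zip(x, weights):
        post = kernel(xi, grid) * f          # unnormalised one-step posterior
        post = post / np.sum(post * q)       # normalise on the grid
        f = (1.0 - wi) * f + wi * post       # convex combination with the old estimate
    return f


def permutation_average(x, grid, kernel, n_perm=20, seed=0, **kwargs):
    """Monte Carlo approximation to the permutation-invariant estimate:
    average the recursive estimate over n_perm random orderings of the data."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    total = np.zeros(len(grid))
    for _ in range(n_perm):
        total += recursive_estimate(rng.permutation(x), grid, kernel, **kwargs)
    return total / n_perm


def normal_kernel(x, theta, sigma=1.0):
    """N(theta, sigma^2) density evaluated at x (illustrative choice of kernel)."""
    return np.exp(-0.5 * ((x - theta) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
```

A call such as `permutation_average(x, np.linspace(-6, 6, 241), normal_kernel)` returns density values on the grid. Because each update is a convex combination of densities normalised under the same quadrature rule, the result still integrates to one on the grid.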

62G07 Density estimation
62G20 Asymptotic properties of nonparametric inference
62F15 Bayesian inference
62L12 Sequential estimation
65C60 Computational problems in statistics (MSC2010)
65C05 Monte Carlo methods
62G05 Nonparametric estimation
[1] Allison, D. B., Gadbury, G. L., Heo, M., Fernández, J. R., Lee, C.-K., Prolla, T. A. and Weindruch, R. (2002). A mixture model approach for the analysis of microarray gene expression data. Comput. Statist. Data Anal. 39 1-20. · Zbl 1119.62371 · doi:10.1016/S0167-9473(01)00046-9
[2] Barron, A., Schervish, M. J. and Wasserman, L. (1999). The consistency of posterior distributions in nonparametric problems. Ann. Statist. 27 536-561. · Zbl 0980.62039 · doi:10.1214/aos/1018031206
[3] Bogdan, M., Ghosh, J. K. and Tokdar, S. (2008). A comparison of the Benjamini-Hochberg procedure with some Bayesian rules for multiple testing. In Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen (N. Balakrishnan, E. Peña and M. Silvapulle, eds.) 211-230. IMS, Beachwood, OH. · doi:10.1214/193940307000000158
[4] Clyde, M. A. and George, E. I. (1999). Empirical Bayes estimation in wavelet nonparametric regression. In Bayesian Inference in Wavelet-Based Models. Lecture Notes in Statist. 141 309-322. Springer, New York. · Zbl 0936.62008 · doi:10.1007/978-1-4612-0567-8_19
[5] Cootes, T. and Taylor, C. (1999). A mixture model for representing shape variation. Comput. Imaging Vision 17 567-573.
[6] Durrett, R. (1996). Probability: Theory and Examples, 2nd ed. Duxbury Press, Belmont, CA. · Zbl 1202.60001
[7] Efron, B. (2004). Large-scale simultaneous hypothesis testing: the choice of a null hypothesis. J. Amer. Statist. Assoc. 99 96-104. · Zbl 1089.62502 · doi:10.1198/016214504000000089
[8] Escobar, M. D. and West, M. (1995). Bayesian density estimation and inference using mixtures. J. Amer. Statist. Assoc. 90 577-588. · Zbl 0826.62021 · doi:10.2307/2291069
[9] Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist. 19 1257-1272. · Zbl 0729.62033 · doi:10.1214/aos/1176348248
[10] Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist. 1 209-230. · Zbl 0255.62037 · doi:10.1214/aos/1176342360
[11] Ghosal, S., Ghosh, J. K. and Ramamoorthi, R. V. (1999). Posterior consistency of Dirichlet mixtures in density estimation. Ann. Statist. 27 143-158. · Zbl 0932.62043 · doi:10.1214/aos/1018031105
[12] Ghosh, J. K. and Tokdar, S. T. (2006). Convergence and consistency of Newton’s algorithm for estimating mixing distribution. In Frontiers in Statistics 429-443. Imp. Coll. Press, London. · Zbl 1119.62020 · doi:10.1142/9781860948886_0019
[13] Johnstone, I. M. and Silverman, B. W. (1990). Speed of estimation in positron emission tomography and related inverse problems. Ann. Statist. 18 251-280. · Zbl 0699.62043 · doi:10.1214/aos/1176347500
[14] Kiefer, J. and Wolfowitz, J. (1956). Consistency of the maximum likelihood estimator in the presence of infinitely many incidental parameters. Ann. Math. Statist. 27 887-906. · Zbl 0073.14701 · doi:10.1214/aoms/1177728066
[15] Kushner, H. J. and Yin, G. G. (2003). Stochastic Approximation and Recursive Algorithms and Applications, 2nd ed. Springer, New York. · Zbl 1026.62084
[16] Laird, N. (1978). Nonparametric maximum likelihood estimation of a mixed distribution. J. Amer. Statist. Assoc. 73 805-811. · Zbl 0391.62029 · doi:10.2307/2286284
[17] Leroux, B. G. (1992). Consistent estimation of a mixing distribution. Ann. Statist. 20 1350-1360. · Zbl 0763.62015 · doi:10.1214/aos/1176348772
[18] Lindsay, B. G. (1983). The geometry of mixture likelihoods: A general theory. Ann. Statist. 11 86-94. · Zbl 0512.62005 · doi:10.1214/aos/1176346059
[19] Liu, J. S. (1996). Nonparametric hierarchical Bayes via sequential imputations. Ann. Statist. 24 911-930. · Zbl 0880.62038 · doi:10.1214/aos/1032526949
[20] Martin, R. and Ghosh, J. K. (2009). Stochastic approximation and Newton’s estimate of a mixing distribution. Statist. Sci. 23 365-382. · Zbl 1329.62361 · doi:10.1214/08-STS265
[21] McLachlan, G., Bean, R. and Peel, D. (2002). A mixture model-based approach to the clustering of microarray expression data. Bioinformatics 18 413-422.
[22] Newton, M. A. (2002). On a nonparametric recursive estimator of the mixing distribution. Sankhyā Ser. A 64 306-322. · Zbl 1192.62110 · sankhya.isical.ac.in
[23] Newton, M. A., Quintana, F. A. and Zhang, Y. (1998). Nonparametric Bayes methods using predictive updating. In Practical Nonparametric and Semiparametric Bayesian Statistics. Lecture Notes in Statist. 133 45-61. Springer, New York. · Zbl 0918.62030 · doi:10.1007/978-1-4612-1732-9_3
[24] Newton, M. A. and Zhang, Y. (1999). A recursive algorithm for nonparametric analysis with missing data. Biometrika 86 15-26. · Zbl 0917.62045 · doi:10.1093/biomet/86.1.15
[25] Pan, W., Lin, J. and Le, C. (2003). A mixture model approach to detecting differentially expressed genes with microarray data. Funct. Integr. Genom. 3 117-124.
[26] Quintana, F. A. and Newton, M. A. (2000). Computational aspects of nonparametric Bayesian analysis with applications to the modeling of multiple binary sequences. J. Comput. Graph. Statist. 9 711-737.
[27] Reynolds, D., Quatieri, T. and Dunn, R. (2000). Speaker verification using adapted Gaussian mixture models. Digital Signal Process. 10 19-41.
[28] Robbins, H. (1964). The empirical Bayes approach to statistical decision problems. Ann. Math. Statist. 35 1-20. · Zbl 0138.12304 · doi:10.1214/aoms/1177703729
[29] Scott, J. G. and Berger, J. O. (2006). An exploration of aspects of Bayesian multiple testing. J. Statist. Plann. Inference 136 2144-2162. · Zbl 1087.62039 · doi:10.1016/j.jspi.2005.08.031
[30] Stefanski, L. and Carroll, R. J. (1990). Deconvoluting kernel density estimators. Statistics 21 169-184. · Zbl 0697.62035 · doi:10.1080/02331889008802238
[31] Tang, Y., Ghosal, S. and Roy, A. (2007). Nonparametric Bayesian estimation of positive false discovery rates. Biometrics 63 1126-1134. · Zbl 1141.62091 · doi:10.1111/j.1541-0420.2007.00819.x
[32] Tanner, M. A. and Wong, W. H. (1987). The calculation of posterior distributions by data augmentation. J. Amer. Statist. Assoc. 82 528-550. · Zbl 0619.62029 · doi:10.2307/2289457
[33] Teicher, H. (1961). Identifiability of mixtures. Ann. Math. Statist. 32 244-248. · Zbl 0146.39302 · doi:10.1214/aoms/1177705155
[34] Wang, Y. (2007). On fast computation of the non-parametric maximum likelihood estimate of a mixing distribution. J. R. Stat. Soc. Ser. B Stat. Methodol. 69 185-198. · Zbl 1120.62022 · doi:10.1111/j.1467-9868.2007.00583.x
[35] Zhang, C.-H. (1990). Fourier methods for estimating mixing densities and distributions. Ann. Statist. 18 806-831. · Zbl 0778.62037 · doi:10.1214/aos/1176347627
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.