Modeling with normalized random measure mixture models. (English) Zbl 1331.62120

Summary: The Dirichlet process mixture model and more general mixtures based on discrete random probability measures have been shown to be flexible and accurate models for density estimation and clustering. The goal of this paper is to illustrate the use of normalized random measures as mixing measures in nonparametric hierarchical mixture models and point out how possible computational issues can be successfully addressed. To this end, we first provide a concise and accessible introduction to normalized random measures with independent increments. Then, we explain in detail a particular way of sampling from the posterior using the Ferguson-Klass representation. We develop a thorough comparative analysis for location-scale mixtures that considers a set of alternatives for the mixture kernel and for the nonparametric component. Simulation results indicate that normalized random measure mixtures potentially represent a valid default choice for density estimation problems. As a byproduct of this study, an R package to fit these models was produced and is available in the Comprehensive R Archive Network (CRAN).
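
To make the Ferguson-Klass representation mentioned in the summary concrete, the following Python sketch draws a truncated normalized random measure from the prior. It is an illustration only, under stated assumptions: it uses a sigma-stable Lévy intensity (chosen here because its tail mass inverts in closed form), a standard normal base measure, and a fixed truncation level. The function name `ferguson_klass_stable` and its parameters are hypothetical; this is neither the posterior sampler described in the paper nor the interface of the BNPdensity package.

```python
import math
import random


def ferguson_klass_stable(sigma, total_mass=1.0, n_jumps=50, rng=None):
    """Truncated Ferguson-Klass draw of a normalized sigma-stable measure.

    Assumed Levy intensity: rho(s) = sigma / Gamma(1 - sigma) * s**(-1 - sigma),
    whose tail mass is N(v) = total_mass * v**(-sigma) / Gamma(1 - sigma).
    The i-th largest jump J_i solves N(J_i) = xi_i, where xi_1 < xi_2 < ...
    are the arrival times of a unit-rate Poisson process.
    """
    if not 0.0 < sigma < 1.0:
        raise ValueError("sigma must lie strictly between 0 and 1")
    rng = rng or random.Random()
    gamma_const = math.gamma(1.0 - sigma)

    jumps, locations = [], []
    arrival = 0.0
    for _ in range(n_jumps):
        # Next arrival time of the unit-rate Poisson process.
        arrival += rng.expovariate(1.0)
        # Closed-form inverse of the tail mass; jumps come out in decreasing order.
        jumps.append((arrival * gamma_const / total_mass) ** (-1.0 / sigma))
        # Locations are i.i.d. from the base measure (assumed standard normal).
        locations.append(rng.gauss(0.0, 1.0))

    # Normalizing the retained jumps gives the weights of the truncated measure.
    total = sum(jumps)
    weights = [jump / total for jump in jumps]
    return weights, locations


weights, locations = ferguson_klass_stable(0.5, n_jumps=20, rng=random.Random(0))
```

Because the jumps are generated in decreasing order, truncating after a fixed number of terms discards only the smallest ones. For Lévy intensities without a closed-form tail inverse, each jump would instead have to be found by numerical root finding.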

MSC:

62F15 Bayesian inference
60G57 Random measures
62G05 Nonparametric estimation
62G07 Density estimation

Software:

CRAN; BNPdensity; R
