## Latent nested nonparametric priors (with discussion).
*(English)*
Zbl 1436.62108

Summary: Discrete random structures are important tools in Bayesian nonparametrics and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and, then, normalizing to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop a Markov Chain Monte Carlo sampler for Bayesian inferences. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.
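The construction named in the summary (add a common and a group-specific completely random measure, then normalize) can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the authors' model or sampler. Each completely random measure is replaced by a finite list of atoms with independent gamma weights, the atom locations are drawn from a standard normal, and the helper names `finite_gamma_crm` and `normalize_sum` are hypothetical. The nested prior on the group-specific components and the partition results from the paper are not reproduced here.

```python
import random


def finite_gamma_crm(total_mass, n_atoms, rng):
    """Finite stand-in for a gamma CRM (an assumption of this sketch).

    Returns n_atoms (location, weight) pairs with independent
    Gamma(total_mass / n_atoms, 1) weights and N(0, 1) locations.
    """
    shape = total_mass / n_atoms
    return [(rng.gauss(0.0, 1.0), rng.gammavariate(shape, 1.0))
            for _ in range(n_atoms)]


def normalize_sum(group_crm, shared_crm):
    """Add a group-specific and a shared measure, then normalize the weights."""
    atoms = group_crm + shared_crm
    total = sum(weight for _, weight in atoms)
    return [(location, weight / total) for location, weight in atoms]


rng = random.Random(0)
shared = finite_gamma_crm(2.0, 20, rng)   # component common to both groups
group_1 = finite_gamma_crm(2.0, 20, rng)  # component specific to group 1
group_2 = finite_gamma_crm(2.0, 20, rng)  # component specific to group 2

p1 = normalize_sum(group_1, shared)  # random probability measure, group 1
p2 = normalize_sum(group_2, shared)  # random probability measure, group 2
```

In this sketch the two random probability measures share the atoms of `shared` but differ in their group-specific atoms, so they are dependent without being identical.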

### MSC:

62G05 | Nonparametric estimation |

60G57 | Random measures |

62H30 | Classification and discrimination; cluster analysis (statistical aspects) |

62P10 | Applications of statistics to biology and medical sciences; meta analysis |

### Keywords:

Bayesian nonparametrics; completely random measures; dependent nonparametric priors; heterogeneity; mixture models; nested processes

### Software:

shallot

*F. Camerlenghi* et al., Bayesian Anal. 14, No. 4, 1303-1356 (2019; Zbl 1436.62108)

### References:

[1] | Barrientos, A. F., Jara, A., and Quintana, F. A. (2017). “Fully nonparametric regression for bounded data using dependent Bernstein polynomials.” Journal of the American Statistical Association, to appear. |

[2] | Bhattacharya, A. and Dunson, D. (2012). “Nonparametric Bayes classification and hypothesis testing on manifolds.” Journal of Multivariate Analysis, 111: 1-19. · Zbl 1281.62033 |

[3] | Blei, D. M. and Frazier, P. I. (2011). “Distance dependent Chinese restaurant process.” Journal of Machine Learning Research, 12: 2383-2410. · Zbl 1280.68157 |

[4] | Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). “Latent Dirichlet allocation.” Journal of Machine Learning Research, 3: 993-1022. · Zbl 1112.68379 |

[5] | Camerlenghi, F., Lijoi, A., Orbanz, P., and Prünster, I. (2019a). “Distribution theory for hierarchical processes.” Annals of Statistics, 47(1): 67-92. · Zbl 1478.60151 |

[6] | Camerlenghi, F., Dunson, D. B., Lijoi, A., Prünster, I., and Rodríguez, A. (2019b). “Supplementary material to Latent nested nonparametric priors.” Bayesian Analysis. · Zbl 1436.62108 |

[7] | Chung, Y. and Dunson, D. B. (2009). “Nonparametric Bayes conditional distribution modeling with variable selection.” Journal of the American Statistical Association, 104(488): 1646-1660. · Zbl 1205.62039 |

[8] | Dahl, D. B., Day, R., and Tsai, J. W. (2017). “Random partition distribution indexed by pairwise information.” Journal of the American Statistical Association, to appear. |

[9] | De Iorio, M., Johnson, W. O., Müller, P., and Rosner, G. L. (2009). “Bayesian nonparametric nonproportional hazards survival modeling.” Biometrics, 65(3): 762-771. · Zbl 1172.62073 |

[10] | De Iorio, M., Müller, P., Rosner, G. L., and MacEachern, S. N. (2004). “An ANOVA model for dependent random measures.” Journal of the American Statistical Association, 99(465): 205-215. · Zbl 1089.62513 |

[11] | Filippi, S. and Holmes, C. C. (2017). “A Bayesian nonparametric approach for quantifying dependence between random variables.” Bayesian Analysis, 12(4): 919-938. · Zbl 1384.62146 |

[12] | Gelfand, A. E., Kottas, A., and MacEachern, S. N. (2005). “Bayesian nonparametric spatial modeling with Dirichlet process mixing.” Journal of the American Statistical Association, 100(471): 1021-1035. · Zbl 1117.62342 |

[13] | Griffin, J. E., Kolossiatis, M., and Steel, M. F. J. (2013). “Comparing distributions by using dependent normalized random-measure mixtures.” Journal of the Royal Statistical Society. Series B, Statistical Methodology, 75(3): 499-529. · Zbl 1411.62083 |

[14] | Griffin, J. E. and Leisen, F. (2017). “Compound random measures and their use in Bayesian non-parametrics.” Journal of the Royal Statistical Society. Series B, 79(2): 525-545. · Zbl 1412.60071 |

[15] | Griffin, J. E. and Steel, M. F. J. (2006). “Order-based dependent Dirichlet processes.” Journal of the American Statistical Association, 101(473): 179-194. · Zbl 1118.62360 |

[16] | Hjort, N. L. (2000). “Bayesian analysis for a generalized Dirichlet process prior.” Technical report, University of Oslo. |

[17] | Holmes, C., Caron, F., Griffin, J. E., and Stephens, D. A. (2015). “Two-sample Bayesian nonparametric hypothesis testing.” Bayesian Analysis, 10(2): 297-320. · Zbl 1334.62082 |

[18] | Jara, A., Lesaffre, E., De Iorio, M., and Quintana, F. (2010). “Bayesian semiparametric inference for multivariate doubly-interval-censored data.” Annals of Applied Statistics, 4(4): 2126-2149. · Zbl 1220.62023 |

[19] | Kingman, J. F. C. (1978). “The representation of partition structures.” Journal of the London Mathematical Society (2), 18(2): 374-380. · Zbl 0415.92009 |

[20] | Kingman, J. F. C. (1993). Poisson processes. Oxford University Press. · Zbl 0771.60001 |

[21] | Lijoi, A., Nipoti, B., and Prünster, I. (2014). “Bayesian inference with dependent normalized completely random measures.” Bernoulli, 20(3): 1260-1291. · Zbl 1309.60048 |

[22] | Ma, L. and Wong, W. H. (2011). “Coupling optional Pólya trees and the two sample problem.” Journal of the American Statistical Association, 106(496): 1553-1565. · Zbl 1233.62104 |

[23] | MacEachern, S. N. (1994). “Estimating normal means with a conjugate style Dirichlet process prior.” Communications in Statistics. Simulation and Computation, 23(3): 727-741. · Zbl 0825.62053 |

[24] | MacEachern, S. N. (1999). “Dependent nonparametric processes.” In ASA proceedings of the section on Bayesian statistical science, 50-55. |

[25] | MacEachern, S. N. (2000). “Dependent Dirichlet processes.” Tech. Report, Department of Statistics, The Ohio State University. |

[26] | Mena, R. H. and Ruggiero, M. (2016). “Dynamic density estimation with diffusive Dirichlet mixtures.” Bernoulli, 22(2): 901-926. · Zbl 1388.62099 |

[27] | Müller, P., Quintana, F., and Rosner, G. (2004). “A method for combining inference across related nonparametric Bayesian models.” Journal of the Royal Statistical Society. Series B, Statistical Methodology, 66(3): 735-749. · Zbl 1046.62053 |

[28] | Müller, P., Quintana, F., and Rosner, G. L. (2011). “A product partition model with regression on covariates.” Journal of Computational and Graphical Statistics, 20(1): 260-278. |

[29] | Nguyen, X. (2013). “Convergence of latent mixing measures in finite and infinite mixture models.” Annals of Statistics, 41(1): 370-400. · Zbl 1347.62117 |

[30] | Nguyen, X. (2015). “Posterior contraction of the population polytope in finite admixture models.” Bernoulli, 21(1): 618-646. · Zbl 1368.62288 |

[31] | Page, G. L. and Quintana, F. A. (2016). “Spatial product partition models.” Bayesian Analysis, 11(1): 265-298. · Zbl 1359.62401 |

[32] | Pitman, J. (1995). “Exchangeable and partially exchangeable random partitions.” Probability Theory and Related Fields, 102(2): 145-158. · Zbl 0821.60047 |

[33] | Regazzini, E., Lijoi, A., and Prünster, I. (2003). “Distributional results for means of random measures with independent increments.” Annals of Statistics, 31: 560-585. · Zbl 1068.62034 |

[34] | Rodríguez, A. and Dunson, D. B. (2011). “Nonparametric Bayesian models through probit stick-breaking processes.” Bayesian Analysis, 6(1): 145-177. · Zbl 1330.62120 |

[35] | Rodríguez, A. and Dunson, D. B. (2014). “Functional clustering in nested designs: modeling variability in reproductive epidemiology studies.” Annals of Applied Statistics, 8(3): 1416-1442. · Zbl 1303.62040 |

[36] | Rodríguez, A., Dunson, D. B., and Gelfand, A. E. (2008). “The nested Dirichlet process.” Journal of the American Statistical Association, 103(483): 1131-1144. · Zbl 1205.62062 |

[37] | Rodríguez, A., Dunson, D. B., and Gelfand, A. E. (2010). “Latent stick-breaking processes.” Journal of the American Statistical Association, 105(490): 647-659. · Zbl 1392.60050 |

[38] | Soriano, J. and Ma, L. (2017). “Probabilistic multi-resolution scanning for two-sample differences.” Journal of the Royal Statistical Society. Series B, Statistical Methodology, 79(2): 547-572. · Zbl 1414.62149 |

[39] | Teh, Y. W., Jordan, M. I., Beal, M. J., and Blei, D. M. (2006). “Hierarchical Dirichlet processes.” Journal of the American Statistical Association, 101(476): 1566-1581. · Zbl 1171.62349 |

[40] | West, M., Müller, P., and Escobar, M. D. (1994). “Hierarchical priors and mixture models, with application in regression and density estimation.” In Aspects of uncertainty, 363-386. Wiley, Chichester. · Zbl 0842.62001 |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.