shallot
swMATH ID: 26314
Software Authors: David B. Dahl; Ryan Day; Jerry W. Tsai
Description: Random Partition Distribution Indexed by Pairwise Information. We propose a random partition distribution indexed by pairwise similarity information such that partitions compatible with the similarities are given more probability. The use of pairwise similarities, in the form of distances, is common in some clustering algorithms (e.g., hierarchical clustering), but we show how to use this type of information to define a prior partition distribution for flexible Bayesian modeling. A defining feature of the distribution is that it allocates probability among partitions within a given number of subsets, but it does not shift probability among sets of partitions with different numbers of subsets. Our distribution places more probability on partitions that group similar items yet keeps the total probability of partitions with a given number of subsets constant. The distribution of the number of subsets (and its moments) is available in closed form and is not a function of the similarities. Our formulation has an explicit probability mass function (with a tractable normalizing constant) so the full suite of MCMC methods may be used for posterior inference. We compare our distribution with several existing partition distributions, showing that our formulation has attractive properties. We provide three demonstrations to highlight the features and relative performance of our distribution.
Homepage: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5736154/
Source Code: https://github.com/dbdahl/shallot
Related Software: R; bamboo; rscala; CRAN; MLlib; SparkR; sparklyr; sdols; Rserve; RCaller; Rcpp; rJava; Scala; Spark; Breeze; ordinalClust; ordinalForest; lmmot; glmnetcr; toOrdinal
Cited in: 9 Publications

Cited by 48 Authors
2 Aliverti, Emanuele
2 Dahl, David B.
2 Dunson, David Brian
2 Paganin, Sally
2 Prünster, Igor
2 Quintana, Fernando Andrés
2 Rigon, Tommaso
2 Rodríguez, Abel
2 Russo, Massimiliano
2 Smith, Adam N. H.
1 Allenby, Greg M.
1 Antoniano-Villalobos, Isadora
1 Avalos-Pacheco, Alejandra
1 Beraha, Mario
1 Byrne, Thomas H.
1 Camerlenghi, Federico
1 Canale, Antonio
1 Casa, Alessandro
1 Culhane, Dennis P.
1 D’Angelo, Laura A. M.
1 de Blasi, Pierpaolo
1 De Vito, Roberta
1 Duan, Leo L.
1 Fop, Michael
1 Glynn, Chris
1 Guglielmi, Alessandra
1 Hennig, Christian
1 Herring, Amy H.
1 Jara, Alejandro
1 Jensen, Thomas P.
1 Leisen, Fabrizio
1 Leonard, Samuel
1 Lijoi, Antonio
1 Liu, Vera
1 Mena, Ramsés H.
1 Müller, Peter
1 Murphy, Thomas Brendan
1 Newcomb, Spencer
1 Olshan, Andrew F.
1 Page, Garritt L.
1 Riva Palacio, Alan
1 Robert, Christian P.
1 Scarpa, Bruno
1 Villa, Cristiano
1 Wade, Sara
1 Warr, Richard L.
1 Wehrhahn, Claudia
1 Xifara, Tatiana

Cited in 8 Serials
2 Bayesian Analysis
1 Annals of the Institute of Statistical Mathematics
1 Journal of the American Statistical Association
1 Journal of Statistical Computation and Simulation
1 Journal of Machine Learning Research (JMLR)
1 Electronic Journal of Statistics
1 The Annals of Applied Statistics
1 Statistics and Computing

Cited in 3 Fields
9 Statistics (62-XX)
1 Probability theory and stochastic processes (60-XX)
1 Numerical analysis (65-XX)