
Adjusting background noise in cluster analyses of longitudinal data. (English) Zbl 1466.62085

Summary: Background noise in cluster analyses can mask the true underlying patterns. To tease out patterns unique to certain populations, a Bayesian semi-parametric clustering method is presented that infers and adjusts for background noise. The method is built upon a mixture of a Dirichlet process and a point mass function. Simulations demonstrate the effectiveness of the proposed method. The method is then applied to a longitudinal data set on allergic sensitization and asthma status.
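As a rough illustration of what a "mixture of a Dirichlet process and a point mass" can look like, the sketch below draws subject-level parameters from a prior of the form pi * delta_0 + (1 - pi) * DP(alpha, G0). This is not the paper's model or sampler: the location of the point mass (zero), the weight `pi_noise`, the concentration `alpha`, the standard normal base measure, and the function name `sample_spiked_dp` are all assumptions made only for this example.

```python
import random

def sample_spiked_dp(n, alpha=1.0, pi_noise=0.3, seed=0):
    """Draw n parameters from pi_noise * delta_0 + (1 - pi_noise) * DP(alpha, N(0, 1)).

    Hypothetical illustration: label 0 marks "background" subjects that sit on
    the point mass; labels 1, 2, ... mark clusters generated by the DP part.
    """
    rng = random.Random(seed)
    atoms = []    # distinct cluster parameters drawn so far from the DP part
    counts = []   # number of subjects assigned to each DP cluster
    labels = []
    values = []
    for _ in range(n):
        # With probability pi_noise the subject belongs to the background.
        if rng.random() < pi_noise:
            labels.append(0)
            values.append(0.0)
            continue
        # Otherwise take one Chinese-restaurant-process step: join an existing
        # cluster with probability proportional to its size, or open a new
        # cluster with probability proportional to alpha.
        u = rng.random() * (sum(counts) + alpha)
        acc = 0.0
        chosen = None
        for k, c in enumerate(counts):
            acc += c
            if u < acc:
                chosen = k
                break
        if chosen is None:
            atoms.append(rng.gauss(0.0, 1.0))  # new atom from the assumed base measure
            counts.append(0)
            chosen = len(atoms) - 1
        counts[chosen] += 1
        labels.append(chosen + 1)
        values.append(atoms[chosen])
    return labels, values

labels, values = sample_spiked_dp(200, alpha=1.0, pi_noise=0.3, seed=1)
print("background subjects:", labels.count(0))
print("DP clusters:", len(set(labels) - {0}))
```

Subjects with label 0 share the single point-mass value, while the remaining subjects are grouped into a data-driven number of clusters, which is the general idea behind separating background noise from population-specific patterns.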

MSC:

62-08 Computational methods for problems pertaining to statistics

Software:

KmL; BayesDA

References:

[1] Antoniak, C. E., Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems, Ann. Statist., 2, 1152-1174, (1974) · Zbl 0335.60034
[2] Arshad, S. H.; Hide, D. W., Effect of environmental factors on the development of allergic disorders in infancy, J. Allergy Clin. Immunol., 90, 235-241, (1992)
[3] Baladandayuthapani, V.; Mallick, B. K.; Carroll, R. J., Spatially adaptive Bayesian penalized regression splines (P-splines), J. Comput. Graph. Statist., 14, 378-394, (2005)
[4] Bigelow, J. L.; Dunson, D. B., Bayesian semiparametric joint models for functional predictors, J. Amer. Statist. Assoc., 104, 26-36, (2009) · Zbl 1388.62181
[5] Caron, F.; Teh, Y. W.; Murphy, T. B., Bayesian nonparametric Plackett-Luce models for the analysis of preferences for college degree programmes, Ann. Appl. Stat., 8, 1145-1181, (2014) · Zbl 1454.62153
[6] Dahl, D. B., Model-based clustering for expression data via a Dirichlet process mixture model, (Do, Kim-Anh; Müller, Peter; Vannucci, Marina, Bayesian Inference for Gene Expression and Proteomics, (2006), Cambridge University Press)
[7] Dorazio, R. M., On selecting a prior for the precision parameter of Dirichlet process mixture models, J. Statist. Plann. Inference, 139, 3384-3390, (2009) · Zbl 1168.62022
[8] Dunson, D. B.; Herring, A. H.; Engel, S. M., Bayesian selection and clustering of polymorphisms in functionally related genes, J. Amer. Statist. Assoc., 103, 534-546, (2008) · Zbl 1469.62367
[9] Efron, B.; Tibshirani, R.; Storey, J. D.; Tusher, V., Empirical Bayes analysis of a microarray experiment, J. Amer. Statist. Assoc., 96, 1151-1160, (2001) · Zbl 1073.62511
[10] Eilers, P. H. C.; Marx, B. D., Flexible smoothing with B-splines and penalties, Statist. Sci., 11, 89-121, (1996) · Zbl 0955.62562
[11] Escobar, M. D.; West, M., Bayesian density estimation and inference using mixtures, J. Amer. Statist. Assoc., 90, 577-588, (1995) · Zbl 0826.62021
[12] Ferguson, T. S., A Bayesian analysis of some nonparametric problems, Ann. Statist., 1, 209-230, (1973)
[13] Fraley, C.; Raftery, A. E., Model-based clustering, discriminant analysis, and density estimation, J. Amer. Statist. Assoc., 97, 611-631, (2002) · Zbl 1073.62545
[14] Gelman, A.; Carlin, J. B.; Stern, H. S.; Rubin, D. B., Bayesian data analysis, (2003), Chapman and Hall/CRC
[15] Genolini, C.; Falissard, B., KmL: A package to cluster longitudinal data, Comput. Methods Programs Biomed., 104, e112-e121, (2011)
[16] George, E. I.; McCulloch, R. E., Approaches for Bayesian variable selection, Statist. Sinica, 7, 339-373, (1997) · Zbl 0884.62031
[17] Green, P. J., Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika, 82, 711-732, (1995) · Zbl 0861.62023
[18] Kim, S.; Dahl, D. B.; Vannucci, M., Spiked Dirichlet process prior for Bayesian multiple hypothesis testing in random effects models, Bayesian Anal., 4, 707-732, (2009) · Zbl 1330.62029
[19] Kim, S. Y.; Lee, J. W.; Bae, J. S., Effect of data normalization on fuzzy clustering of DNA microarray data, BMC Bioinformatics, 7, 134, (2006)
[20] McNicholas, P. D.; Murphy, T. B., Model-based clustering of longitudinal data, Canad. J. Statist., 38, 153-168, (2010) · Zbl 1190.62120
[21] Murugiah, S.; Sweeting, T., Selecting the precision parameter prior in Dirichlet process mixture models, J. Statist. Plann. Inference, 142, 1947-1959, (2012) · Zbl 1237.62011
[22] Neal, R. M., Markov chain sampling methods for Dirichlet process mixture models, J. Comput. Graph. Statist., 9, 249-265, (2000)
[23] Nieto-Barajas, L. E.; Contreras-Cristan, A., A Bayesian nonparametric approach for time series clustering, Bayesian Anal., 9, 147-170, (2014) · Zbl 1327.62473
[24] Qin, L.; Self, S., The clustering of regression models method with applications in gene expression data, Biometrics, 62, 526-533, (2006) · Zbl 1097.62134
[25] Ritter, C.; Tanner, M. A., Facilitating the Gibbs sampler: the Gibbs stopper and the griddy-Gibbs sampler, J. Amer. Statist. Assoc., 87, 861-868, (1992)
[26] Scott, J. G., Nonparametric Bayesian multiple testing for longitudinal performance stratification, Ann. Appl. Stat., 3, 1655-1674, (2009) · Zbl 1184.62156
[27] Soto-Ramirez, N.; Arshad, S. H.; Holloway, J. W.; Zhang, H.; Schauberger, E.; Ewart, S.; Patil, V.; Karmaus, W., The interaction of genetic variants and DNA methylation of the interleukin-4 receptor gene increase the risk of asthma at age 18 years, Clin. Epigenet., 5, 1-8, (2013)
[28] Spiegelhalter, D. J.; Best, N. G.; Carlin, B. P.; van der Linde, A., The deviance information criterion: 12 years on, J. R. Stat. Soc. Ser. B Stat. Methodol., 76, 485-493, (2014)
[29] Stephens, M., Bayesian analysis of mixture models with an unknown number of components - an alternative to reversible jump methods, Ann. Statist., 28, 40-74, (2000) · Zbl 1106.62316
[30] Zhang, H.; Ghosh, K.; Ghosh, P., Sampling designs via a multivariate hypergeometric-Dirichlet process model for a multi-species assemblage with unknown heterogeneity, Comput. Statist. Data Anal., 56, 2562-2573, (2012) · Zbl 1252.62116
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.