Robust Bayesian model selection for variable clustering with the Gaussian graphical model. (English) Zbl 1436.62217
Summary: Variable clustering is important for exploratory analysis. However, only a few dedicated methods for variable clustering with the Gaussian graphical model have been proposed. Worse, small insignificant partial correlations due to noise can dramatically change the clustering result when it is evaluated with, for example, the Bayesian information criterion (BIC). In this work, we address this issue by proposing a Bayesian model that accounts for negligibly small, but not necessarily zero, partial correlations. Based on our model, we propose to evaluate a variable clustering result using the marginal likelihood. To address the intractable calculation of the marginal likelihood, we propose two solutions: one based on a variational approximation and the other based on MCMC. Experiments on simulated data show that the proposed method is as accurate as BIC in the noise-free setting, but considerably more accurate in the presence of noisy partial correlations. Furthermore, on real data the proposed method provides clustering results that are intuitively sensible, which is not always the case when using BIC or its extensions.
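To make the failure mode concrete, below is a minimal sketch in Python (not the authors' code) of the standard way a candidate variable clustering is scored with BIC, namely under a Gaussian model with block-diagonal covariance, one block per cluster. The helper bic_for_clustering and the simulation setup are illustrative assumptions. Small noisy cross-cluster correlations add likelihood to coarser partitions; with enough variables or strong enough noise this can flip which clustering BIC selects, which is the effect the proposed marginal-likelihood criterion is designed to resist.

import numpy as np

def bic_for_clustering(X, clusters):
    """BIC of a Gaussian model whose covariance is block-diagonal, with one
    block per cluster. X is (n, p); clusters is a list of index arrays
    partitioning range(p). Lower BIC is better."""
    n = X.shape[0]
    loglik, n_params = 0.0, 0
    for idx in clusters:
        k = len(idx)
        # MLE covariance of this block (bias=True divides by n, i.e. the MLE).
        S = np.atleast_2d(np.cov(X[:, idx], rowvar=False, bias=True))
        _, logdet = np.linalg.slogdet(S)
        # Gaussian log-likelihood at the MLE; tr(S^{-1} S) contributes k.
        loglik += -0.5 * n * (k * np.log(2 * np.pi) + logdet + k)
        n_params += k * (k + 1) // 2  # free covariance parameters in this block
    return -2.0 * loglik + n_params * np.log(n)

rng = np.random.default_rng(0)
n, p = 500, 4
Sigma = np.eye(p)
Sigma[0, 1] = Sigma[1, 0] = 0.7    # true cluster {0, 1}
Sigma[2, 3] = Sigma[3, 2] = 0.7    # true cluster {2, 3}
Sigma[1, 2] = Sigma[2, 1] = 0.05   # small noisy cross-cluster correlation
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

print(bic_for_clustering(X, [np.array([0, 1]), np.array([2, 3])]))  # two clusters
print(bic_for_clustering(X, [np.arange(p)]))                        # one merged cluster

At this small scale the two-cluster partition still wins, but the merged model's likelihood advantage grows with the number of small nonzero partial correlations while its BIC penalty grows only with the parameter count, which is why BIC becomes unreliable in the noisy regime the paper targets.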
MSC:
62H22 Probabilistic graphical models
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62B10 Statistical aspects of information-theoretic topics
Software:
ADVI; glasso; praxis; SSS; Stan