Bayesian structure learning in sparse Gaussian graphical models. (English) Zbl 1335.62056

Summary: Decoding complex relationships among large numbers of variables with relatively few observations is one of the crucial issues in science. One approach to this problem is Gaussian graphical modeling, which describes conditional independence of variables through the presence or absence of edges in the underlying graph. In this paper, we introduce a novel and efficient Bayesian framework for Gaussian graphical model determination: a trans-dimensional Markov chain Monte Carlo (MCMC) approach based on a continuous-time birth-death process. We cover the theory and computational details of the method. It is easy to implement and computationally feasible for high-dimensional graphs. We show that our method outperforms alternative Bayesian approaches in terms of convergence, mixing in the graph space, and computing time. Unlike frequentist approaches, it gives a principled and, in practice, sensible approach for structure learning. We illustrate the efficiency of the method on a broad range of simulated data. We then apply the method to large-scale real applications from human and mammary gland gene expression studies to show its empirical usefulness. In addition, we implemented the method in the R package BDgraph, which is freely available at http://CRAN.R-project.org/package=BDgraph.
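The link between edges and conditional independence mentioned in the summary can be illustrated with a small sketch. It is not the birth-death MCMC sampler of the paper and does not use BDgraph. It relies only on the standard fact that, in a Gaussian graphical model, a missing edge between two variables corresponds to a zero entry in the precision (inverse covariance) matrix. The matrix `K` and the helper names `edges_from_precision`, `partial_correlations`, and `invert` are assumptions made for this example.

```python
import math


def edges_from_precision(K, tol=1e-12):
    """Return the edge set implied by the nonzero off-diagonal entries of K."""
    p = len(K)
    return [(i, j) for i in range(p) for j in range(i + 1, p) if abs(K[i][j]) > tol]


def partial_correlations(K):
    """Partial correlation of each pair of variables given all the others."""
    p = len(K)
    R = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j:
                R[i][j] = -K[i][j] / math.sqrt(K[i][i] * K[j][j])
    return R


def invert(M):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    # Augment M with the identity matrix.
    A = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular")
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [x / pv for x in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]


# Hypothetical precision matrix for a chain graph 0 - 1 - 2 (no edge between 0 and 2).
K = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]

Sigma = invert(K)                 # covariance matrix implied by K
edges = edges_from_precision(K)   # graph read off from the zero pattern of K
R = partial_correlations(K)

print("edges:", edges)
print("cov(0, 2):", Sigma[0][2])
print("partial corr(0, 1):", R[0][1])
```

In this toy example variables 0 and 2 are marginally correlated (their covariance is nonzero), yet their precision entry is zero, so they are conditionally independent given variable 1 and the graph has no edge between them. Structure learning amounts to inferring this zero pattern from data.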


62F15 Bayesian inference
60J28 Applications of continuous-time Markov processes on discrete state spaces
62H12 Estimation in multivariate analysis
62H30 Classification and discrimination; cluster analysis (statistical aspects)
68T05 Learning and adaptive systems in artificial intelligence
62A09 Graphical methods in statistics
65C60 Computational problems in statistics (MSC2010)
65C05 Monte Carlo methods

