An expectation conditional maximization approach for Gaussian graphical models. (English) Zbl 07499025

Summary: Bayesian graphical models are a useful tool for understanding dependence relationships among many variables, particularly in situations with external prior information. In high-dimensional settings, the space of possible graphs becomes enormous, rendering even state-of-the-art Bayesian stochastic search computationally infeasible. We propose a deterministic alternative to estimate Gaussian and Gaussian copula graphical models using an expectation conditional maximization (ECM) algorithm, extending the EM approach from Bayesian variable selection to graphical model estimation. We show that the ECM approach enables fast posterior exploration under a sequence of mixture priors, and can incorporate multiple sources of information. Supplementary materials for this article are available online.
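
As a rough illustration of the kind of E-step such an approach might use, the sketch below computes posterior inclusion probabilities for the off-diagonal entries of a precision matrix. The summary does not state the form of the prior, so everything specific here is an assumption: a two-component zero-mean normal mixture (a narrow "spike" with standard deviation `v0` and a wide "slab" with standard deviation `v1`, with `v0 < v1`), a common prior edge probability `pi`, and the hypothetical function name `e_step`. It is not the authors' implementation.

```python
import math


def e_step(omega, v0, v1, pi):
    """Illustrative E-step under an ASSUMED normal spike-and-slab mixture prior.

    omega : square list of lists, current precision matrix estimate
    v0    : spike standard deviation (assumed, v0 < v1)
    v1    : slab standard deviation (assumed)
    pi    : prior probability that an edge is present (assumed)

    Returns (prob, weight): prob[j][k] is the conditional probability that
    entry (j, k) belongs to the slab, and weight[j][k] is the expected prior
    precision prob / v1^2 + (1 - prob) / v0^2. Diagonal entries are left at 0.
    """
    p = len(omega)
    prob = [[0.0] * p for _ in range(p)]
    weight = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(p):
            if j == k:
                continue
            x = omega[j][k]
            # Log odds of slab versus spike, computed in log space so that
            # very small densities do not underflow to zero.
            log_odds = (
                math.log(pi / (1.0 - pi))
                + math.log(v0 / v1)
                + 0.5 * x * x * (1.0 / v0**2 - 1.0 / v1**2)
            )
            prob[j][k] = 1.0 / (1.0 + math.exp(-log_odds))
            weight[j][k] = prob[j][k] / v1**2 + (1.0 - prob[j][k]) / v0**2
    return prob, weight


# Toy 3x3 precision matrix: one clearly nonzero entry, the rest zero.
omega = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
prob, weight = e_step(omega, v0=0.1, v1=1.0, pi=0.5)
```

In a full algorithm, these expected weights would feed a sequence of conditional maximization steps that update the precision matrix, and the whole procedure could be rerun over a grid of `v0` values to explore a sequence of mixture priors; neither step is shown here.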

MSC:

62-XX Statistics
