Phylogenetically informed Bayesian truncated copula graphical models for microbial association networks. (English) Zbl 1498.62202

Summary: Microorganisms play critical roles in host health. The advancement of high-throughput sequencing technology provides opportunities for a deeper understanding of microbial interactions. However, due to the technological limitations of 16S ribosomal RNA sequencing, microbiome data are zero-inflated, and a quantitative comparison of microbial abundances cannot be made across subjects. By leveraging a recent microbiome profiling technique that quantifies 16S ribosomal RNA microbial counts, we propose a novel Bayesian graphical model that incorporates microorganisms’ evolutionary history through a phylogenetic tree prior and explicitly accounts for zero inflation using the truncated Gaussian copula. Our simulation study reveals that the evolutionary information substantially improves the network estimation accuracy. We apply the proposed model to the quantitative gut microbiome data of 106 healthy subjects and identify three distinct microbial communities that are not found by existing microbial network estimation models. We further find that these communities are discriminated based on microorganisms’ ability to utilize oxygen as an energy source.

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62F15 Bayesian inference
62H22 Probabilistic graphical models
62H12 Estimation in multivariate analysis
