zbMATH — the first resource for mathematics

Inferring dynamic gene regulatory networks with low-order conditional independencies – an evaluation of the method. (English) Zbl 1461.92033
Summary: Over a decade ago, S. Lèbre [Stat. Appl. Genet. Mol. Biol. 8, No. 1, Article No. 9, 38 p. (2009; Zbl 1276.62080)] proposed an inference method, G1DBN, to learn the structure of gene regulatory networks (GRNs) from high dimensional, sparse time-series gene expression data. Their approach is based on the concept of low-order conditional independence graphs, which they extend to dynamic Bayesian networks (DBNs). They present results to demonstrate that their method yields better structural accuracy than the related Lasso and Shrinkage methods, particularly where the data is sparse, that is, the number of time measurements \(n\) is much smaller than the number of genes \(p\). This paper challenges these claims using a careful experimental analysis, to show that the GRNs reverse engineered from time-series data using the G1DBN approach are less accurate than claimed by [loc. cit.]. We also show that the Lasso method yields higher structural accuracy for graphs learned from the simulated data, compared to the G1DBN method, particularly when the data is sparse \((n \ll p)\). The Lasso method is also better than G1DBN at identifying the transcription factors (TFs) involved in the cell cycle of Saccharomyces cerevisiae.
MSC:
92C42 Systems biology, networks
92D10 Genetics and epigenetics
References:
[1] Altman, N., and Krzywinski, M. (2018). The curse(s) of dimensionality. Nat. Methods 15: 399-400, doi:10.1038/s41592-018-0019-x.
[2] Bernard, A., and Hartemink, A.J. (2005). Informative structure priors: joint learning of dynamic regulatory networks from multiple types of data. In: Biocomputing. World Scientific, Hawaii, USA, pp. 459-470.
[3] Campos, L.M.d. (2006). A scoring function for learning Bayesian networks based on mutual information and conditional independence tests. J. Mach. Learn. Res. 7: 2149-2187. · Zbl 1222.62036
[4] Chai, L.E., Loh, S.K., Low, S.T., Mohamad, M.S., Deris, S., and Zakaria, Z. (2014). A review on the computational approaches for gene regulatory network construction. Comput. Biol. Med. 48: 55-65, doi:10.1016/j.compbiomed.2014.02.011.
[5] Charbonnier, C., Chiquet, J., and Ambroise, C. (2010). Weighted-LASSO for structured network inference from time course data. Stat. Appl. Genet. Mol. Biol. 9, doi:10.2202/1544-6115.1519. · Zbl 1304.92085
[6] Chaturvedi, I., and Rajapakse, J.C. (2010). Building gene networks with time-delayed regulations. Pattern Recogn. Lett. 31: 2133-2137, doi:10.1016/j.patrec.2010.03.002.
[7] Cho, K.-H., Choo, S.-M., Jung, S., Kim, J.-R., Choi, H.-S., and Kim, J. (2007). Reverse engineering of gene regulatory networks. IET Syst. Biol. 1: 149-163, doi:10.1049/iet-syb:20060075.
[8] Csala, A., Voorbraak, F.P., Zwinderman, A.H., and Hof, M.H. (2017). Sparse redundancy analysis of high-dimensional genetic and genomic data. Bioinformatics 33: 3228-3234, doi:10.1093/bioinformatics/btx374.
[9] De Campos, C.P. and Ji, Q. (2011). Efficient structure learning of Bayesian networks using constraints. J. Mach. Learn. Res. 12: 663-689. · Zbl 1280.68226
[10] Delgado, F.M., and Gómez-Vela, F. (2019). Computational methods for gene regulatory networks reconstruction and analysis: a review. Artif. Intell. Med. 95: 133-145, doi:10.1016/j.artmed.2018.10.006.
[11] D’haeseleer, P., Wen, X., Fuhrman, S., and Somogyi, R. (1999). Linear modeling of mRNA expression levels during CNS development and injury. In: Biocomputing’99. World Scientific, Hawaii, USA, pp. 41-52.
[12] Dojer, N., Gambin, A., Mizera, A., Wilczyński, B., and Tiuryn, J. (2006). Applying dynamic Bayesian networks to perturbed gene expression data. BMC Bioinf. 7, doi:10.1186/1471-2105-7-249.
[13] Dondelinger, F., Lèbre, S., and Husmeier, D. (2010). Heterogeneous continuous dynamic Bayesian networks with flexible structure and inter-time segment information sharing. In: Proceedings of the 27th international conference on machine learning. Omnipress, Haifa, Israel, pp. 303-310.
[14] Dondelinger, F., Lèbre, S., and Husmeier, D. (2013). Non-homogeneous dynamic Bayesian networks with Bayesian regularization for inferring gene regulatory networks with gradually time-varying structure. Mach. Learn. 90: 191-230, doi:10.1007/s10994-012-5311-x. · Zbl 1260.92027
[15] Dong, X., Yambartsev, A., Ramsey, S.A., Thomas, L.D., Shulzhenko, N., and Morgun, A. (2015). Reverse enGENEering of regulatory networks from big data: a roadmap for biologists. Bioinf. Biol. Insights 9, BBI-S12467, doi:10.4137/bbi.s12467.
[16] Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle regression. Ann. Stat. 32: 407-499. · Zbl 1091.62054
[17] Ekstrøm, C.T. (2020). MESS: miscellaneous esoteric statistical scripts, R package version 0.5.7.
[18] Enright, C.G., and Madden, M.G. (2015). Modelling and monitoring the individual patient in real time. Springer, Cham, Switzerland, pp. 107-136.
[19] Friedman, N., Linial, M., Nachman, I., and Pe’er, D. (2000). Using Bayesian networks to analyze expression data. J. Comput. Biol. 7: 601-620.
[20] Friedman, N., Murphy, K.P., and Russell, S.J. (1998). Learning the structure of dynamic probabilistic networks. In: Cooper, G. F. and Moral, S. (Eds.), UAI ’98: proceedings of the fourteenth conference on uncertainty in artificial intelligence. University of Wisconsin Business School, Madison, Wisconsin, USA, pp. 139-147. Morgan Kaufmann.
[21] Gardner, T.S., Di Bernardo, D., Lorenz, D., and Collins, J.J. (2003). Inferring genetic networks and identifying compound mode of action via expression profiling. Science 301: 102-105, doi:10.1126/science.1081900.
[22] Grzegorczyk, M., and Husmeier, D. (2009). Non-stationary continuous dynamic Bayesian networks. In: Advances in neural information processing systems. Curran Associates, Inc., Vancouver, Canada, pp. 682-690.
[23] Grzegorczyk, M., and Husmeier, D. (2011). Non-homogeneous dynamic Bayesian networks for continuous data. Mach. Learn. 83: 355-419, doi:10.1007/s10994-010-5230-7. · Zbl 1274.62201
[24] Grzegorczyk, M., and Husmeier, D. (2012). A non-homogeneous dynamic Bayesian network with sequentially coupled interaction parameters for applications in systems and synthetic biology. Stat. Appl. Genet. Mol. Biol. 11, doi:10.1515/1544-6115.1761. · Zbl 1296.92039
[25] Halbersberg, D., and Lerner, B. (2020). Local to global learning of a latent dynamic Bayesian network. In: 24th European conference on artificial intelligence - ECAI 2020. IOS Press, Santiago de Compostela, Spain.
[26] Hartemink, A.J. (2005). Reverse engineering gene regulatory networks. Nat. Biotechnol. 23: 554-555, doi:10.1038/nbt0505-554.
[27] Hastie, T. and Efron, B. (2013). Least angle regression, Lasso and forward stagewise, R package version 1.2.
[28] Heckerman, D., Geiger, D., and Chickering, D.M. (1995). Learning Bayesian networks: the combination of knowledge and statistical data. Mach. Learn. 20: 197-243, doi:10.1007/bf00994016. · Zbl 0831.68096
[29] Hill, S.M., Lu, Y., Molina, J., Heiser, L.M., Spellman, P.T., Speed, T.P., Gray, J.W., Mills, G.B., and Mukherjee, S. (2012). Bayesian inference of signaling network topology in a cancer cell line. Bioinformatics 28: 2804-2810, doi:10.1093/bioinformatics/bts514.
[30] Hurd, P.J., and Nelson, C.J. (2009). Advantages of next-generation sequencing versus the microarray in epigenetic research. Briefings Funct. Genomics Proteomics 8: 174-183, doi:10.1093/bfgp/elp013.
[31] Husmeier, D., Dondelinger, F., and Lèbre, S. (2010). Inter-time segment information sharing for non-homogeneous dynamic Bayesian networks. In: Advances in neural information processing systems. Curran Associates, Inc., Vancouver, Canada, pp. 901-909.
[32] Iglesias-Martinez, L.F., Kolch, W., and Santra, T. (2016). BGRMI: a method for inferring gene regulatory networks from time-course gene expression data and its application in breast cancer research. Sci. Rep. 6: 37140, doi:10.1038/srep37140.
[33] Imoto, S., Higuchi, T., Goto, T., Tashiro, K., Kuhara, S., and Miyano, S. (2004). Combining microarrays and biological knowledge for estimating gene networks via Bayesian networks. J. Bioinf. Comput. Biol. 2: 77-98, doi:10.1142/s021972000400048x.
[34] Jassal, B., Matthews, L., Viteri, G., Gong, C., Lorente, P., Fabregat, A., Sidiropoulos, K., Cook, J., Gillespie, M., Haw, R., et al. (2019). The reactome pathway knowledgebase. Nucleic Acids Res. 48: D498-D503.
[35] Kanehisa, M., and Goto, S. (2000). KEGG: Kyoto Encyclopedia of genes and genomes. Nucleic Acids Res. 28: 27-30, doi:10.1093/nar/28.1.27.
[36] Kim, S.Y., Imoto, S., and Miyano, S. (2003). Inferring gene networks from time series microarray data using dynamic Bayesian networks. Briefings Bioinf. 4: 228-235, doi:10.1093/bib/4.3.228.
[37] Kim, S., Imoto, S., and Miyano, S. (2004). Dynamic Bayesian network and nonparametric regression for nonlinear modeling of gene networks from time series gene expression data. Biosystems 75: 57-65, doi:10.1016/j.biosystems.2004.03.004.
[38] Koivisto, M. and Sood, K. (2004). Exact Bayesian structure discovery in Bayesian networks. J. Mach. Learn. Res. 5: 549-573. · Zbl 1222.68234
[39] Koranda, M., Schleiffer, A., Endler, L., and Ammerer, G. (2000). Forkhead-like transcription factors recruit Ndd1 to the chromatin of G2/M-specific promoters. Nature 406: 94, doi:10.1038/35017589.
[40] Lähdesmäki, H., and Shmulevich, I. (2008). Learning the structure of dynamic Bayesian networks from time series and steady state measurements. Mach. Learn. 71: 185-217, doi:10.1007/s10994-008-5053-y.
[41] Lähdesmäki, H., Shmulevich, I., and Yli-Harja, O. (2003). On learning gene regulatory networks under the Boolean network model. Mach. Learn. 52: 147-167, doi:10.1023/a:1023905711304. · Zbl 1039.68162
[42] Lèbre, S. (2009). Inferring dynamic genetic networks with low order independencies. Stat. Appl. Genet. Mol. Biol. 8, doi:10.2202/1544-6115.1294. · Zbl 1276.62080
[43] Lèbre, S., Becq, J., Devaux, F., Stumpf, M.P., and Lelandais, G. (2010). Statistical inference of the time-varying structure of gene-regulation networks. BMC Syst. Biol. 4: 130, doi:10.1186/1752-0509-4-130.
[44] Lèbre, S. and Chiquet, J. (2012). G1DBN: a package performing dynamic Bayesian network inference, R package version 3.1.1.
[45] Lèbre, S., Dondelinger, F., and Husmeier, D. (2012). Nonhomogeneous dynamic Bayesian networks in systems biology. In: Next generation microarray bioinformatics. Springer, Clifton, USA, pp. 199-213.
[46] Li, Z., Li, P., Krishnan, A., and Liu, J. (2011). Large-scale dynamic gene regulatory network inference combining differential equation models with local dynamic Bayesian network analysis. Bioinformatics 27: 2686-2691, doi:10.1093/bioinformatics/btr454.
[47] Li, Y., and Ngom, A. (2013). The max-min high-order dynamic Bayesian network learning for identifying gene regulatory networks from time-series microarray data. In: 2013 IEEE symposium on computational intelligence in bioinformatics and computational biology (CIBCB). IEEE, Singapore, pp. 83-90.
[48] Liu, C., Jiang, J., Gu, J., Yu, Z., Wang, T., and Lu, H. (2016). High-dimensional omics data analysis using a variable screening protocol with prior knowledge integration (SKI). BMC Syst. Biol. 10: 118, doi:10.1186/s12918-016-0358-0.
[49] Ma, S., Kemmeren, P., Gresham, D., and Statnikov, A. (2014). De-novo learning of genome-scale regulatory networks in S. cerevisiae. PloS One 9: e106479, doi:10.1371/journal.pone.0106479.
[50] Margaritis, D., and Thrun, S. (2000). Bayesian network induction via local neighborhoods. In: Advances in neural information processing systems. The MIT Press, Denver, USA, pp. 505-511.
[51] Margolin, A.A., Nemenman, I., Basso, K., Wiggins, C., Stolovitzky, G., Dalla Favera, R., and Califano, A. (2006). ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinf. 7: S7, doi:10.1186/1471-2105-7-s1-s7.
[52] Mihajlovic, V., and Petkovic, M. (2001). Dynamic Bayesian networks: a state of the art. University of Twente, Enschede, the Netherlands.
[53] Murphy, K. and Mian, S. (1999). Modelling gene expression data using dynamic Bayesian networks, Technical report. Computer Science Division, University of California, Berkeley.
[54] Opgen-Rhein, R., and Strimmer, K. (2007). Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process. BMC Bioinf. 8: S3, doi:10.1186/1471-2105-8-s2-s3.
[55] Pearl, J. (2014). Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, Burlington, USA.
[56] Peña, J.M., Björkegren, J., and Tegnér, J. (2005). Learning dynamic Bayesian network models via cross-validation. Pattern Recogn. Lett. 26: 2295-2308, doi:10.1016/j.patrec.2005.04.005.
[57] Perrin, B.-E., Ralaivola, L., Mazurie, A., Bottani, S., Mallet, J., and d’Alche Buc, F. (2003). Gene networks inference using dynamic Bayesian networks. Bioinformatics 19: ii138-ii148, doi:10.1093/bioinformatics/btg1071.
[58] Pirgazi, J., and Khanteymoori, A.R. (2018). A robust gene regulatory network inference method base on Kalman filter and linear regression. PloS One 13: e0200094, doi:10.1371/journal.pone.0200094.
[59] Rajapakse, J.C., and Chaturvedi, I. (2010). Gene regulatory networks with variable-order dynamic Bayesian networks. In: The 2010 international joint conference on neural networks (IJCNN). IEEE, Barcelona, Spain, pp. 1-5.
[60] Robinson, J.W., and Hartemink, A.J. (2009). Non-stationary dynamic Bayesian networks. In: Advances in neural information processing systems. Curran Associates, Inc., Vancouver, Canada, pp. 1369-1376.
[61] Robinson, J.W., Hartemink, A.J., and Ghahramani, Z. (2010). Learning non-stationary dynamic Bayesian networks. J. Mach. Learn. Res. 11. · Zbl 1242.68244
[62] Schrynemackers, M., Küffner, R., and Geurts, P. (2013). On protocols and measures for the validation of supervised methods for the inference of biological networks. Front. Genet. 4: 262, doi:10.3389/fgene.2013.00262.
[63] Shafiee Kamalabad, M. and Grzegorczyk, M. (2019). Non-homogeneous dynamic Bayesian networks with edge-wise sequentially coupled parameters. Bioinformatics 36: 1198-1207.
[64] Shermin, A., and Orgun, M.A. (2009). Using dynamic Bayesian networks to infer gene regulatory networks from expression profiles. In: Proceedings of the 2009 ACM symposium on applied computing. Association for Computing Machinery, New York, NY, United States, Honolulu, Hawaii, USA, pp. 799-803.
[65] Shmelkov, E., Tang, Z., Aifantis, I., and Statnikov, A. (2011). Assessing quality and completeness of human transcriptional regulatory pathways on a genome-wide scale. Biol. Direct 6: 15, doi:10.1186/1745-6150-6-15.
[66] Song, L., Kolar, M., and Xing, E.P. (2009). Time-varying dynamic Bayesian networks. In: Advances in neural information processing systems. Curran Associates, Inc., Vancouver, Canada, pp. 1732-1740.
[67] Spellman, P.T., Sherlock, G., Zhang, M.Q., Iyer, V.R., Anders, K., Eisen, M.B., Brown, P.O., Botstein, D., and Futcher, B. (1998). Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 9: 3273-3297, doi:10.1091/mbc.9.12.3273.
[68] Tastan, O., Qi, Y., Carbonell, J.G., and Klein-Seetharaman, J. (2009). Prediction of interactions between HIV-1 and human proteins by information integration. In: Biocomputing 2009. World Scientific, Hawaii, USA, pp. 516-527.
[69] Teixeira, M.C., Monteiro, P., Jain, P., Tenreiro, S., Fernandes, A.R., Mira, N.P., Alenquer, M., Freitas, A.T., Oliveira, A.L., and Sa-Correia, I. (2006). The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Nucleic Acids Res. 34: D446-D451, doi:10.1093/nar/gkj013.
[70] Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. J. Roy. Stat. Soc. B 58: 267-288, doi:10.1111/j.2517-6161.1996.tb02080.x. · Zbl 0850.62538
[71] Tsai, M.-J., Wang, J.-R., Ho, S.-J., Shu, L.-S., Huang, W.-L., and Ho, S.-Y. (2020). GREMA: modelling of emulated gene regulatory networks with confidence levels based on evolutionary intelligence to cope with the underdetermined problem. Bioinformatics 36: 3833-3840, doi:10.1109/icce-taiwan49838.2020.9258351.
[72] Tucker, A., Liu, X., and Ogden-Swift, A. (2001). Evolutionary learning of dynamic probabilistic models with large time lags. Int. J. Intell. Syst. 16: 621-645, doi:10.1002/int.1027. · Zbl 0979.68083
[73] Vinh, N.X., Chetty, M., Coppel, R., and Wangikar, P.P. (2011). GlobalMIT: learning globally optimal dynamic Bayesian network with the mutual information test criterion. Bioinformatics 27: 2765-2766, doi:10.1093/bioinformatics/btr457.
[74] Vinh, N.X., Chetty, M., Coppel, R., and Wangikar, P.P. (2012a). Gene regulatory network modeling via global optimization of high-order dynamic Bayesian network. BMC Bioinf. 13: 131.
[75] Vinh, N.X., Chetty, M., Coppel, R., and Wangikar, P.P. (2012b). Local and global algorithms for learning dynamic Bayesian networks. In: 2012 IEEE 12th international conference on data mining. IEEE, Brussels, pp. 685-694.
[76] Vohradsky, J. (2001). Neural network model of gene expression. Faseb. J. 15: 846-854, doi:10.1096/fj.00-0361com.
[77] Wexler, E.M., Rosen, E., Lu, D., Osborn, G.E., Martin, E., Raybould, H., and Geschwind, D.H. (2011). Genome-wide analysis of a Wnt1-regulated transcriptional network implicates neurodegenerative pathways. Sci. Signal. 4: ra65-ra65, doi:10.1126/scisignal.2002282.
[78] Wille, A., and Bühlmann, P. (2006). Low-order conditional independence graphs for inferring genetic networks. Stat. Appl. Genet. Mol. Biol. 5, doi:10.2202/1544-6115.1170. · Zbl 1166.62374
[79] Xing, L., Guo, M., Liu, X., Wang, C., and Zhang, L. (2018). Gene regulatory networks reconstruction using the flooding-pruning hill-climbing algorithm. Genes 9: 342, doi:10.3390/genes9070342.
[80] Xing, Z., and Wu, D. (2006). Modeling multiple time units delayed gene regulatory network using dynamic Bayesian network. In: Sixth IEEE international conference on data mining-workshops (ICDMW’06). IEEE, Hong Kong, pp. 190-195.
[81] Xu, C., and Jackson, S.A. (2019). Machine learning and complex biological data. Genome Biol. 20, doi:10.1186/s13059-019-1689-0.
[82] Zhang, Y., Deng, Z., Jiang, H., and Jia, P. (2006). Dynamic Bayesian network (DBN) with structure expectation maximization (SEM) for modeling of gene network from time series gene expression data. In: BIOCOMP. CSREA Press, Las Vegas, USA, pp. 41-47.
[83] Zhang, Y., Deng, Z., Jiang, H., and Jia, P. (2007). Inferring gene regulatory networks from multiple data sources via a dynamic Bayesian network with structural EM. In: Cohen-Boulakia, S. and Tannen, V. (Eds.), Data integration in the life sciences, pp. 204-214. Springer Berlin Heidelberg, Berlin, Heidelberg.
[84] Zou, M., and Conzen, S.D. (2004). A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics 21: 71-79, doi:10.1093/bioinformatics/bth463.
[85] Zuo, Y., Yu, G., Tadesse, M.G., and Ressom, H.W. (2014). Biological network inference using low order partial correlation. Methods 69: 266-273, doi:10.1016/j.ymeth.2014.06.010.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.