Autoregressive models for gene regulatory network inference: sparsity, stability and causality issues. (English) Zbl 1308.92032

Summary: Reconstructing gene regulatory networks from high-throughput measurements is a key problem in functional genomics. It is also a canonical learning problem and has therefore attracted considerable attention in both the informatics and the statistical learning literature. Numerous approaches have been proposed, ranging from simple clustering to rather involved dynamic Bayesian network modeling, as well as hybrid approaches that combine several modeling steps, such as ordinary differential equations coupled with genome annotation. These approaches are tailored to the type of data being employed. Available data sources include static steady-state data and time-course data, obtained either for wild-type phenotypes or from perturbation experiments.
This review focuses on the class of autoregressive models that use time-course data to infer gene regulatory networks. The central themes of sparsity, stability and causality are discussed, as is the ability to integrate prior knowledge so that these models can be used successfully for the learning task at hand.
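
As an illustration of how the three themes interact, the following is a minimal sketch, not the method of the reviewed paper. It assumes a first-order vector autoregressive model x_t = A x_{t-1} + noise, estimates A with an L1 penalty (sparsity) by proximal gradient descent, checks the spectral radius of the estimate (stability), and reads nonzero entries A[i, j] as candidate Granger-causal edges from gene j to gene i (causality). All function names, the penalty weight `lam`, and the tolerance `tol` are hypothetical choices made for this example.

```python
import numpy as np


def simulate_var1(A, n_steps, noise_std=0.1, seed=0):
    """Simulate x_t = A x_{t-1} + noise; returns an (n_steps, p) array."""
    rng = np.random.default_rng(seed)
    p = A.shape[0]
    X = np.zeros((n_steps, p))
    X[0] = rng.normal(size=p)
    for t in range(1, n_steps):
        X[t] = A @ X[t - 1] + noise_std * rng.normal(size=p)
    return X


def soft_threshold(Z, tau):
    """Elementwise proximal operator of tau * |z| (shrinks toward zero)."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)


def fit_sparse_var1(X, lam, n_iter=500):
    """L1-penalized least squares for a VAR(1) coefficient matrix.

    Minimizes (1 / (2n)) * ||X[1:] - X[:-1] @ A.T||_F^2 + lam * ||A||_1
    with proximal gradient steps (ISTA) of size 1 / L.
    """
    past, future = X[:-1], X[1:]
    n, p = past.shape
    # Lipschitz constant of the smooth part: largest eigenvalue of past.T @ past / n.
    L = np.linalg.norm(past, 2) ** 2 / n
    A = np.zeros((p, p))
    for _ in range(n_iter):
        grad = (A @ past.T - future.T) @ past / n
        A = soft_threshold(A - grad / L, lam / L)
    return A


def spectral_radius(A):
    """Largest eigenvalue modulus; a VAR(1) process is stable if this is < 1."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))


def granger_edges(A, tol=1e-8):
    """List (source j, target i) pairs for entries with |A[i, j]| > tol."""
    return [(int(j), int(i)) for i, j in np.argwhere(np.abs(A) > tol)]
```

A typical use would be to call `fit_sparse_var1` on an expression matrix with time points in rows and genes in columns, inspect `spectral_radius` of the estimate, and pass it to `granger_edges`. In practice the penalty weight would have to be chosen by a model selection procedure, which this sketch leaves out.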


92C40 Biochemistry, molecular biology
60J20 Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.)

