Structure optimization for large gene networks based on greedy strategy. (English) Zbl 1411.92095

Summary: In the last few years, gene networks have become one of the most important tools for modeling biological processes. Among other uses, these networks visually show biological relationships between genes. However, due to the large amount of genetic data currently being generated, their size has grown to the point of being unmanageable. To solve this problem, computational approaches such as heuristics-based methods can be used to analyze and optimize a gene network’s structure by pruning irrelevant relationships. In this paper we present a new method, called GeSOp, to optimize large gene network structures. The method is able to prune a considerable share of the irrelevant relationships in the input network. To do so, it uses a greedy heuristic to obtain the most relevant subnetwork. The performance of our method was tested by means of two experiments on gene networks obtained from different organisms. The first experiment shows that GeSOp is able not only to significantly reduce the size of the network, but also to maintain the biological information ratio. The second experiment checks its ability to improve the biological indicators of the network. Hence, the results presented show that GeSOp is a reliable method to optimize and improve the structure of large gene networks.
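
The summary does not spell out the greedy criterion GeSOp uses, so the following is only a generic illustration of greedy edge pruning, not the authors' algorithm. It assumes a weighted, undirected gene network given as (gene, gene, weight) triples and keeps a maximum-weight spanning forest in the style of Kruskal's algorithm: edges are visited from strongest to weakest, and an edge is kept only if it joins two genes that are not yet connected. The function name `greedy_prune`, the toy gene labels, and the weights are all hypothetical.

```python
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable, float]


def greedy_prune(edges: List[Edge]) -> List[Edge]:
    """Keep a maximum-weight spanning forest (Kruskal-style greedy pruning).

    Assumption: a higher weight means a more relevant relationship.
    This is an illustrative stand-in, not the GeSOp method itself.
    """
    parent: Dict[Hashable, Hashable] = {}

    def find(node: Hashable) -> Hashable:
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    kept: List[Edge] = []
    # Greedy step: consider the strongest relationships first.
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The edge links two separate components, so keep it.
            parent[root_u] = root_v
            kept.append((u, v, w))
        # Otherwise the genes are already connected and the edge is pruned.
    return kept


# Toy network with hypothetical genes and weights.
toy_edges = [
    ("g1", "g2", 0.9),
    ("g2", "g3", 0.8),
    ("g1", "g3", 0.3),
    ("g3", "g4", 0.7),
    ("g2", "g4", 0.2),
]
pruned = greedy_prune(toy_edges)
```

On this toy input the two weakest edges are dropped while all four genes stay connected. A spanning forest is the most aggressive connectivity-preserving pruning; a method aiming to preserve more biological information would likely retain additional edges under some relevance criterion.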


92C40 Biochemistry, molecular biology
92C42 Systems biology, networks
92-04 Software, source code, etc. for problems pertaining to biology

