A network-based feature selection approach to identify metabolic signatures in disease. (English) Zbl 1337.92080

Summary: The identification and interpretation of metabolic biomarkers is a challenging task. In this context, network-based approaches have increasingly become a key technology in systems biology, allowing complex interactions in biological systems to be captured. In this work, we introduce a novel network-based method to identify highly predictive biomarker candidates for disease. First, we infer two different types of networks: (i) correlation networks, and (ii) a new type of network called ratio networks. Based on these networks, we introduce scores to prioritize features using topological descriptors of the vertices. To evaluate our method, we use an example dataset in which quantitative targeted MS/MS analysis was applied to a total of 52 blood samples from 22 persons with obesity \((\mathrm{BMI}>30)\) and 30 healthy controls. Using our network-based feature selection approach, we identified highly discriminating metabolites for obesity (F-score \(>0.85\), accuracy \(> 85\%\)), some of which could be verified by the literature.
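
The general idea of scoring features by vertex topology in a correlation network can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Pearson threshold of 0.8, the use of vertex degree as the topological descriptor, the scoring of a metabolite by the absolute difference of its degree between the case and control networks, and the function names themselves. Ratio networks are not sketched because the summary does not define them.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def correlation_network(data, threshold=0.8):
    """Edge set linking metabolites whose |correlation| reaches the threshold.

    data maps a metabolite name to its concentrations, one value per sample.
    The threshold is a hypothetical choice, not a value from the paper.
    """
    edges = set()
    for a, b in combinations(sorted(data), 2):
        if abs(pearson(data[a], data[b])) >= threshold:
            edges.add((a, b))
    return edges


def degree_scores(nodes, edges):
    """Vertex degree, used here as a simple stand-in topological descriptor."""
    degree = {n: 0 for n in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def rank_by_degree_change(cases, controls, threshold=0.8):
    """Rank metabolites by |degree in case network - degree in control network|.

    This scoring rule is an assumption for illustration only.
    """
    nodes = sorted(cases)
    d_case = degree_scores(nodes, correlation_network(cases, threshold))
    d_ctrl = degree_scores(nodes, correlation_network(controls, threshold))
    scores = {n: abs(d_case[n] - d_ctrl[n]) for n in nodes}
    # Highest score first; ties are broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Calling `rank_by_degree_change(cases, controls)` with two dictionaries that share the same metabolite names returns a list of `(name, score)` pairs, highest score first. The top-ranked metabolites could then be passed to a classifier to estimate an F-score and accuracy.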

MSC:

92C42 Systems biology, networks
92C50 Medical applications (general)

Software:

graph; caret; igraph; R; QuACN

References:

[1] Alberts, B.; Johnson, A.; Lewis, J.; Raff, M.; Roberts, K.; Walter, P., Molecular Biology of the Cell (2007), Garland Science
[2] Allen, T. L.; Matthews, V. B.; Febbraio, M. A., Overcoming insulin resistance with ciliary neurotrophic factor, Handb. Exp. Pharmacol., 203, 179-199 (2011)
[3] Altman, D. G., Practical Statistics for Medical Research (1991), Chapman & Hall
[4] Ambroise, C.; McLachlan, G., Selection bias in gene extraction on the basis of microarray gene-expression data, Proc. Natl. Acad. Sci. U.S.A., 99, 10, 6562 (2002) · Zbl 1034.92013
[5] Baumgartner, C.; Graber, A., Successes and New Directions in Data Mining, Data Mining and Knowledge Discovery in Metabolomics (2007), Idea Group Inc., pp. 141-166 (Chapter 7)
[6] Bergmann, S.; Ihmels, J.; Barkai, N., Similarities and differences in genome-wide expression data of six organisms, PLoS Biol., 2, 1, E9 (2004)
[7] Boyer, F.; Morgat, A.; Labarre, L.; Pothier, J.; Viari, A., Syntons, metabolons and interactons: an exact graph-theoretical approach for exploring neighbourhood between genomic and functional data, Bioinformatics, 21, 23, 4209-4215 (2005)
[8] Cline, M. S.; Smoot, M.; Cerami, E.; Kuchinsky, A.; Landys, N.; Workman, C.; Christmas, R.; Avila-Campilo, I.; Creech, M.; Gross, B.; Hanspers, K.; Isserlin, R.; Kelley, R.; Killcoyne, S.; Lotia, S.; Maere, S.; Morris, J.; Ono, K.; Pavlovic, V.; Pico, A. R.; Vailaya, A.; Wang, P.-L.; Adler, A.; Conklin, B. R.; Hood, L.; Kuiper, M.; Sander, C.; Schmulevich, I.; Schwikowski, B.; Warner, G. J.; Ideker, T.; Bader, G. D., Integration of biological networks and gene expression data using Cytoscape, Nat. Protocols, 2, 10, 2366-2382 (2007)
[9] Cortes, C.; Vapnik, V., Support-vector networks, Mach. Learn., 20, 3, 273-297 (1995) · Zbl 0831.68098
[10] Csardi, G.; Nepusz, T., The igraph software package for complex network research, InterJ. Complex Syst., 1695 (2006)
[11] Dehmer, M.; Mowshowitz, A., A history of graph entropy measures, Inf. Sci., 1, 57-78 (2011) · Zbl 1204.94050
[12] Dehmer, M.; Barbarini, N.; Varmuza, K.; Graber, A., Novel topological descriptors for analyzing biological networks, BMC Struct. Biol., 10, 18 (2010)
[13] Diestel, R., Graph Theory (2005), Springer-Verlag · Zbl 1074.05001
[14] Emmert-Streib, F.; Dehmer, M., Networks for systems biology: conceptual connection of data and function, IET Syst. Biol., 5, 3, 185-207 (2011)
[15] Fell, D. A.; Wagner, A., The small world of metabolism, Nat. Biotechnol., 18, 11, 1121-1122 (2000)
[16] Fukushima, A.; Kusano, M.; Redestig, H.; Arita, M.; Saito, K., Metabolomic correlation-network modules in Arabidopsis based on a graph-clustering approach, BMC Syst. Biol., 5, 1 (2011)
[17] Gentleman, R.; Whalen, E.; Huber, W.; Falcon, S., Graph: A Package to Handle Graph Data Structures, R package version 1.28.0 (2010), http://CRAN.R-project.org/package=graph
[18] Hastie, T.; Tibshirani, R.; Friedman, J. H., The Elements of Statistical Learning (2001), Springer, Berlin, New York
[19] He, H.; Garcia, E. A., Learning from imbalanced data, IEEE Trans. Knowledge Data Eng., 21, 9, 1263-1284 (2009)
[20] Idle, J.; Gonzalez, F., Metabolomics, Cell Metab., 6, 348-351 (2007)
[21] Jeong, H.; Tombor, B.; Albert, R.; Oltvai, Z. N.; Barabási, A. L., The large-scale organization of metabolic networks, Nature, 407, 6804, 651-654 (2000)
[22] John, G. H.; Kohavi, R.; Pfleger, K., Irrelevant features and the subset selection problem, in: Proceedings of the 11th International Conference on Machine Learning (1994)
[23] Junker, B. H.; Koschützki, D.; Schreiber, F., Exploration of biological network centralities with CentiBiN, BMC Bioinformatics, 7, 219 (2006)
[24] Kohavi, R.; John, G. H., The wrapper approach, (Liu, H.; Motoda, H., Feature Selection for Knowledge Discovery and Data Mining (1998), Kluwer Academic Publishers), 33-50
[25] Konstantinova, E. V.; Vidyuk, M. V., Discriminating tests of information and topological indices. Animals and trees, J. Chem. Inf. Comput. Sci., 43, 6, 1860-1871 (2003)
[26] Koschützki, D.; Schwöbbermeyer, H.; Schreiber, F., Ranking of network elements based on functional substructures, J. Theor. Biol., 248, 3, 471-479 (2007) · Zbl 1451.92147
[27] Kuhn, M., with contributions from Wing, J.; Weston, S.; Williams, A.; Keefer, C.; Engelhardt, A., Caret: Classification and Regression Training, R package version 4.91 (2011), http://CRAN.R-project.org/package=caret
[28] Li, Z.-Y.; Zheng, X.-Y.; Gao, X.-X.; Zhou, Y.-Z.; Sun, H.-F.; Zhang, L.-Z.; Guo, X.-Q.; Du, G.-H.; Qin, X.-M., Study of plasma metabolic profiling and biomarkers of chronic unpredictable mild stress rats based on gas chromatography/mass spectrometry, Rapid Commun. Mass Spectrom., 24, 24, 3539-3546 (2010)
[29] Masaki, T.; Yoshimatsu, H., Neuronal histamine and its receptors in obesity and diabetes, Curr. Diabetes Rev., 3, 3, 212-216 (2007)
[30] Moroz, J.; Turner, J.; Slupsky, C.; Fallone, G.; Syme, A., Tumour xenograft detection through quantitative analysis of the metabolic profile of urine in mice, Phys. Med. Biol., 56, 3, 535-556 (2011)
[31] Morris, S. M., Enzymes of arginine metabolism, J. Nutr., 134, 10 Suppl, 2743S-2747S (2004), (Discussion 2765S-2767S)
[32] Müller, L. A.J.; Kugler, K. G.; Netzer, M.; Graber, A.; Dehmer, M., A network-based approach to classify the three domains of life, Biol. Direct., 6, 53 (2011)
[33] Müller, L. A.J.; Kugler, K. G.; Dander, A.; Graber, A.; Dehmer, M., QuACN: an R package for analyzing complex biological networks quantitatively, Bioinformatics, 27, 1, 140-141 (2011)
[34] Netzer, M.; Millonig, G.; Osl, M.; Pfeifer, B.; Praun, S.; Villinger, J.; Vogel, W.; Baumgartner, C., A new ensemble-based algorithm for identifying breath gas marker candidates in liver disease using ion molecule reaction mass spectrometry, Bioinformatics, 25, 7, 941-947 (2009)
[35] Netzer, M.; Weinberger, K. M.; Handler, M.; Seger, M.; Fang, X.; Kugler, K. G.; Graber, A.; Baumgartner, C., Profiling the human response to physical exercise: a computational strategy for the identification and kinetic analysis of metabolic biomarkers, J. Clin. Bioinf., 1, 1, 34 (2011)
[36] O’Quinn, P. R.; Knabe, D. A.; Wu, G., Arginine catabolism in lactating porcine mammary tissue, J. Anim. Sci., 80, 2, 467-474 (2002)
[37] Osl, M.; Dreiseitl, S.; Pfeifer, B.; Weinberger, K.; Klocker, H.; Bartsch, G.; Schäfer, G.; Tilg, B.; Graber, A.; Baumgartner, C., A new rule-based algorithm for identifying metabolic markers in prostate cancer using tandem mass spectrometry, Bioinformatics, 24, 24, 2908-2914 (2008)
[38] Osl, M.; Dreiseitl, S.; Cerqueira, F.; Netzer, M.; Pfeifer, B.; Baumgartner, C., Demoting redundant features to improve the discriminatory ability in cancer data, J. Biomed. Inf., 42, 4, 721-725 (2009)
[39] Pan, W., A comparative review of statistical methods for discovering differentially expressed genes in replicated microarray experiments, Bioinformatics, 18, 4, 546-554 (2002)
[40] Pavlopoulos, G. A.; Wegener, A.-L.; Schneider, R., A survey of visualization tools for biological network analysis, BioData Min., 1, 12 (2008)
[41] R Development Core Team, R: A Language and Environment for Statistical Computing (2011), R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0, http://www.R-project.org
[42] Roberts, M. J.; Schirra, H. J.; Lavin, M. F.; Gardiner, R. A., Metabolomics: a novel approach to early and noninvasive prostate cancer detection, Korean J. Urol., 52, 2, 79-89 (2011)
[43] Saeys, Y.; Inza, I.; Larrañaga, P., A review of feature selection techniques in bioinformatics, Bioinformatics, 23, 2507-2517 (2007)
[44] Schirmer, M. D.; Harper, A. E., Adaptive responses of mammalian histidine-degrading enzymes, J. Biol. Chem., 245, 5, 1204-1211 (1970)
[45] Silventoinen, K.; Sans, S.; Tolonen, H.; Monterde, D.; Kuulasmaa, K.; Kesteloot, H.; Tuomilehto, J.; WHO MONICA Project, Trends in obesity and energy supply in the WHO MONICA Project, Int. J. Obes. Relat. Metab. Disord., 28, 5, 710-718 (2004)
[46] Skorobogatov, A.; Dobrynin, A., Metric analysis of graphs, Commun. Math. Comput. Chem., 23, 105-151 (1988) · Zbl 0666.05065
[47] Stifel, F. B.; Herman, R. H., Histidine metabolism, Am. J. Clin. Nutr., 24, 2, 207-217 (1971)
[48] Sugino, T.; Shirai, T.; Kajimoto, Y.; Kajimoto, O., L-ornithine supplementation attenuates physical fatigue in healthy volunteers by modulating lipid and amino acid metabolism, Nutr. Res., 28, 11, 738-743 (2008)
[49] Tai, E. S.; Tan, M. L.S.; Stevens, R. D.; Low, Y. L.; Muehlbauer, M. J.; Goh, D. L.M.; Ilkayeva, O. R.; Wenner, B. R.; Bain, J. R.; Lee, J. J.M.; Lim, S. C.; Khoo, C. M.; Shah, S. H.; Newgard, C. B., Insulin resistance is associated with a metabolic profile of altered protein metabolism in Chinese and Asian-Indian men, Diabetologia, 53, 4, 757-767 (2010)
[50] Todeschini, R.; Consonni, V., Molecular Descriptors for Chemoinformatics (2009), Vch Pub
[51] Todeschini, R.; Consonni, V.; Mannhold, R., Handbook of Molecular Descriptors (2002), Wiley-VCH, Germany
[52] Walter, M.; Kottke, T.; Stark, H., The histamine H4 receptor: targeting inflammatory disorders, Eur. J. Pharmacol., 668, 1-2, 1-5 (2011)
[53] Wang, T. J.; Larson, M. G.; Vasan, R. S.; Cheng, S.; Rhee, E. P.; McCabe, E.; Lewis, G. D.; Fox, C. S.; Jacques, P. F.; Fernandez, C.; O’Donnell, C. J.; Carr, S. A.; Mootha, V. K.; Florez, J. C.; Souza, A.; Melander, O.; Clish, C. B.; Gerszten, R. E., Metabolite profiles and the risk of developing diabetes, Nat. Med., 17, 448-453 (2011)
[54] Weinberger, K. M., Metabolomics in diagnosing metabolic diseases, Ther. Umsch., 65, 9, 487-491 (2008)
[55] Wu, H.; Xue, R.; Dong, L.; Liu, T.; Deng, C.; Zeng, H.; Shen, X., Metabolomic profiling of human urine in hepatocellular carcinoma patients using gas chromatography/mass spectrometry, Anal. Chim. Acta, 648, 1, 98-104 (2009)
[56] Zhang, Y.; Guo, K.; LeBlanc, R. E.; Loh, D.; Schwartz, G. J.; Yu, Y.-H., Increasing dietary leucine intake reduces diet-induced obesity and improves glucose and cholesterol metabolism in mice via multimechanisms, Diabetes, 56, 6, 1647-1654 (2007)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.