The correlation of gene expression and co-regulated gene patterns in characteristic KEGG pathways. (English) Zbl 1407.92048

Summary: Chromosome- and pathway-based techniques for genomic data analysis are attracting great interest as a means of understanding disease mechanisms. However, few studies have addressed how well machine learning methods can incorporate pathway information when analyzing microarray data. In this paper, we identified characteristic pathways by combining the out-of-bag (OOB) classification error rates of random forests with pathway information. Within each characteristic pathway, the correlation of gene expression was studied, and the co-regulated gene patterns under different biological conditions were mined with the mining attribute profile (MAP) algorithm. The discovered co-regulated gene patterns were then clustered by average-linkage hierarchical clustering. The results showed that genes in the same characteristic pathway had similar expression. Furthermore, two characteristic pathways were found to exhibit co-regulated gene patterns: one contained 108 patterns and the other contained one pattern. In the cluster analysis, the smallest similarity coefficient among clusters exceeded 0.623, which indicated that co-regulated patterns under different biological conditions were more similar within the same characteristic pathway. The methods discussed in this paper can provide additional insight into the study of microarray data.
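The average-linkage clustering step mentioned in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the use of Pearson correlation as the similarity coefficient, the function names `pearson` and `average_linkage`, and the toy patterns are all assumptions made for illustration. The sketch repeatedly merges the two clusters whose mean pairwise similarity is highest and records that similarity at each merge.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(patterns, similarity=pearson):
    """Agglomerative clustering that returns the merge history.

    Each history entry is (members_a, members_b, avg_similarity), where the
    similarity between two clusters is the mean over all cross-cluster pairs.
    """
    clusters = [(i,) for i in range(len(patterns))]
    history = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            sims = [similarity(patterns[i], patterns[j]) for i in a for j in b]
            avg = sum(sims) / len(sims)
            if best is None or avg > best[2]:
                best = (a, b, avg)
        a, b, avg = best
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        history.append((a, b, avg))
    return history


# Hypothetical toy data: rows are patterns, columns are biological conditions.
toy_patterns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.1],
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.1, 4.0, 2.0],
]
history = average_linkage(toy_patterns)
for a, b, avg in history:
    print(a, b, round(avg, 3))
```

In this toy input the two rising patterns and the two falling patterns pair up first, and the final merge joins two anti-correlated groups. The smallest similarity recorded in `history` plays the role of a cut-off for reading the dendrogram, comparable in spirit to the 0.623 value reported in the summary, although the similarity coefficient used in the paper may differ.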

MSC:

92C40 Biochemistry, molecular biology
68T10 Pattern recognition, speech recognition

Software:

flexclust; KEGG
