Efficacy of function specific 3D-motifs in enzyme classification according to their EC-numbers. (English) Zbl 1411.92236

Summary: Structural genomics projects are producing a growing number of protein structures of unknown function, which has made protein function prediction an important subject in bioinformatics. Among the diverse function prediction methods, searching unknown protein structures for known 3D-motifs that are associated with functional elements is one of the most biologically meaningful. Homologous enzymes inherit such motifs in their active sites from common ancestors. However, slight differences in the properties of these motifs result in variation in the reactions and substrates of the enzymes. In this study, we examined the possibility of discriminating highly related active site patterns according to their EC-numbers by 3D-motifs. For each EC-number, the spatial arrangement of the active site that has the minimum average distance to the other active sites with the same function was selected as a representative 3D-motif. In order to characterize the motifs, various points in active site elements were tested. The results demonstrated the possibility of predicting the full EC-number of enzymes by 3D-motifs. However, the discriminating power of 3D-motifs varies among enzyme families and depends on selecting appropriate points and features.
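
The representative-selection rule in the summary (pick the active site with the minimum average distance to the other active sites of the same EC-number) amounts to choosing a medoid. The following is a minimal sketch of that rule only, not the authors' implementation. It assumes a precomputed symmetric matrix of pairwise distances between active sites; the summary does not say how the distances are measured, so the function name `select_representative` and the example values are hypothetical.

```python
from typing import Sequence


def select_representative(distances: Sequence[Sequence[float]]) -> int:
    """Return the index of the active site whose average distance to all
    other sites in the same group is smallest (the medoid).

    `distances` is assumed to be a square, symmetric matrix with zeros on
    the diagonal. How the distances are computed is an assumption left to
    the caller; it is not specified by the summary.
    """
    n = len(distances)
    if n == 0:
        raise ValueError("at least one active site is required")
    if n == 1:
        return 0  # a single site is trivially its own representative

    best_index = 0
    best_average = float("inf")
    for i in range(n):
        # Average over the other sites only, excluding the site itself.
        average = sum(distances[i][j] for j in range(n) if j != i) / (n - 1)
        if average < best_average:
            best_index, best_average = i, average
    return best_index


# Hypothetical pairwise distances for three active sites sharing one EC-number.
example = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.5],
    [2.0, 1.5, 0.0],
]
representative = select_representative(example)
```

In this made-up example the average distances are 1.5, 1.25 and 1.75, so the second site (index 1) would be chosen. Ties are resolved in favour of the earliest index, which is a choice of this sketch rather than something stated in the summary.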

MSC:

92D20 Protein sequences, DNA sequences
