Prediction of interface residue based on the features of residue interaction network. (English) Zbl 1393.92011
Summary: Protein-protein interactions play a crucial role in cellular biological processes. Interface prediction can improve our understanding of the molecular mechanisms of the related processes and functions. In this work, we propose a classification method to recognize interface residues based on the features of a weighted residue interaction network. The random forest algorithm is used for the prediction, with 16 network parameters and the B-factor as the elements of the input feature vector. Compared with other similar work, the method is feasible and effective. The relative importance of these features is also analyzed to identify the key features for the prediction, and the biological meaning of the important features is explained. The results can be used in related work on structure-function relationship analysis via a residue interaction network model.
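As a rough illustration of the kind of input the summary describes, the following sketch builds a small weighted residue interaction network and assembles a per-residue feature vector from a few network parameters plus the B-factor. Everything specific here is an assumption made for illustration and is not taken from the paper: the 7.0 Å contact cutoff, the inverse-distance edge weight, the choice of degree, strength and clustering coefficient as the network parameters (the summary does not list the 16 actually used), and the toy coordinates and B-factors. In the method summarized above, vectors of this kind would form the rows of the input to a random forest classifier.

```python
import math

# Hypothetical toy data: one representative atom per residue (e.g. C-alpha)
# and one B-factor per residue. These values are invented for illustration.
COORDS = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (10.0, 0.0, 0.0)]
B_FACTORS = [12.5, 18.0, 15.2, 30.1]


def build_network(coords, cutoff=7.0):
    """Link residues closer than `cutoff` (assumed value, in angstroms).

    Edge weight is 1/distance, a placeholder weighting chosen for this sketch.
    Returns an adjacency dict of the form {i: {j: weight}}.
    """
    adj = {i: {} for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                adj[i][j] = adj[j][i] = 1.0 / d
    return adj


def clustering(adj, i):
    """Unweighted local clustering coefficient of node i."""
    nbrs = list(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for a in range(k)
        for b in range(a + 1, k)
        if nbrs[b] in adj[nbrs[a]]
    )
    return 2.0 * links / (k * (k - 1))


def feature_vector(adj, b_factors, i):
    """[degree, strength, clustering coefficient, B-factor] for residue i."""
    degree = len(adj[i])
    strength = sum(adj[i].values())  # sum of incident edge weights
    return [degree, strength, clustering(adj, i), b_factors[i]]


adj = build_network(COORDS)
features = [feature_vector(adj, B_FACTORS, i) for i in range(len(COORDS))]
for i, fv in enumerate(features):
    print(i, fv)
```

Keeping the network as a plain adjacency dictionary makes it easy to add further per-residue parameters (for example betweenness or closeness) as extra entries in the vector without changing the rest of the pipeline.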
92C40 Biochemistry, molecular biology
92-08 Computational methods for problems pertaining to biology