zbMATH — the first resource for mathematics

PSSM-Suc: accurately predicting succinylation using position specific scoring matrix into bigram for feature extraction. (English) Zbl 1381.92002
Summary: Post-translational modification (PTM) is a covalent and enzymatic modification of proteins that helps diversify the proteome. Among the many reported PTMs with essential roles in cellular functioning, lysine succinylation has emerged as a subject of particular interest. Because its experimental identification remains costly and time-consuming, computational predictors have recently been proposed to address this problem. However, the performance of current predictors is still very limited. In this paper, we propose a new predictor called PSSM-Suc, which employs evolutionary information of amino acids to predict succinylated lysine residues. We describe each lysine residue in terms of profile bigrams extracted from position-specific scoring matrices. We compare the performance of PSSM-Suc with that of existing predictors on a widely used benchmark dataset. PSSM-Suc shows a significant improvement in performance over state-of-the-art predictors: its sensitivity, accuracy, and Matthews correlation coefficient are 0.8159, 0.8199, and 0.6396, respectively.
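The profile-bigram idea mentioned in the summary can be illustrated with a minimal sketch. It assumes the commonly used definition in which entry (m, n) of the bigram matrix sums, over consecutive rows of a PSSM segment, the product of column m in one row and column n in the next. The summary does not state whether PSSM-Suc uses exactly this form, nor the window size or normalisation, so the function name `profile_bigram` and the toy input are hypothetical.

```python
from typing import List, Sequence


def profile_bigram(pssm: Sequence[Sequence[float]]) -> List[float]:
    """Flatten the K x K profile-bigram matrix of an L x K PSSM segment.

    Assumed definition: B[m][n] = sum over i of pssm[i][m] * pssm[i + 1][n],
    taken over every pair of consecutive rows.
    """
    if not pssm:
        return []
    k = len(pssm[0])
    bigram = [[0.0] * k for _ in range(k)]
    # Walk over consecutive row pairs (residue i, residue i + 1).
    for row, next_row in zip(pssm, pssm[1:]):
        for m in range(k):
            for n in range(k):
                bigram[m][n] += row[m] * next_row[n]
    # Row-major flattening gives a K * K feature vector.
    return [value for bigram_row in bigram for value in bigram_row]


# Toy 3-residue segment over a hypothetical 2-letter alphabet;
# a real PSSM has 20 columns, which yields 400 features.
toy_segment = [
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
]
features = profile_bigram(toy_segment)
print(features)
```

With a 20-column PSSM segment centred on a lysine residue, the same function returns a 400-dimensional vector that could be passed to any standard classifier.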

MSC:
92-08 Computational methods for problems pertaining to biology
92C40 Biochemistry, molecular biology
62P10 Applications of statistics to biology and medical sciences; meta analysis