pLoc_bal-mGneg: predict subcellular localization of Gram-negative bacterial proteins by quasi-balancing training dataset and general PseAAC. (English) Zbl 1406.92173

Summary: One of the hottest topics in molecular cell biology is to determine the subcellular localization of proteins from various different organisms. This is because it is crucially important for both basic research and drug development. Recently, a predictor called “pLoc-mGneg” was developed for identifying the subcellular localization of Gram-negative bacterial proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more efforts are definitely needed to further improve it. This is because pLoc-mGneg was trained by an extremely skewed dataset in which some subset (subcellular location) was about 5 to 70 times the size of the other subsets. Accordingly, it cannot avoid the biased consequence caused by such an uneven training dataset. To alleviate such a consequence, we have developed a new and bias-reducing predictor called pLoc_bal-mGneg by quasi-balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mGneg, the existing state-of-the-art predictor in identifying the subcellular localization of Gram-negative bacterial proteins. To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mGneg/, by which users can easily get their desired results without the need to go through the detailed mathematics.
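
Illustration: the summary does not spell out how the training dataset is quasi-balanced, so the following is only a minimal sketch of one common way to reduce class skew, namely random oversampling of the smaller subsets up to a chosen fraction of the largest one. It is not the authors' procedure. The function name `quasi_balance`, the `ratio` parameter, the single-label simplification (multiplex proteins are ignored), and the toy subset sizes are all assumptions made for this example.

```python
import random
from collections import Counter


def quasi_balance(samples, labels, ratio=0.5, seed=0):
    """Randomly oversample small classes (hypothetical helper, not from the paper).

    Every class ends up with at least int(ratio * size_of_largest_class)
    samples; classes that are already large enough are left unchanged.
    """
    rng = random.Random(seed)

    # Group the samples by their (single) class label.
    by_label = {}
    for x, y in zip(samples, labels):
        by_label.setdefault(y, []).append(x)

    target = int(ratio * max(len(xs) for xs in by_label.values()))

    out_x, out_y = [], []
    for y, xs in by_label.items():
        extra = max(0, target - len(xs))
        # Duplicate randomly chosen members of the small class.
        picked = xs + [rng.choice(xs) for _ in range(extra)]
        out_x.extend(picked)
        out_y.extend([y] * len(picked))
    return out_x, out_y


# Toy data only: three made-up subsets with a 70 : 5 : 2 skew.
samples = list(range(77))
labels = ["loc_A"] * 70 + ["loc_B"] * 5 + ["loc_C"] * 2

balanced_x, balanced_y = quasi_balance(samples, labels, ratio=0.5)
counts = Counter(balanced_y)
print(counts)
```

With `ratio=0.5` the two small toy subsets are raised to half the size of the largest one rather than to full parity, which is one possible reading of "quasi" balancing. Synthetic oversampling such as SMOTE (reference [13]) is an alternative to plain duplication.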

MSC:

92C40 Biochemistry, molecular biology
92C37 Cell biology
62P10 Applications of statistics to biology and medical sciences; meta analysis

References:

[1] Ahmad, S.; Kabir, M.; Hayat, M., Identification of heat shock protein families and J-protein types by incorporating dipeptide composition into Chou’s general pseaac, Comput. Methods Prog. Biomed., 122, 165-174, (2015)
[2] Ali, F.; Hayat, M., Classification of membrane protein types using voting feature interval in combination with Chou’s pseudo amino acid composition, J. Theor. Biol., 384, 78-83, (2015) · Zbl 1343.92006
[3] Arif, M.; Hayat, M.; Jan, Z., Imem-2LSAAC: A two-level model for discrimination of membrane proteins and their types by extending the notion of SAAC into Chou’s pseudo amino acid composition, J. Theor. Biol., 442, 11-21, (2018) · Zbl 1397.92180
[4] Behbahani, M.; Mohabatkar, H.; Nosrati, M., Analysis and comparison of lignin peroxidases between fungi and bacteria using three different modes of Chou’s general pseudo amino acid composition, J. Theor. Biol., 411, 1-5, (2016)
[5] Cai, L.; Huang, T.; Su, J.; Zhang, X.; Chen, W.; Zhang, F.; He, L., Implications of newly identified brain eqtl genes and their interactors in schizophrenia, Mol. Therapy Nucleic Acids, 12, 433-442, (2018)
[6] Cai, Y. D., Predicting subcellular localization of proteins in a hybridization space, Bioinformatics, 20, 1151-1156, (2004)
[7] Cai, Y. D.; Feng, K. Y.; Lu, W. C., Using logitboost classifier to predict protein structural classes, J. Theor. Biol., 238, 172-176, (2006)
[8] Cai, Y. D.; Liu, X. J.; Xu, X. B., Support vector machines for prediction of protein subcellular location by incorporating quasi-sequence-order effect, J. Cellular Biochem., 84, 343-348, (2002)
[9] Cao, D. S.; Xu, Q. S.; Liang, Y. Z., Propy: a tool to generate various modes of Chou’s pseaac, Bioinformatics, 29, 960-962, (2013)
[10] Cao, J. Z.; Liu, W. Q.; Gu, H., Predicting viral protein subcellular localization with Chou’s pseudo amino acid composition and imbalance-weighted multi-label K-nearest neighbor algorithm, Protein Pept. Lett., 19, 1163-1169, (2012)
[11] Cedano, J.; Aloy, P.; Perez-Pons, J. A.; Querol, E., Relation between amino acid composition and cellular location of proteins, J. Mol. Biol., 266, 594-600, (1997)
[12] Chang, T. H.; Wu, L. C.; Lee, T. Y.; Chen, S. P.; Huang, H. D.; Horng, J. T., Euloc: a web-server for accurately predict protein subcellular localization in eukaryotes by incorporating various features of sequence segments into the general form of Chou’s pseaac, J. Comput. Aided Mol. Design, 27, 91-103, (2013)
[13] Chawla, N. V.; Bowyer, K. W.; Hall, L. O.; Kegelmeyer, W. P., SMOTE: synthetic minority over-sampling technique, J. Artif. Intell. Res., 16, 321-357, (2002) · Zbl 0994.68128
[14] Chen, J.; Liu, H.; Yang, J.; Chou, K. C., Prediction of linear B-cell epitopes using amino acid pair antigenicity scale, Amino Acids, 33, 423-428, (2007)
[15] Chen, W.; Ding, H.; Feng, P.; Lin, H., Iacp: a sequence-based tool for identifying anticancer peptides, Oncotarget, 7, 16895-16909, (2016)
[16] Chen, W.; Feng, P.; Ding, H.; Lin, H., Irna-methyl: identifying N6-methyladenosine sites using pseudo nucleotide composition, Anal. Biochem., 490, 26-33, (2015)
[17] Chen, W.; Feng, P.; Ding, H.; Lin, H., Using deformation energy to analyze nucleosome positioning in genomes, Genomics, 107, 69-75, (2016)
[18] Chen, W.; Feng, P.; Yang, H.; Ding, H.; Lin, H., Irna-AI: identifying the adenosine to inosine editing sites in RNA sequences, Oncotarget, 8, 4208-4217, (2017)
[19] Chen, W.; Feng, P.; Yang, H.; Ding, H.; Lin, H., Irna-3typea: identifying 3-types of modification at RNA’s adenosine sites, Mol. Therapy Nucleic Acids, 11, 468-474, (2018)
[20] Chen, W.; Feng, P. M.; Deng, E. Z.; Lin, H., Itis-psetnc: a sequence-based predictor for identifying translation initiation site in human genes using pseudo trinucleotide composition, Anal. Biochem., 462, 76-83, (2014)
[21] Chen, W.; Feng, P. M.; Lin, H., Irspot-psednc: identify recombination spots with pseudo dinucleotide composition, Nucleic Acids Res., 41, e68, (2013)
[22] Chen, W.; Feng, P. M.; Lin, H., Iss-psednc: identifying splicing sites using pseudo dinucleotide composition, Biomed. Res. Int. (BMRI) 2014, (2014)
[23] Chen, W.; Lei, T. Y.; Jin, D. C.; Lin, H., Pseknc: a flexible web-server for generating pseudo K-tuple nucleotide composition, Anal. Biochem., 456, 53-60, (2014)
[24] Chen, W.; Lin, H., Pseudo nucleotide composition or pseknc: an effective formulation for analyzing genomic sequences, Mol. Bio. Syst., 11, 2620-2634, (2015)
[25] Chen, W.; Lin, H.; Feng, P. M.; Ding, C.; Zuo, Y. C., Inuc-physchem: A sequence-based predictor for identifying nucleosomes via physicochemical properties, PLoS ONE, 7, e47843, (2012)
[26] Chen, W.; Tang, H.; Ye, J.; Lin, H., Irna-pseu: identifying RNA pseudouridine sites, Mol. Therapy Nucleic Acids, 5, e332, (2016)
[27] Chen, Z.; Zhao, P. Y.; Li, F.; Leier, A.; Marquez-Lago, T. T.; Wang, Y.; Webb, G. I.; Smith, A. I.; Daly, R. J.; Song, J., Ifeature: a python package and web server for features extraction and selection from protein and peptide sequences, Bioinformatics, 34, 2499-2502, (2018)
[28] Cheng, X.; Xiao, X., Ploc-mgneg: predict subcellular localization of Gram-negative bacterial proteins by deep gene ontology learning via general pseaac, Genomics, 110, 231-239, (2018)
[29] Cheng, X.; Xiao, X., Ploc-mhum: predict subcellular localization of multi-location human proteins via general pseaac to winnow out the crucial GO information, Bioinformatics, 34, 1448-1456, (2018)
[30] Cheng, X.; Zhao, S. G.; Lin, W. Z.; Xiao, X., Ploc-manimal: predict subcellular localization of animal proteins with both single and multiple sites, Bioinformatics, 33, 3524-3531, (2017)
[31] Cheng, X.; Zhao, S. G.; Xiao, X., Iatc-misf: a multi-label classifier for predicting the classes of anatomical therapeutic chemicals, Bioinformatics, 33, 341-346, (2017), (Corrigendum, ibid., 2017, Vol.33, 2610)
[32] Cheng, X.; Zhao, S. G.; Xiao, X., Iatc-mhyb: a hybrid multi-label classifier for predicting the classification of anatomical therapeutic chemicals, Oncotarget, 8, 58494-58503, (2017)
[33] Chou, K. C., A vectorized sequence-coupling model for predicting HIV protease cleavage sites in proteins, J. Biol. Chem., 268, 16938-16948, (1993)
[34] Chou, K. C., Prediction of protein cellular attributes using pseudo amino acid composition, Proteins Struct. Funct. Genet., 43, 246-255, (2001), (Erratum: ibid., 2001, Vol.44, 60)
[35] Chou, K. C., Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes, Bioinformatics, 21, 10-19, (2005)
[36] Chou, K. C., Pseudo amino acid composition and its applications in bioinformatics, proteomics and system biology, Curr. Proteomics, 6, 262-274, (2009)
[37] Chou, K. C., Some remarks on protein attribute prediction and pseudo amino acid composition (50th anniversary year review), J. Theor. Biol., 273, 236-247, (2011) · Zbl 1405.92212
[38] Chou, K. C., Some remarks on predicting multi-label attributes in molecular biosystems, Mol. Biosyst., 9, 1092-1100, (2013)
[39] Chou, K. C., Impacts of bioinformatics to medicinal chemistry, Med. Chem., 11, 218-234, (2015)
[40] Chou, K. C., An unprecedented revolution in medicinal chemistry driven by the progress of biological science, Curr. Topics Med. Chem., 17, 2337-2358, (2017)
[41] Chou, K. C.; Cai, Y. D., A new hybrid approach to predict subcellular localization of proteins by incorporating gene ontology, Biochem. Biophys. Res. Commun. (BBRC), 311, 743-747, (2003)
[42] Chou, K. C.; Cai, Y. D., Predicting protein quaternary structure by pseudo amino acid composition, Proteins Struct. Funct. Genet., 53, 282-289, (2003)
[43] Chou, K. C.; Cai, Y. D., Prediction of protein subcellular locations by GO-fund-pseaa predictor, Biochem. Biophys. Res. Commun. (BBRC), 320, 1236-1239, (2004)
[44] Chou, K. C.; Cai, Y. D., Prediction of protease types in a hybridization space, Biochem. Biophys. Res. Commun. (BBRC), 339, 1015-1020, (2006)
[45] Chou, K. C.; Elrod, D. W., Using discriminant function for prediction of subcellular location of prokaryotic proteins, Biochem. Biophys. Res. Commun. (BBRC), 252, 63-68, (1998)
[46] Chou, K. C.; Elrod, D. W., Protein subcellular location prediction, Protein Eng., 12, 107-118, (1999)
[47] Chou, K. C.; Elrod, D. W., Bioinformatical analysis of G-protein-coupled receptors, J. Proteome Res., 1, 429-433, (2002)
[48] Chou, K. C.; Elrod, D. W., Prediction of enzyme family classes, J. Proteome Res., 2, 183-190, (2003)
[49] Chou, K. C.; Shen, H. B., Recent progresses in protein subcellular location prediction, Anal. Biochem., 370, 1-16, (2007)
[50] Chou, K. C.; Shen, H. B., Memtype-2L: A web server for predicting membrane proteins and their types by incorporating evolution information through pse-PSSM, Biochem. Biophys. Res. Comm. (BBRC), 360, 339-345, (2007)
[51] Chou, K. C.; Shen, H. B., Cell-ploc: A package of web servers for predicting subcellular localization of proteins in various organisms, Nat. Protoc., 3, 153-162, (2008)
[52] Chou, K. C.; Shen, H. B., Recent advances in developing web-servers for predicting protein attributes, Nat. Sci., 1, 63-92, (2009)
[53] Chou, K. C.; Zhang, C. T., Review: prediction of protein structural classes, Critical Rev. Biochem. Mol. Biol., 30, 275-349, (1995)
[54] Dehzangi, A.; Heffernan, R.; Sharma, A.; Lyons, J.; Paliwal, K.; Sattar, A., Gram-positive and Gram-negative protein subcellular localization by incorporating evolutionary-based descriptors into Chou’s general pseaac, J. Theor. Biol., 364, 284-294, (2015) · Zbl 1405.92092
[55] Ding, H.; Deng, E. Z.; Yuan, L. F.; Liu, L.; Lin, H.; Chen, W., Ictx-type: A sequence-based predictor for identifying the types of conotoxins in targeting ion channels, Bio. Med. Res. Int. (BMRI) 2014, (2014)
[56] Ding, Y. S.; Zhang, T. L., Using Chou’s pseudo amino acid composition to predict subcellular localization of apoptosis proteins: an approach with immune genetic algorithm-based ensemble classifier, Pattern Recognit. Lett., 29, 1887-1892, (2008)
[57] Du, P.; Gu, S.; Jiao, Y., Pseaac-general: fast building various modes of general form of Chou’s pseudo amino acid composition for large-scale protein datasets, Int. J. Mol. Sci., 15, 3495-3506, (2014)
[58] Du, P.; Wang, X.; Xu, C.; Gao, Y., Pseaac-builder: A cross-platform stand-alone program for generating various special Chou’s pseudo amino acid compositions, Anal. Biochem., 425, 117-119, (2012)
[59] Ehrlich, J. S.; Hansen, M. D.; Nelson, W. J., Spatio-temporal regulation of rac1 localization and lamellipodia dynamics during epithelial cell-cell adhesion, Dev. Cell., 3, 259-270, (2002)
[60] Emanuelsson, O.; Nielsen, H.; Brunak, S.; von Heijne, G., Predicting subcellular localization of proteins based on their N-terminal amino acid sequence, J. Mol. Biol., 300, 1005-1016, (2000)
[61] Esmaeili, M.; Mohabatkar, H.; Mohsenzadeh, S., Using the concept of Chou’s pseudo amino acid composition for risk type prediction of human papillomaviruses, J. Theor. Biol., 263, 203-209, (2010)
[62] Fan, G. L.; Li, Q. Z., Predict mycobacterial proteins subcellular locations by incorporating pseudo-average chemical shift into the general form of Chou’s pseudo amino acid composition, J. Theor. Biol., 304, 88-95, (2012) · Zbl 1397.92186
[63] Feng, P.; Ding, H.; Yang, H.; Chen, W.; Lin, H., Irna-psecoll: identifying the occurrence sites of different RNA modifications by incorporating collective effects of nucleotides into pseknc, Mol. Therapy Nucleic Acids, 7, 155-163, (2017)
[64] Feng, P.; Yang, H.; Ding, H.; Lin, H.; Chen, W., Idna6ma-pseknc: identifying DNA N6-methyladenosine sites by incorporating nucleotide physicochemical properties into pseknc, Genomics, (2018)
[65] Feng, P. M.; Chen, W.; Lin, H., Ihsp-pseraaac: identifying the heat shock protein families using pseudo reduced amino acid alphabet composition, Anal. Biochem., 442, 118-125, (2013)
[66] Gao, Y.; Shao, S. H.; Xiao, X.; Ding, Y. S.; Huang, Y. S.; Huang, Z. D., Using pseudo amino acid composition to predict protein subcellular location: approached with Lyapunov index, Bessel function, and Chebyshev filter, Amino Acids, 28, 373-376, (2005)
[67] Gardy, J. L.; Laird, M. R.; Chen, F.; Rey, S.; Walsh, C. J.; Ester, M.; Brinkman, F. S., Psortb v.2.0: expanded prediction of bacterial protein subcellular localization and insights gained from comparative proteome analysis, Bioinformatics, 21, 617-623, (2005)
[68] Georgiou, D. N.; Karakasidis, T. E.; Nieto, J. J.; Torres, A., Use of fuzzy clustering technique and matrices to classify amino acids and its impact to Chou’s pseudo amino acid composition, J. Theor. Biol., 257, 17-26, (2009) · Zbl 1400.92393
[69] Glory, E.; Murphy, R. F., Automated subcellular location determination and high-throughput microscopy, Dev. Cell., 12, 7-16, (2007)
[70] Gupta, M. K.; Niyogi, R.; Misra, M., An alignment-free method to find similarity among protein sequences via the general form of Chou’s pseudo amino acid composition, SAR QSAR Environ. Res., 24, 597-609, (2013)
[71] Hajisharifi, Z.; Piryaiee, M.; Mohammad Beigi, M.; Behbahani, M.; Mohabatkar, H., Predicting anticancer peptides with Chou’s pseudo amino acid composition and investigating their mutagenicity via ames test, J. Theor. Biol., 341, 34-40, (2014)
[72] Hayat, M.; Iqbal, N., Discriminating protein structure classes by incorporating pseudo average chemical shift to Chou’s general pseaac and support vector machine, Comput. Methods Prog. Biomed., 116, 184-192, (2014)
[73] Hayat, M.; Khan, A., Discriminating outer membrane proteins with fuzzy K-nearest neighbor algorithms based on the general form of Chou’s pseaac, Protein Pept. Lett., 19, 411-421, (2012)
[74] Hoglund, A.; Donnes, P.; Blum, T.; Adolph, H. W.; Kohlbacher, O., Multiloc: prediction of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition, Bioinformatics, 22, 1158-1165, (2006)
[75] Horton, P.; Park, K. J.; Obayashi, T.; Fujita, N.; Harada, H.; Adams-Collier, C. J.; Nakai, K., Wolf PSORT: protein localization predictor, Nucleic Acids Res., 35, W585-W587, (2007)
[76] Jia, J.; Liu, Z.; Xiao, X., Ippi-esml: an ensemble classifier for identifying the interactions of proteins by incorporating their physicochemical properties and wavelet transforms into pseaac, J. Theor. Biol., 377, 47-56, (2015)
[77] Jia, J.; Liu, Z.; Xiao, X.; Liu, B., Isuc-pseopt: identifying lysine succinylation sites in proteins by incorporating sequence-coupling effects into pseudo components and optimizing imbalanced training dataset, Anal. Biochem., 497, 48-56, (2016)
[78] Jia, J.; Liu, Z.; Xiao, X.; Liu, B., Ippbs-opt: A sequence-based ensemble classifier for identifying protein-protein binding sites by optimizing imbalanced training datasets, Molecules, 21, E95, (2016)
[79] Jia, J.; Liu, Z.; Xiao, X.; Liu, B., Icar-psecp: identify carbonylation sites in proteins by Monte Carlo sampling and incorporating sequence coupled effects into general pseaac, Oncotarget, 7, 34558-34570, (2016)
[80] Jia, J.; Liu, Z.; Xiao, X.; Liu, B., Psuc-lys: predict lysine succinylation sites in proteins with pseaac and ensemble random forest approach, J. Theor. Biol., 394, 223-230, (2016) · Zbl 1343.92153
[81] Jia, J.; Zhang, L.; Liu, Z.; Xiao, X., Psumo-CD: predicting sumoylation sites in proteins with covariance discriminant algorithm by incorporating sequence-coupled effects into general pseaac, Bioinformatics, 32, 3133-3141, (2016)
[82] Jiang, X.; Wei, R.; Zhang, T. L.; Gu, Q., Using the concept of Chou’s pseudo amino acid composition to predict apoptosis proteins subcellular location: an approach by approximate entropy, Protein Pept. Lett., 15, 392-396, (2008)
[83] Kandaswamy, K. K.; Martinetz, T.; Moller, S.; Suganthan, P. N.; Sridharan, S.; Pugalenthi, G., AFP-pred: A random forest approach for predicting antifreeze proteins from sequence-derived properties, J. Theor. Biol., 270, 56-62, (2011)
[84] Khan, M.; Hayat, M.; Khan, S. A.; Iqbal, N., Unb-DPC: identify mycobacterial membrane protein types by incorporating un-biased dipeptide composition into Chou’s general pseaac, J. Theor. Biol., 415, 13-19, (2017)
[85] Khosravian, M.; Faramarzi, F. K.; Beigi, M. M.; Behbahani, M.; Mohabatkar, H., Predicting antibacterial peptides by the concept of Chou’s pseudo amino acid composition and machine learning methods, Protein Pept. Lett., 20, 180-186, (2013)
[86] Krishnan, M. S., Using Chou’s general pseaac to analyze the evolutionary relationship of receptor associated proteins (RAP) with various folding patterns of protein domains, J. Theor. Biol., 445, 62-74, (2018)
[87] Kumar, R.; Srivastava, A.; Kumari, B.; Kumar, M., Prediction of beta-lactamase and its class by Chou’s pseudo amino acid composition and support vector machine, J. Theor. Biol., 365, 96-103, (2015) · Zbl 1314.92055
[88] Li, F.; Li, C.; Marquez-Lago, T. T.; Leier, A.; Akutsu, T.; Purcell, A. W.; Smith, A. I.; Lithgow, T.; Daly, R. J.; Song, J., Quokka: a comprehensive tool for rapid and accurate prediction of kinase family-specific phosphorylation sites in the human proteome, Bioinformatics, (2018)
[89] Li, F. M.; Li, Q. Z., Predicting protein subcellular location using Chou’s pseudo amino acid composition and improved hybrid approach, Protein Pept. Lett., 15, 612-616, (2008)
[90] Li, L.; Yu, S.; Xiao, W.; Li, Y.; Li, M.; Huang, L.; Zheng, X.; Zhou, S.; Yang, H., Prediction of bacterial protein subcellular localization by incorporating various features into Chou’s pseaac and a backward feature selection approach, Biochimie, 104, 100-107, (2014)
[91] Lin, H.; Deng, E. Z.; Ding, H.; Chen, W., Ipro54-pseknc: a sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition, Nucleic Acids Res., 42, 12961-12972, (2014)
[92] Lin, H.; Ding, H.; Guo, F. B.; Zhang, A. Y.; Huang, J., Predicting subcellular localization of mycobacterial proteins by using Chou’s pseudo amino acid composition, Protein Pept. Lett., 15, 739-744, (2008)
[93] Lin, H.; Wang, H.; Ding, H.; Chen, Y. L.; Li, Q. Z., Prediction of subcellular localization of apoptosis protein using Chou’s pseudo amino acid composition, Acta Biotheoretica, 57, 321-330, (2009)
[94] Lin, J.; Wang, Y., Using a novel adaboost algorithm and Chou’s pseudo amino acid composition for predicting protein subcellular localization, Protein Pept. Lett., 18, 1219-1225, (2011)
[95] Lin, W. Z.; Fang, J. A.; Xiao, X., Idna-prot: identification of DNA binding proteins using random forest with grey model, PLoS ONE, 6, e24756, (2011)
[96] Lin, W. Z.; Fang, J. A.; Xiao, X., Iloc-animal: A multi-label learning classifier for predicting subcellular localization of animal proteins, Mol. BioSyst., 9, 634-644, (2013)
[97] Liu, B.; Fang, L.; Liu, F.; Wang, X.; Chen, J., Identification of real microrna precursors with a pseudo structure status composition approach, PLoS ONE, 10, (2015)
[98] Liu, B.; Fang, L.; Long, R.; Lan, X., Ienhancer-2L: a two-layer predictor for identifying enhancers and their strength by pseudo k-tuple nucleotide composition, Bioinformatics, 32, 362-369, (2016)
[99] Liu, B.; Li, K.; Huang, D. S., Ienhancer-EL: identifying enhancers and their strength with ensemble learning approach, Bioinformatics, (2018)
[100] Liu, B.; Long, R., Idhs-EL: identifying DNase I hypersensitive sites by fusing three different modes of pseudo nucleotide composition into an ensemble learning framework, Bioinformatics, 32, 2411-2418, (2016)
[101] Liu, B.; Wang, S.; Long, R., Irspot-EL: identify recombination spots with an ensemble learning approach, Bioinformatics, 33, 35-41, (2017)
[102] Liu, B.; Weng, F.; Huang, D. S., Iro-3wpseknc: identify DNA replication origins by three-window-based pseknc, Bioinformatics, (2018)
[103] Liu, B.; Wu, H.; Zhang, D.; Wang, X., Pse-analysis: a python package for DNA/RNA and protein/peptide sequence analysis based on pseudo components and kernel methods, Oncotarget, 8, 13338-13343, (2017)
[104] Liu, B.; Yang, F., 2L-pirna: A two-layer ensemble classifier for identifying piwi-interacting RNAs and their function, Mol. Therapy Nucleic Acids, 7, 267-277, (2017)
[105] Liu, B.; Yang, F.; Huang, D. S., Ipromoter-2L: a two-layer predictor for identifying promoters and their types by multi-window-based pseknc, Bioinformatics, 34, 33-40, (2018)
[106] Liu, L.; Ma, Y.; Wang, R. L.; Xu, W. R.; Wang, S. Q., Find novel dual-agonist drugs for treating type 2 diabetes by means of cheminformatics, Drug Des. Devel. Ther., 7, 279-287, (2013)
[107] Liu, Z.; Xiao, X.; Qiu, W. R., Idna-methyl: identifying DNA methylation sites via pseudo trinucleotide composition, Anal. Biochem., 474, 69-77, (2015)
[108] Liu, Z.; Xiao, X.; Yu, D. J.; Jia, J.; Qiu, W. R., Prnam-PC: predicting N-methyladenosine sites in RNA sequences via physical-chemical properties, Anal. Biochem., 497, 60-67, (2016)
[109] Lu, J. J.; Pan, W.; Hu, Y. J.; Wang, Y. T., Multi-target drugs: the trend of drug research and development, PLoS One, 7, e40262, (2012)
[110] Ma, Y.; Wang, S. Q.; Xu, W. R.; Wang, R. L., Design novel dual agonists for treating type-2 diabetes by targeting peroxisome proliferator-activated receptors with core hopping approach, PLoS One, 7, e38546, (2012)
[111] Matsuda, S.; Vert, J. P.; Saigo, H.; Ueda, N.; Toh, H.; Akutsu, T., A novel representation of protein sequences for prediction of subcellular location using support vector machines, Protein Sci., 14, 2804-2813, (2005)
[112] Meher, P. K.; Sahu, T. K.; Saini, V.; Rao, A. R., Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou’s general pseaac, Sci. Rep., 7, 42362, (2017)
[113] Mei, J.; Zhao, J., Prediction of HIV-1 and HIV-2 proteins by using Chou’s pseudo amino acid compositions and different classifiers, Sci. Rep., 8, 2359, (2018)
[114] Mei, J.; Zhao, J., Analysis and prediction of presynaptic and postsynaptic neurotoxins by Chou’s general pseudo amino acid composition and motif features, J. Theor. Biol., 427, 147-153, (2018)
[115] Mei, S., Predicting plant protein subcellular multi-localization by Chou’s pseaac formulation based multi-label homolog knowledge transfer learning, J. Theor. Biol., 310, 80-87, (2012) · Zbl 1337.92065
[116] Mohabatkar, H., Prediction of cyclin proteins using Chou’s pseudo amino acid composition, Protein Pept. Lett., 17, 1207-1214, (2010)
[117] Mohabatkar, H.; Mohammad Beigi, M.; Esmaeili, A., Prediction of GABA(A) receptor proteins using the concept of Chou’s pseudo amino acid composition and support vector machine, J. Theor. Biol., 281, 18-23, (2011) · Zbl 1397.92215
[118] Mohammad, B. M.; Behjati, M.; Mohabatkar, H., Prediction of metalloproteinase family based on the concept of Chou’s pseudo amino acid composition using a machine learning approach, J. Struct. Funct. Genomics, 12, 191-197, (2011)
[119] Mondal, S.; Pai, P. P., Chou’s pseudo amino acid composition improves sequence-based antifreeze protein prediction, J. Theor. Biol., 356, 30-35, (2014)
[120] Mundra, P.; Kumar, M.; Kumar, K. K.; Jayaraman, V. K.; Kulkarni, B. D., Using pseudo amino acid composition to predict protein subnuclear localization: approached with PSSM, Pattern Recognit. Lett., 28, 1610-1615, (2007)
[121] Nakai, K., Protein sorting signals and prediction of subcellular localization, Adv. Protein Chem., 54, 277-344, (2000)
[122] Nakai, K.; Kanehisa, M., A knowledge base for predicting protein localization sites in eukaryotic cells, Genomics, 14, 897-911, (1992)
[123] Nanni, L.; Brahnam, S.; Lumini, A., Wavelet images and Chou’s pseudo amino acid composition for protein classification, Amino Acids, 43, 657-665, (2012)
[124] Nanni, L.; Brahnam, S.; Lumini, A., Prediction of protein structure classes by incorporating different protein descriptors into general Chou’s pseudo amino acid composition, J. Theor. Biol., 360, 109-116, (2014) · Zbl 1343.92387
[125] Nanni, L.; Lumini, A., Genetic programming for creating Chou’s pseudo amino acid based features for submitochondria localization, Amino Acids, 34, 653-660, (2008)
[126] Qiu, W. R.; Jiang, S. Y.; Xu, Z. C.; Xiao, X., Irnam5C-psednc: identifying RNA 5-methylcytosine sites by incorporating physical-chemical properties into pseudo dinucleotide composition, Oncotarget, 8, 41178-41188, (2017)
[127] Qiu, W. R.; Sun, B. Q.; Xiao, X.; Xu, Z. C., Iptm-mlys: identifying multiple lysine PTM sites and their different types, Bioinformatics, 32, 3116-3123, (2016)
[128] Qiu, W. R.; Sun, B. Q.; Xiao, X.; Xu, Z. C., Ihyd-psecp: identify hydroxyproline and hydroxylysine in proteins by incorporating sequence-coupled effects into general pseaac, Oncotarget, 7, 44310-44321, (2016)
[129] Qiu, W. R.; Sun, B. Q.; Xiao, X.; Xu, Z. C.; Jia, J. H., Ikcr-pseens: identify lysine crotonylation sites in histone proteins with pseudo components and ensemble classifier, Genomics, 110, 239-246, (2018)
[130] Qiu, W. R.; Xiao, X., Irspot-tncpseaac: identify recombination spots with trinucleotide composition and pseudo amino acid components, Int. J. Mol. Sci. (IJMS), 15, 1746-1766, (2014)
[131] Qiu, W. R.; Xiao, X.; Xu, Z. C., Iphos-pseen: identifying phosphorylation sites in proteins by fusing different pseudo components into an ensemble classifier, Oncotarget, 7, 51270-51283, (2016)
[132] Rahimi, M.; Bakhtiarizadeh, M. R.; Mohammadi-Sangcheshmeh, A., Oogenesis_pred: A sequence-based method for predicting oogenesis proteins by six different modes of Chou’s pseudo amino acid composition, J. Theor. Biol., 414, 128-136, (2017)
[133] Reinhardt, A.; Hubbard, T., Using neural networks for prediction of the subcellular location of proteins, Nucleic Acids Res., 26, 2230-2236, (1998)
[134] Sahu, S. S.; Panda, G., A novel feature representation method based on Chou’s pseudo amino acid composition for protein structural class prediction, Comput. Biol. Chem., 34, 320-327, (2010) · Zbl 1403.92221
[135] Sharma, R.; Dehzangi, A.; Lyons, J.; Paliwal, K.; Tsunoda, T.; Sharma, A., Predict Gram-positive and Gram-negative subcellular localization via incorporating evolutionary information and physicochemical features into Chou’s general pseaac, IEEE Trans. Nanobiosci., 14, 915-926, (2015)
[136] Shen, H. B., Gneg-mploc: A top-down strategy to enhance the quality of predicting subcellular localization of Gram-negative bacterial proteins, J. Theor. Biol., 264, 326-333, (2010)
[137] Shi, J. Y.; Zhang, S. W.; Pan, Q.; Zhou, G. P., Using pseudo amino acid composition to predict protein subcellular location: approached with amino acid composition distribution, Amino Acids, 35, 321-327, (2008)
[138] Song, J.; Li, F.; Leier, A.; Marquez-Lago, T. T.; Akutsu, T.; Haffari, G.; Webb, G. I.; Pike, R. N., Prosperous: high-throughput prediction of substrate cleavage sites for 90 proteases with improved accuracy, Bioinformatics, 34, 684-687, (2018)
[139] Song, J.; Li, F.; Takemoto, K.; Haffari, G.; Akutsu, T.; Webb, G. I., Prevail, an integrative approach for inferring catalytic residues using sequence, structural and network features in a machine learning framework, J. Theor. Biol., 443, 125-137, (2018)
[140] Song, J.; Wang, Y.; Li, F.; Akutsu, T.; Rawlings, N. D.; Webb, G. I., Iprot-sub: a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites, Brief. Bioinform., (2018)
[141] Tahir, M.; Hayat, M., Inuc-STNC: a sequence-based predictor for identification of nucleosome positioning in genomes by extending the concept of SAAC and Chou’s pseaac, Mol. Biosyst., 12, 2587-2593, (2016)
[142] Tahir, M.; Hayat, M.; Kabir, M., Sequence based predictor for discrimination of enhancer and their types by applying general form of Chou’s trinucleotide composition, Comput. Methods Prog. Biomed., 146, 69-75, (2017)
[143] Tripathi, P.; Pandey, P. N., A novel alignment-free method to classify protein folding types by combining spectral graph clustering with Chou’s pseudo amino acid composition, J. Theor. Biol., 424, 49-54, (2017)
[144] Wan, S.; Mak, M. W.; Kung, S. Y., GOASVM: A subcellular location predictor by incorporating term-frequency gene ontology into the general form of Chou’s pseudo amino acid composition, J. Theor. Biol., 323, 40-48, (2013) · Zbl 1314.92060
[145] Wang, J.; Yang, B.; Leier, A.; Marquez-Lago, T. T.; Hayashida, M.; Rocker, A.; Yanju, Z.; Akutsu, T.; Strugnell, R. A.; Song, J.; Lithgow, T., Bastion6: a bioinformatics approach for accurate prediction of type VI secreted effectors, Bioinformatics, 34, 2546-2555, (2018)
[146] Wang, J.; Yang, B.; Revote, J.; Leier, A.; Marquez-Lago, T. T.; Webb, G.; Song, J.; Lithgow, T., POSSUM: a bioinformatics toolkit for generating numerical sequence feature descriptors based on PSSM profiles, Bioinformatics, 33, 2756-2758, (2017)
[147] Xiao, X.; Cheng, X.; Su, S.; Nao, Q., Ploc-mgpos: incorporate key gene ontology information into general pseaac for predicting subcellular localization of Gram-positive bacterial proteins, Nat. Sci., 9, 331-349, (2017)
[148] Xiao, X.; Min, J. L.; Lin, W. Z.; Liu, Z.; Cheng, X., Idrug-target: predicting the interactions between drug compounds and target proteins in cellular networking via the benchmark dataset optimization approach, J. Biomol. Struct. Dyn. (JBSD), 33, 2221-2233, (2015)
[149] Xiao, X.; Shao, S.; Ding, Y.; Huang, Z.; Chen, X., Using cellular automata to generate image representation for biological sequences, Amino Acids, 28, 29-35, (2005)
[150] Xiao, X.; Wang, P., Predicting protein structural classes with pseudo amino acid composition: an approach using geometric moments of cellular automaton image, J. Theor. Biol., 254, 691-696, (2008) · Zbl 1400.92416
[151] Xiao, X.; Wang, P., Inr-physchem: A sequence-based predictor for identifying nuclear receptors and their subfamilies via physical-chemical property matrix, PLoS ONE, 7, e30869, (2012)
[152] Xiao, X.; Wu, Z. C., A multi-label classifier for predicting the subcellular localization of Gram-negative bacterial proteins with both single and multiple sites, PLoS ONE, 6, e20592, (2011)
[153] Xu, Y.; Ding, J.; Wu, L. Y., Isno-pseaac: predict cysteine S-nitrosylation sites in proteins by incorporating position specific amino acid propensity into pseudo amino acid composition, PLoS ONE, 8, e55844, (2013)
[154] Xu, Y.; Shao, X. J.; Wu, L. Y.; Deng, N. Y., Isno-aapair: incorporating amino acid pairwise coupling into pseaac for predicting cysteine S-nitrosylation sites in proteins, PeerJ, 1, e171, (2013)
[155] Xu, Y.; Wen, X.; Wen, L. S.; Wu, L. Y.; Deng, N. Y., Initro-tyr: prediction of nitrotyrosine sites in proteins with general pseudo amino acid composition, PLoS ONE, 9, (2014)
[156] Yang, H.; Qiu, W. R.; Liu, G.; Guo, F. B.; Chen, W.; Lin, H., Irspot-pse6NC: identifying recombination spots in saccharomyces cerevisiae by incorporating hexamer composition into general pseknc, Int. J. Biol. Sci., 14, 883-891, (2018)
[157] Yu, B.; Li, S.; Qiu, W. Y.; Chen, C.; Chen, R. X.; Wang, L.; Wang, M. H.; Zhang, Y., Accurate prediction of subcellular location of apoptosis proteins combining Chou’s pseaac and psepssm based on wavelet denoising, Oncotarget, 8, 107640-107665, (2017)
[158] Zhang, C. T., Monte Carlo simulation studies on the prediction of protein folding types from amino acid composition, Biophys. J., 63, 1523-1529, (1992)
[159] Zhang, C. T., An analysis of protein folding type prediction by seed-propagated sampling and jackknife test, J. Protein Chem., 14, 583-593, (1995)
[160] Zhang, L.; Kong, L., Irspot-ADPM: identify recombination spots by incorporating the associated dinucleotide product model into Chou’s pseudo components, J. Theor. Biol., 441, 1-8, (2018)
[161] Zhang, S.; Duan, X., Prediction of protein subcellular localization with oversampling approach and Chou’s general pseaac, J. Theor. Biol., 437, 239-250, (2018) · Zbl 1394.92047
[162] Zhang, S. W.; Zhang, Y. L.; Yang, H. F.; Zhao, C. H.; Pan, Q., Using the concept of Chou’s pseudo amino acid composition to predict protein subcellular localization: an approach by incorporating evolutionary information and von Neumann entropies, Amino Acids, 34, 565-572, (2008)
[163] Zhong, W. Z.; Zhou, S. F., Molecular science for drug development and biomedicine, Int. J. Mol. Sci., 15, 20072-20078, (2014)
[164] Zhou, G. P.; Assa-Munt, N., Some insights into protein structural class prediction, Proteins Struct. Funct. Genet., 44, 57-59, (2001)
[165] Zhou, G. P.; Doctor, K., Subcellular location prediction of apoptosis proteins, Proteins Struct. Funct. Genet., 50, 44-48, (2003)
[166] Zhou, X. B.; Chen, C.; Li, Z. C.; Zou, X. Y., Using Chou’s amphiphilic pseudo amino acid composition and support vector machine for prediction of enzyme subfamily classes, J. Theor. Biol., 248, 546-551, (2007)
[167] Zuo, Y. C.; Peng, Y.; Liu, L.; Chen, W.; Yang, L.; Fan, G. L., Predicting peroxidase subcellular location by hybridizing different descriptors of Chou’s pseudo amino acid patterns, Anal. Biochem., 458, 14-19, (2014)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.