
MemHyb: predicting membrane protein types by hybridizing SAAC and PSSM. (English) Zbl 1307.92308
Summary: About 50% of available drugs target membrane proteins, so knowledge of the structure and function of membrane proteins is of great importance in biological and pharmacological research. An automated method that identifies new membrane protein types from their primary sequence is therefore highly advantageous. In this paper, we address the problem of classifying membrane protein types using their sequence information. We consider both evolutionary and physicochemical features and supply them to a classification system based on support vector machines (SVMs) with error-correcting codes. Our sequence encoding scheme fuses the position-specific scoring matrix (PSSM) with split amino acid composition (SAAC) to discriminate membrane protein types effectively. Linear, polynomial, and RBF-based SVMs with Bose-Chaudhuri-Hocquenghem (BCH) coding are trained and tested. The highest success rates, 91.1% and 93.4% on two datasets, are obtained by the RBF-SVM using leave-one-out cross-validation. The proposed approach is thus an effective tool for discriminating membrane protein types and may be helpful to researchers working in drug discovery, cell biology, and bioinformatics. The web server for the proposed MemHyb-SVM is accessible at
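
The feature fusion described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the terminal segment length `terminal_len`, and the use of plain concatenation as the fusion rule are all assumptions made for illustration, and the PSSM-derived vector is taken as already computed by some external tool.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(segment):
    """Return the 20-dimensional fractional amino acid composition of a segment."""
    counts = Counter(segment)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        # No standard residues in this segment: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def split_amino_acid_composition(sequence, terminal_len=25):
    """Concatenate the compositions of the N-terminal, middle, and C-terminal parts.

    terminal_len is an assumed value chosen for illustration,
    not a parameter taken from the paper.
    """
    seq = sequence.upper()
    if len(seq) <= 2 * terminal_len:
        raise ValueError("sequence too short for the chosen terminal_len")
    n_term = seq[:terminal_len]
    middle = seq[terminal_len:-terminal_len]
    c_term = seq[-terminal_len:]
    # 3 segments x 20 residues = 60 features
    return composition(n_term) + composition(middle) + composition(c_term)


def fuse(saac_vector, pssm_vector):
    """Fuse two feature vectors by concatenation (an assumed fusion rule)."""
    return list(saac_vector) + list(pssm_vector)
```

Under these assumptions, each sequence yields a 60-dimensional SAAC vector, and the fused vector is what a multi-class SVM would receive as input. Splitting the sequence lets the composition of the two termini, which often differs from the rest of the chain, be represented separately instead of being averaged away.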

92D20 Protein sequences, DNA sequences
92C40 Biochemistry, molecular biology
68T05 Learning and adaptive systems in artificial intelligence