Evolutionary decision rules for predicting protein contact maps. (English) Zbl 1328.92054

Summary: Protein structure prediction is currently one of the main open challenges in bioinformatics. The protein contact map is a useful and commonly used representation of protein 3D structure; it records the binary proximity (contact or non-contact) of each pair of amino acids in a protein. In this work, we propose a multiobjective evolutionary approach for contact map prediction based on physico-chemical properties of amino acids. The evolutionary algorithm produces a set of decision rules that identify contacts between amino acids; each rule imposes a set of conditions on amino acid properties. We present results obtained by our approach on four different protein data sets. A statistical study was also performed to extract valid conclusions from the set of prediction rules generated by our algorithm. The results obtained confirm the validity of our proposal.
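
To make the two central notions of the summary concrete, the following minimal Python sketch shows (a) a binary contact map computed from residue coordinates and (b) a decision rule that predicts a contact when both residues satisfy interval conditions on a physico-chemical property. This is an illustration only and not the authors' algorithm: the 8 Å distance threshold, the use of a single property (the Kyte-Doolittle hydropathy index), the interval form of the rule, the minimum sequence separation, and all function names are assumptions made for this example. The evolutionary search that would produce such rules is not shown.

```python
import math

# Kyte-Doolittle hydropathy index (one physico-chemical property; the paper
# may use other or additional properties -- this choice is an assumption).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def contact_map(coords, threshold=8.0):
    """Binary contact map: True where two residues lie within `threshold`.

    `coords` holds one (x, y, z) point per residue, in angstroms.
    The 8.0 default is an assumed cutoff, not a value taken from the paper.
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = True
    return cmap


def rule_matches(rule, res_i, res_j):
    """A hypothetical rule: one hydropathy interval per residue of the pair."""
    (lo_i, hi_i), (lo_j, hi_j) = rule
    return (lo_i <= HYDROPATHY[res_i] <= hi_i
            and lo_j <= HYDROPATHY[res_j] <= hi_j)


def predict_contacts(sequence, rules, min_separation=2):
    """Predict a contact for (i, j) if any rule in the set matches the pair.

    `min_separation` skips sequence neighbours; its value is an assumption.
    """
    n = len(sequence)
    pred = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if any(rule_matches(r, sequence[i], sequence[j]) for r in rules):
                pred[i][j] = pred[j][i] = True
    return pred


if __name__ == "__main__":
    # Toy data: three residues on a line, and one rule that pairs two
    # strongly hydrophobic residues.
    observed = contact_map([(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)])
    predicted = predict_contacts("IKV", [((1.0, 5.0), (1.0, 5.0))])
    print(observed)
    print(predicted)
```

In a setting like the one described, a rule set of this kind would be scored by comparing the predicted map against the observed one; how the paper's objectives are actually defined is not stated in the summary.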


92D20 Protein sequences, DNA sequences
92D15 Problems related to evolution
92C40 Biochemistry, molecular biology
90C90 Applications of mathematical programming
90C59 Approximation methods and heuristics in mathematical programming
90C29 Multi-objective and goal programming

