An improved hybrid global optimization method for protein tertiary structure prediction. (English) Zbl 1187.90276

Summary: First principles approaches to the protein structure prediction problem must search through an enormous conformational space to identify low-energy, near-native structures. In this paper, we describe the formulation of the tertiary structure prediction problem as a nonlinear constrained minimization problem, where the goal is to minimize the energy of a protein conformation subject to constraints on torsion angles and interatomic distances. The core of the proposed algorithm is a hybrid global optimization method that combines the benefits of the \(\alpha \)BB deterministic global optimization approach with conformational space annealing. These global optimization techniques employ a local minimization strategy that combines torsion angle dynamics and rotamer optimization to identify and improve the selection of initial conformations and then applies a sequential quadratic programming approach to further minimize the energy of the protein conformations subject to constraints. The proposed algorithm demonstrates the ability to identify both lower energy protein structures, as well as larger ensembles of low-energy conformations.


90C30 Nonlinear programming
90C90 Applications of mathematical programming
Full Text: DOI


[1] Levinthal, C.: How to fold graciously. In: Debrunner, P., Tsibris, J.C.M., Münck, E. (eds.) Mossbauer Spectroscopy in Biological Systems, pp. 22–24. University of Illinois Press, Urbana (1969)
[2] Anfinsen, C.B.: Principles that govern the folding of protein chains. Science 181(4096), 223–230 (1973)
[3] Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J.: Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990)
[4] Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J.: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997)
[5] Karplus, K., Barret, Ch., Hughey, R.: Hidden Markov models for detecting remote protein homologies. Bioinformatics 14(10), 846–856 (1998)
[6] Rychlewski, L., Jaroszewski, L., Li, W., Godzik, A.: Comparison of sequence profiles. strategies for structural predictions using sequence information. Proteome Sci. 9, 232–241 (2000)
[7] Narayana, S.V., Argos, P.: Residue contacts in protein structures and implications for protein folding. Int. J. Pept. Protein Res. 24, 25–39 (1984)
[8] Bowie, J.U., Lüthy, R., Eisenberg, D.: A method to identify protein sequences that fold into a known three-dimensional structure. Science 253, 164–170 (1991)
[9] Godzik, A., Kolinski, A., Skolnick, J.: Topology fingerprint approach to the inverse folding problem. J. Mol. Biol. 227, 227–238 (1992)
[10] Chothia, C.: One thousand families for the molecular biologist. Nature 357, 543–544 (1992)
[11] Grant, A., Lee, D., Orengo, C.: Progress towards mapping the universe of protein folds. Genome Biol. 5, 107 (2004)
[12] Jones, D.T.: GenTHREADER: An efficient and reliable protein fold recognition method for genomic sequences. J. Mol. Biol. 287, 797–815 (1999)
[13] Skolnick, J., Zhang, Y., Arakaki, A.K., Kolinski, A., Boniecki, M., Szilágyi, A., Kihara, D.: TOUCHSTONE: A unified approach to protein structure prediction. Proteins Struct. Funct. Bioinf. 53, 469–479 (2003)
[14] Skolnick, J., Kihara, D., Zhang, Y.: Development and large scale benchmark testing of the PROSPECTOR_3 threading algorithm. Proteins Struct. Funct. Bioinf. 56, 502–518 (2004)
[15] Xu, Y., Xu, D.: Protein threading using PROSPECT: Design and evolution. Proteins Struct. Funct. Bioinf. 40, 343–354 (2000)
[16] Xu, J., Li, M., Kim, D., Xu, Y.: RAPTOR: Optimal protein threading by linear programming. J. Bioinf. Comput. Biol. 1, 95–117 (2003) · Zbl 02178419
[17] Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E.: The protein data bank. Nucleic Acids Res. 28(1), 235–242 (2000) · Zbl 05434923
[18] Simons, K.T., Kooperberg, C., Huang, C., Baker, D.: Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J. Mol. Biol. 268, 209–225 (1997)
[19] Rohl, C.A., Strauss, C.E.M., Chivian, D., Baker, D.: Modeling structurally variable regions in homologous proteins with Rosetta. Proteins Struct. Funct. Bioinf. 55, 656–677 (2004)
[20] Zhang, Y., Skolnick, J.: Tertiary structure predictions on a comprehensive benchmark of medium to large size proteins. Biophys. J. 87, 2647–2655 (2004)
[21] Zhang, Y., Skolnick, J.: Automated structure prediction of weakly homologous proteins on a genomic scale. Proc. Natl. Acad. Sci. 101, 7594–7599 (2004)
[22] Zhang, Y., Skolnick, J.: SPICKER: A clustering approach to identify near-native protein folds. J. Comput. Chem. 25, 865–871 (2004) · Zbl 05429619
[23] Wu, S., Skolnick, J., Zhang, Y.: Ab initio modeling of small proteins by iterative TASSER simulations. BMC Biol. 5, 17–26 (2007)
[24] Xia, Y., Huang, E.S., Levitt, M., Samudrala, R.: Ab initio construction of protein tertiary structure using a hierarchical approach. J. Mol. Biol. 300, 171–185 (2000)
[25] Kussel, E., Shimada, J., Shakhnovich, E.I.: A structure-based method for derivation of all-atom potentials for protein folding. Proc. Natl. Acad. Sci. 999, 5343–5348 (2002)
[26] Ozkan, S.B., Wu, G.A., Chodera, J.D., Dill, K.A.: Protein folding by zipping and assembly. Proc. Natl. Acad. Sci. 104, 11987–11992 (2007)
[27] Srinivasan, R., Rose, G.D.: LINUS: A hierarchic procedure to predict the fold of a protein. Proteins Struct. Funct. Gen. 22, 81–89 (1995)
[28] Srinivasan, R., Rose, G.D.: Ab initio prediction of protein structure using LINUS. Proteins Struct. Funct. Bioinf. 47, 489–495 (2002)
[29] Zagrovic, B., Snow, C.D., Shirts, M.R., Pande, V.S.: Simulation of folding of a small alpha-helical protein in atomistic detail using worldwide-distributed computing. J. Mol. Biol. 323, 927–937 (2002)
[30] Liwo, A., Arlukowicz, P., Czaplewski, C., Oldziej, S., Pillardy, J., Scheraga, H.A.: A method for optimizing potential-energy functions by hierarchical design of the potential-energy landscape: Application to the UNRES force field. Proc. Natl. Acad. Sci. 99, 1937–1942 (2002)
[31] Liwo, A., Oldziez, S., Pincus, M.R., Wawak, R.J., Rackovsky, S., Scheraga, H.A.: A united-residue force field for off-lattice protein-structure simulations. I. Functional forms and parameters of long-range side-chain interaction potentials from protein crystal data. J. Comput. Chem. 18, 849–873 (1997)
[32] Liwo, A., Pincus, M.R., Wawak, R.J., Rackovsky, S., Oldziej, S., Scheraga, H.A.: A united-residue force field for off-lattice protein structure simulations. II. Parameterization of short-range interactions and determination of weights of energy terms by z-score optimization. J. Comput. Chem. 18, 874–887 (1997)
[33] Lee, J., Scheraga, H.A., Rackovsky, S.: New optimization method for conformational energy calculations on polypeptides: Conformational space annealing. J. Comput. Chem. 18, 1222–1232 (1997)
[34] Lee, J., Pillardy, J., Czaplewski, C., Arnautova, Y., Ripoll, D.R., Liwo, A., Gibson, K.D., Wawak, R.J., Scheraga, H.A.: Efficient parallel algorithms in global optimization of potential energy functions for peptides, proteins and crystals. Comput. Phys. Commun. 128, 399–411 (2000) · Zbl 0953.92003
[35] Czaplewski, C., Liwo, A., Pillardy, J., Oldziej, S., Scheraga, H.A.: Improved conformational space annealing method to treat beta-structure with the UNRES force-field and to enhance scalability of parallel implementation. Polymer 45, 677–686 (2004)
[36] Nanias, M., Chinchio, M., Oldziej, S., Czaplewski, C., Scheraga, H.A.: Protein structure prediction with the UNRES force-field using replica-exchange Monte Carlo-with-minimization; comparison with MCM, CSA, and CFMC. J. Comput. Chem. 26, 1472–1486 (2005) · Zbl 05429664
[37] Klepeis, J.L., Floudas, C.A.: ASTRO-FOLD: A combinatorial and global optimization framework for ab initio prediction of three-dimensional structures of proteins from the amino acid sequence. Biophys. J. 85, 2119–2146 (2003)
[38] Klepeis, J.L., Floudas, C.A.: Ab initio prediction of helical segments in polypeptides. J. Comput. Chem. 23(2), 245–266 (2002) · Zbl 05428635
[39] Klepeis, J.L., Floudas, C.A.: Prediction of beta-sheet topology and disulfide bridges in polypeptides. J. Comput. Chem. 24, 191–208 (2003) · Zbl 05428637
[40] Klepeis, J.L., Floudas, C.A.: Ab initio tertiary structure prediction of proteins. J. Glob. Optim. 25, 113–140 (2003) · Zbl 1045.92018
[41] Klepeis, J.L., Pieja, M.T., Floudas, C.A.: A new class of hybrid global optimization algorithms for peptide structure prediction: Integrated hybrids. Comput. Phys. Commun. 151, 121–140 (2003) · Zbl 1196.90134
[42] Klepeis, J.L., Pieja, M.T., Floudas, C.A.: Hybrid global optimization algorithms for protein structure prediction: Alternating hybrids. Biophys. J. 84, 869–882 (2003) · Zbl 1196.90134
[43] Klepeis, J.L., Wei, Y.N., Hecht, M.H., Floudas, C.A.: Ab initio prediction of the three-dimensional structure of a de novo designed protein: A double-blind case study. Proteins Struct. Funct. Bioinf. 58, 560–570 (2005)
[44] Dunbrack, R.L.: Sequence comparison and protein structure prediction. Curr. Opin. Struct. Biol. 16, 374–384 (2006)
[45] Bujnicki, J.M.: Protein structure prediction by recombination of fragments. Chem. Bio. Chem. 7, 19–27 (2006)
[46] Floudas, C.A., Fung, H.K., McAllister, S.R., Mönnigmann, M., Rajgaria, R.: Advances in protein structure prediction and de novo protein design: A review. Chem. Eng. Sci. 61, 966–988 (2006)
[47] Floudas, C.A.: Computational methods in protein structure prediction. Biotechnol. Bioeng. 97, 207–213 (2007)
[48] Cornell, W.D., Cieplak, P., Bayly, C.I., Gould, I.R., Merz, K.M. Jr., Ferguson, D.M., Spellmeyer, D.C., Fox, T., Caldwell, J.W., Kollman, P.A.: A second generation force field for the simulation of proteins, nucleic acids, and organic molecules. J. Am. Chem. Soc. 117, 5179–5197 (1995)
[49] MacKerell, A.D. Jr., Bashford, D., Bellott, M., Dunbrack, R.L. Jr., Evanseck, J.D., Field, M.J., Fischer, S., Gao, J., Guo, H., Ha, S., Joseph-McCarthy, D., Kuchnir, L., Kuczera, K., Lau, F.T.K., Mattos, C., Michnick, S., Ngo, T., Nguyen, D.T., Prodhom, B., Reiher, W.E. III, Roux, B., Schlenkrich, M., Smith, J.C., Stote, R., Straub, J., Watanabe, M., Wiórkiewicz-Kuczera, J., Yin, D., Karplus, M.: All-atom empirical potential for molecular modeling and dynamics studies of proteins. J. Phys. Chem. B 102, 3586–3616 (1998)
[50] Momany, F.A., McGuire, R.F., Burgess, A.W., Scheraga, H.A.: Energy parameters in polypeptides. VII. Geometric parameters, partial atomic charges, nonbonded interactions, hydrogen bond interactions, and intrinsic torsional potentials for the naturally occurring amino acids. J. Phys. Chem. 79, 2361–2381 (1975)
[51] Némethy, G., Gibson, K.D., Palmer, K.A., Yoon, C.N., Paterlini, G., Zagari, A., Rumsey, S., Scheraga, H.A.: Energy parameters in polypeptides. 10. Improved geometrical parameters and nonbonded interactions for use in the ECEPP/3 algorithm, with application to proline-containing peptides. J. Phys. Chem. 96, 6472–6484 (1992)
[52] Arnautova, Y.A., Jagielska, A., Scheraga, H.A.: A new force field (ECEPP-05) for peptides, proteins, and organic molecules. J. Phys. Chem. B 110, 5025–5044 (2006)
[53] Ortiz, A.R., Kolinski, A., Skolnick, J.: Fold assembly of small proteins using Monte Carlo simulations driven by restraints derived from multiple sequence alignments. J. Mol. Biol. 277(2), 419–448 (1998)
[54] McAllister, S.R., Mickus, B.E., Klepeis, J.L., Floudas, C.A.: A novel approach for alpha-helical topology prediction in globular proteins: Generation of interhelical restraints. Proteins Struct. Funct. Bioinf. 65, 930–952 (2006)
[55] Klepeis, J.L., Floudas, C.A.: Analysis and prediction of loop segments in protein structures. Comput. Chem. Eng. 29, 423–436 (2005)
[56] Mönnigmann, M., Floudas, C.A.: Protein loop structure prediction with flexible stem geometries. Proteins Struct. Funct. Bioinf. 61, 748–762 (2005)
[57] Creighton, T.E.: Proteins: Structures and Molecular Properties, 2nd edn. Freeman, New York (1993)
[58] Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming: Theory and Algorithms, 2nd edn. Wiley, New York (1993) · Zbl 0774.90075
[59] Fletcher, R.: Practical Methods of Optimization, 2nd edn. Wiley, New York (1987) · Zbl 0905.65002
[60] Gill, P.E., Murray, W., Wright, M.H.: Practical Optimization. Academic Press, Burlington (1981)
[61] Gill, P.E., Murray, W., Saunders, M., Wright, M.H.: NPSOL 4.0 User’s Guide. Systems Optimization Laboratory, Department of Operations Research, Standford University, CA (1986)
[62] Blumenthal, L.M.: Theory and Applications of Distance Geometry. Cambridge University Press, Cambridge (1953) · Zbl 0050.38502
[63] Crippen, G.M.: A novel approach to the calculation of conformation: Distance geometry. J. Comput. Phys. 26, 449–452 (1977) · Zbl 0373.15007
[64] Crippen, G.M., Havel, T.F.: Distance Geometry and Molecular Conformation. Wiley, New York (1988) · Zbl 1066.51500
[65] Moré, J.J., Wu, Z.: Distance geometry optimization for protein structures. J. Glob. Optim. 15, 219–234 (1999) · Zbl 0944.92012
[66] Allen, M.P., Tildesley, D.J.: Computer Simulation of Liquids. Clarendon Press, Oxford (1987) · Zbl 0703.68099
[67] Brünger, A.T.: X-PLOR, Version 3.1. A System for X-Ray Crystallography and NMR. Yale University Press, New Haven (1992)
[68] Braun, W., Go, N.: Calculation of protein conformations by proton-proton distance constraints. A new efficient algorithm. J. Mol. Biol. 186, 611–626 (1985)
[69] Güntert, P., Wüthrich, K.: Improved efficiency of protein structure calculations from NMR data using the program DIANA with redundant dihedral angle constraints. J. Biomol. NMR 1, 446–456 (1991)
[70] Jain, A., Vaidehi, N., Rodriguez, G.: A fast recursive algorithm for molecular dynamics simulation. J. Comput. Phys. 106, 258–268 (1993) · Zbl 0777.65038
[71] Güntert, P., Mumenthaler, C., Wüthrich, K.: Torsion angle dynamics for NMR structure calculation with the new program DYANA. J. Mol. Biol. 273, 283–298 (1997)
[72] Stein, E.G., Rice, L.M., Brünger, A.T.: Torsion angle dynamics as a new efficient tool for NMR structure calculation. J. Magn. Reson. 124, 154–164 (1997)
[73] Güntert, P.: Structure calculation of biological macromolecules from NMR data. Q. Rev. Biophys. 31, 145–237 (1998)
[74] Ponder, J.W., Richard, F.M.: Tertiary templates for proteins. use of packing criteria in the enumeration of allowed sequences for different structural classes. J. Mol. Biol. 193, 775–791 (1987)
[75] Dunbrack, R.L., Karplus, M.: Backbone-dependent rotamer library for proteins. Application to side-chain prediction. J. Mol. Biol. 230, 543–574 (1993)
[76] Lovell, S.C., Word, J.M., Richardson, J.S., Richardson, D.C.: The penultimate rotamer library. Proteins Struct. Funct. Gen. 40, 389–408 (2000)
[77] Dunbrack, R.L.: Rotamer libraries in the 21st century. Curr. Opin. Struct. Biol. 12, 431–440 (2002)
[78] Desmet, J., De Maeyer, M., Hazes, B., Lasters, I.: The dead-end elimination theorem and its use in protein sidechain positioning. Nature 356, 539–542 (1992)
[79] Looger, L.L., Hellinga, H.W.: Generalized dead-end elimination algorithms make large-scale protein side-chain prediction tractable: Implications for protein design and structural genomics. J. Mol. Biol. 307, 429–445 (2001)
[80] Leach, A.R., Lemon, A.P.: Exploring the conformational space of protein side chains using dead-end elimination and the A* algorithm. Proteins Struct. Funct. Gen. 33, 227–239 (1998)
[81] Xie, W., Sahinidis, N.V.: Residue-rotamer-reduction algorithm for the protein side-chain conformation problem. Bioinformatics 22, 188–194 (2006) · Zbl 05325777
[82] Eriksson, O., Zhou, Y., Elofsson, A.: Side chain-positioning as an integer programming problem. In: WABI ’01: Proceedings of the First International Workshop on Algorithms in Bioinformatics, pp. 128–141 (2001) · Zbl 1128.90563
[83] Kingsford, C.L., Chazelle, B., Singh, M.: Solving and analyzing side-chain positioning problems using linear and integer programming. Bioinformatics 21, 1028–1036 (2005)
[84] Canutescu, A.A., Shelenkov, A.A., Dunbrack, R.L.: A graph-theory algorithm for rapid protein side-chain prediction. Proteome Sci. 12, 2001–2014 (2003)
[85] Holm, L., Sander, C.: Fast and simple Monte Carlo algorithm for side-chain optimization in proteins: Application to model building by homology. Proteins Struct. Funct. Gen. 14, 213–223 (1992)
[86] Liang, S., Grishin, N.V.: Side-chain modeling with an optimized scoring function. Proteome Sci. 11, 322–331 (2002)
[87] Xiang, Z., Honig, B.: Extending the accuracy limits of prediction for side-chain conformations. J. Mol. Biol. 311, 421–430 (2001)
[88] Tufféry, P., Etchebest, C., Hazout, S., Lavery, R.: A new approach to the rapid determination of protein side chain conformations. J. Biomol. Struct. Dyn. 8, 1267–1289 (1991)
[89] Lee, C.: Predicting protein mutant energetics by self-consistent ensemble optimization. J. Mol. Biol. 236, 918–939 (1994)
[90] Desmet, J., Spriet, J., Lasters, I.: Fast and accurate side-chain topology and energy refinement as a new method for protein structure optimization. Proteins Struct. Funct. Gen. 48, 31–43 (2002)
[91] Levitt, M., Gerstein, M., Huang, E., Subbiah, S., Tsai, J.: Protein folding: The endgame. Annu. Rev. Biochem. 66, 549–579 (1997)
[92] Baker, D.: Prediction and design of macromolecular structures and interactions. Philos. Trans. R. Soc. B 361, 459–463 (2006)
[93] Adjiman, C.S., Androulakis, I.P., Maranas, C.D., Floudas, C.A.: A global optimization method, {\(\alpha\)}BB, for process design. Comput. Chem. Eng. 20, S419–S424 (1996)
[94] Adjiman, C.S., Androulakis, I.P., Floudas, C.A.: Global optimization of MINLP problems in process synthesis and design. Comput. Chem. Eng. 21, S445–S450 (1997)
[95] Adjiman, C.S., Dallwig, S., Floudas, C.A., Neumaier, A.: A global optimization method for general twice-differentiable NLPs. i. Theoretical advances. Comput. Chem. Eng. 22, 1137–1158 (1998)
[96] Adjiman, C.S., Androulakis, I.P., Floudas, C.A.: A global optimization method for general twice-differentiable NLPs. ii. Implementation and computational results. Comput. Chem. Eng. 22, 1159–1179 (1998)
[97] Androulakis, I.P., Maranas, C.D., Floudas, C.A.: {\(\alpha\)}BB: A global optimization method for general constrained nonconvex problems. J. Glob. Optim. 7, 337–363 (1995) · Zbl 0846.90087
[98] Floudas, C.A.: Deterministic Global Optimization: Theory, Methods and Applications. Nonconvex Optimization and its Applications. Kluwer Academic, Dordrecht (2000)
[99] Lee, J., Scheraga, H.A., Rackovsky, S.: Conformational analysis of the 20-residue membrane-bound portion of melittin by conformational space annealing. Biopolymers 46, 103–115 (1998)
[100] Lee, J., Scheraga, H.A.: Conformational space annealing by parallel computations: Extensive conformational search of met-enkephalin and the 20-residue membrane-bound portion of melittin. Int. J. Quant. Chem. 75, 255–265 (1999)
[101] Ripoll, D., Liwo, A., Scheraga, H.A.: New developments of the electrostatically driven Monte Carlo method: Tests on the membrane-bound portion of melittin. Biopolymers 46, 117–126 (1998)
[102] Kirkpatrick, S., Gelatt, C.D. Jr., Vecchi, M.P.: Optimization by simulated annealing. Science 220, 671–680 (1983) · Zbl 1225.90162
[103] Gronenborn, A.M., Filpula, D.R., Essig, N.Z., Achari, A., Whitlow, M., Wingfield, P.T., Clore, G.M.: A novel, highly stable fold of the immunoglobulin binding domain of streptococcal protein G. Science 253, 657–661 (1991)
[104] Gallagher, T., Alexander, P., Bryan, P., Gilliland, G.L.: Two crystal structures of the B1 immunoglobulin-binding domain of streptococcal protein G and comparison with NMR. Biochem. 33, 4721–4729 (1994)
[105] Kambach, C., Walke, S., Young, R., Avis, J.M., de la Fortelle, E., Raker, V.A., Lührmann, R., Li, J., Nagai, K.: Crystal structures of two Sm protein complexes and their implications for the assembly of the spliceosomal snRNPs. Cell 96, 375–387 (1999)
[106] Deisenhofer, J., Steigemann, W.: Crystallographic refinement of structure of bovine pancreatic trypsin-inhibitor at 1.5 Å resolution. Acta Crystallogr. Sect. B 31, 238–250 (1975)
[107] Wlodawer, A., Walter, J., Huber, R., Sjolin, L.: Structure of bovine pancreatic trypsin-inhibitor: Results of joint neutron and x-ray refinement of crystal form II. J. Mol. Biol. 180, 301–329 (1984)
[108] Campos-Olivas, R., Hörr, I., Bormann, C., Jung, G., Gronenborn, A.M.: Solution structure, backbone dynamics and chitin binding properties of the anti-fungal protein from Streptomyces tendae TÜ901. J. Mol. Biol. 308, 765–782 (2001)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.