RNA folding kinetics using Monte Carlo and Gillespie algorithms.

*(English)*Zbl 1392.92066Summary: RNA secondary structure folding kinetics is known to be important for the biological function of certain processes, such as the hok/sok system in E. coli. Although linear algebra provides an exact computational solution of secondary structure folding kinetics with respect to the Turner energy model for tiny (\(\approx 20\) nt) RNA sequences, the folding kinetics for larger sequences can only be approximated by binning structures into macrostates in a coarse-grained model, or by repeatedly simulating secondary structure folding with either the Monte Carlo algorithm or the Gillespie algorithm. Here we investigate the relation between the Monte Carlo algorithm and the Gillespie algorithm. We prove that asymptotically, the expected time for a \(K\)-step trajectory of the Monte Carlo algorithm is equal to \(\langle N \rangle \) times that of the Gillespie algorithm, where \(\langle N \rangle \) denotes the Boltzmann expected network degree. If the network is regular (i.e. every node has the same degree), then the mean first passage time (MFPT) computed by the Monte Carlo algorithm is equal to MFPT computed by the Gillespie algorithm multiplied by \(\langle N \rangle \); however, this is not true for non-regular networks. In particular, RNA secondary structure folding kinetics, as computed by the Monte Carlo algorithm, is not equal to the folding kinetics, as computed by the Gillespie algorithm, although the mean first passage times are roughly correlated. Simulation software for RNA secondary structure folding according to the Monte Carlo and Gillespie algorithms is publicly available, as is our software to compute the expected degree of the network of secondary structures of a given RNA sequence – see http://bioinformatics.bc.edu/clote/RNAexpNumNbors.

##### MSC:

92D20 | Protein sequences, DNA sequences |

60J22 | Computational methods in Markov chains |

68N19 | Other programming paradigms (object-oriented, sequential, concurrent, automatic, etc.) |

65C05 | Monte Carlo methods |

##### Keywords:

RNA secondary structure; mean first passage time; refolding kinetics; Metropolis algorithm; Gillespie algorithm; mean recurrence time; expected network degree
PDF
BibTeX
XML
Cite

\textit{P. Clote} and \textit{A. H. Bayegan}, J. Math. Biol. 76, No. 5, 1195--1227 (2018; Zbl 1392.92066)

Full Text:
DOI

##### References:

[1] | Abkevich, VI; Gutin, AM; Shakhnovich, EI, How the first biopolymers could have evolved, Proc Natl Acad Sci USA, 93, 839-844, (1996) |

[2] | Abkevich VI, Gutin AM, Shakhnovich EI (1997) Computer simulations of prebiotic evolution. Pac Symp Biocomput 27-38 |

[3] | Anfinsen, CB, Principles that govern the folding of protein chains, Science, 181, 223-230, (1973) |

[4] | Aviram, I; Veltman, I; Churkin, A; Barash, D, Efficient procedures for the numerical simulation of mid-size RNA kinetics, Algorithms Mol Biol, 7, 24, (2012) |

[5] | Bowman, GR, A tutorial on building Markov state models with msmbuilder and coarse-graining them with BACE, Methods Mol Biol, 141-158, 2014, (1084) |

[6] | Bowman, GR; Huang, X; Pande, VS, Using generalized ensemble simulations and Markov state models to identify conformational states, Methods, 49, 197-201, (2009) |

[7] | Bowman, GR; Pande, VS, Protein folded states are kinetic hubs, Proc Natl Acad Sci USA, 107, 10890-10895, (2010) |

[8] | Bryngelson, JD; Onuchic, JN; Socci, ND; Wolynes, PG, Funnels, pathways, and the energy landscape of protein folding: a synthesis, Proteins, 21, 167-195, (1995) |

[9] | Choi, HM; Beck, VA; Pierce, NA, Multiplexed in situ hybridization using hybridization chain reaction, Zebrafish, 11, 488-489, (2014) |

[10] | Clote, P, Expected degree for RNA secondary structure networks, J Comput Chem, 36, 103-17, (2015) |

[11] | Clote P, Backofen R (2000) Computational molecular biology: an introduction. Wiley, Hoboken · Zbl 0955.92013 |

[12] | Clote, P; Bayegan, A, Network properties of the ensemble of RNA structures, PLoS ONE, 10, e0139476, (2015) |

[13] | Dotu, I; Garcia-Martin, JA; Slinger, BL; Mechery, V; Meyer, MM; Clote, P, Complete RNA inverse folding: computational design of functional hammerhead ribozymes, Nucleic Acids Res, 42, 11752-11762, (2015) |

[14] | Dykeman, EC, An implementation of the Gillespie algorithm for RNA kinetics with logarithmic time update, Nucleic Acids Res, 43, 5708-5715, (2015) |

[15] | Esmaili-Taheri, A; Ganjtabesh, M, ERD: a fast and reliable tool for RNA design including constraints, BMC Bioinform, 16, 20, (2015) |

[16] | Feller W (1968) An introduction to probability theory and its applications, vol 1, 3rd edn. Wiley, New York · Zbl 0155.23101 |

[17] | Flamm, C; Fontana, W; Hofacker, IL; Schuster, P, RNA folding at elementary step resolution, RNA, 6, 325-338, (2000) |

[18] | Fusy E, Clote P (2012) Combinatorics of locally optimal RNA secondary structures. J Math Biol 68(1-2):341-75 · Zbl 1288.05018 |

[19] | Garcia-Martin, JA; Clote, P; Dotu, I, Rnaifold: A constraint programming algorithm for RNA inverse folding and molecular design, J Bioinform Comput Biol, 11, 1350001, (2013) |

[20] | Gardner, PP; Daub, J; Tate, J; Moore, BL; Osuch, IH; Griffiths-Jones, S; Finn, RD; Nawrocki, EP; Kolbe, DL; Eddy, SR; Bateman, A, Rfam: wikipedia, clans and the “decimal” release, Nucleic Acids Res, 39, d141-d145, (2011) |

[21] | Geis, M; Flamm, C; Wolfinger, MT; Tanzer, A; Hofacker, IL; Middendorf, M; Mandl, C; Stadler, PF; Thurner, C, Folding kinetics of large rnas, J Mol Biol, 379, 160-173, (2008) |

[22] | Gillespie, DT, A general method for numerically simulating the stochastic time evolution of coupled chemical reactions, J Comput Phys, 22, 403-434, (1976) |

[23] | Harrigan, MP; Sultan, MM; Hernandez, CX; Husic, BE; Eastman, P; Schwantes, CR; Beauchamp, KA; McGibbon, RT; Pande, VS, Msmbuilder: statistical models for biomolecular dynamics, Biophys J, 112, 10-15, (2017) |

[24] | Hastings, WK, Monte Carlo sampling methods using Markov chains and their applications, Biometrika, 57, 97-109, (1970) · Zbl 0219.65008 |

[25] | Huang X, Yao Y, Bowman GR, Sun J, Guibas LJ, Carlsson G, Pande VS (2010) Constructing multi-resolution Markov State Models (MSMs) to elucidate RNA hairpin folding mechanisms. Pac Symp Biocomput 228-239 |

[26] | Huynen, M; Gutell, R; Konings, D, Assessing the reliability of RNA folding using statistical mechanics, J Mol Biol, 267, 1104-1112, (1997) |

[27] | Tinoco, I; Schmitz, M; Cera, E (ed.), Thermodynamics of formation of secondary structure in nucleic acids, 131-176, (2000), Oxford |

[28] | Isambert, H; Siggia, ED, Modeling RNA folding paths with pseudoknots: application to hepatitis \(delta\) virus ribozyme, Proc Natl Acad Sci USA, 97, 6515-6520, (2000) |

[29] | Juhling, F; Morl, M; Hartmann, RK; Sprinzl, M; Stadler, PF; Putz, J, Trnadb 2009: compilation of trna sequences and trna genes, Nucleic Acids Res, 37, d159-d162, (2009) |

[30] | Klemm K, Flamm C, Stadler PF (2008) Funnels in energy landscapes. Eur Phys J B 63(3):387-391 · Zbl 1189.60137 |

[31] | Levinthal, C, Are there pathways for protein folding?, Journal de Chimie Physique, 65, 44-45, (1968) |

[32] | Levy, RM; Dai, W; Deng, NJ; Makarov, DE, How long does it take to equilibrate the unfolded state of a protein?, Protein Sci, 22, 1459-1465, (2013) |

[33] | Lorenz, R; Bernhart, SH; Höner zu Siederdissen, C; Tafer, H; Flamm, C; Stadler, PF; Hofacker, IL, Viennarna package 2.0., Algorithms Mol Biol, 6, 26, (2011) |

[34] | Metropolis, N; Rosenbluth, AW; Rosenbluth, MN; Teller, AH; Teller, E, Equation of state calculations by fast computing machines, J Chem Phys, 21, 1087-1092, (1953) |

[35] | Meyer, CD, The role of the group inverse in the theory of finite Markov chains, SIAM Rev, 17, 443-464, (1975) · Zbl 0313.60044 |

[36] | Moulton, V; Zuker, M; Steel, M; Pointon, R; Penny, D, Metrics on RNA secondary structures, J Comput Biol, 7, 277-292, (2000) |

[37] | Nawrocki, EP; Burge, SW; Bateman, A; Daub, J; Eberhardt, RY; Eddy, SR; Floden, EW; Gardner, PP; Jones, TA; Tate, J; Finn, RD, Rfam 12.0: updates to the RNA families database, Nucleic Acids Res, 43, d130-d137, (2015) |

[38] | Nussinov, R; Jacobson, AB, Fast algorithm for predicting the secondary structure of single stranded RNA, Proc Natl Acad Sci USA, 77, 6309-6313, (1980) |

[39] | Pande, VS; Beauchamp, K; Bowman, GR, Everything you wanted to know about Markov state models but were afraid to ask, Methods, 52, 99-105, (2010) |

[40] | Šali, A; Shakhnovich, E; Karplus, M, How does a protein fold?, Nature, 369, 248-251, (1994) |

[41] | Šali, A; Shakhnovich, E; Karplus, M, Kinetics of protein folding: a lattice model study of the requirements for folding to the native state, J Mol Biol, 235, 1614-1636, (1994) |

[42] | Schmitz, M; Steger, G, Description of RNA folding by “simulated annealing”, J Mol Biol, 255, 254-266, (1996) |

[43] | Senter, E; Clote, P, Fast, approximate kinetics of RNA folding, J Comput Biol, 22, 124-144, (2015) · Zbl 1308.92079 |

[44] | Senter, E; Sheikh, S; Dotu, I; Ponty, Y; Clote, P, Using the fast Fourier transform to accelerate the computational search for RNA conformational switches, PLoS ONE, 7, e50506, (2012) |

[45] | Senter, E; Dotu, I; Clote, P, RNA folding pathways and kinetics using 2D energy landscapes, J Math Biol, 70, 173-196, (2015) · Zbl 1308.92079 |

[46] | Stein, PR; Waterman, MS, On some new sequences generalizing the Catalan and Motzkin numbers, Discrete Math, 26, 261-272, (1978) · Zbl 0405.10009 |

[47] | Swope, WC; Pitera, JW; Suits, F, Describing protein folding kinetics by molecular dynamics simulations. 1. theory, J Phys Chem B, 108, 6571-6581, (2010) |

[48] | Taneda, A, MODENA: a multi-objective RNA inverse folding, Adv Appl Bioinform Chem, 4, 1-12, (2011) |

[49] | Tang X, Thomas S, Tapia L, Amato NM (2007) Tools for simulating and analyzing RNA folding kinetics.In: Speed T, Huang H (eds) Research in computational molecular biology. RECOMB 2007. Lecture notes in computer science, vol 4453. Springer, Berlin, Heidelberg, pp 268-282 |

[50] | Tang, X; Thomas, S; Tapia, L; Giedroc, DP; Amato, NM, Simulating RNA folding kinetics on approximated energy landscapes, J Mol Biol, 381, 1055-1067, (2008) |

[51] | Turner, DH; Mathews, DH, NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure, Nucleic Acids Res, 38, d280-d282, (2010) |

[52] | Weber, JK; Pande, VS, Characterization and rapid sampling of protein folding Markov state model topologies, J Chem Theory Comput, 7, 3405-3411, (2011) |

[53] | Wolfinger, M; Svrcek-Seiler, WA; Flamm, C; Stadler, PF, Efficient computation of RNA folding dynamics, J Phys A Math Gen, 37, 4731-4741, (2004) · Zbl 1050.81729 |

[54] | Wolfinger, MT; Svrcek-Seiler, WA; Flamm, C; Hofacker, IL; Stadler, PF, Efficient folding dynamics of RNA secondary structures, J Phys A Math Gen, 37, 4731-4741, (2004) · Zbl 1050.81729 |

[55] | Wuchty, S; Fontana, W; Hofacker, IL; Schuster, P, Complete suboptimal folding of RNA and the stability of secondary structures, Biopolymers, 49, 145-165, (1999) |

[56] | Xu, X; Yu, T; Chen, SJ, Understanding the kinetic mechanism of RNA single base pair formation, Proc Natl Acad Sci USA, 113, 116-121, (2016) |

[57] | Zadeh, JN; Wolfe, BR; Pierce, NA, Nucleic acid sequence design via efficient ensemble defect optimization, J Comput Chem, 32, 439-452, (2011) |

[58] | Zhang, W; Chen, SJ, RNA hairpin-folding kinetics, Proc Natl Acad Sci USA, 99, 1931-1936, (2002) |

[59] | Zuker, M; Stiegler, P, Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information, Nucleic Acids Res, 9, 133-148, (1981) |

[60] | Zuker, M; Sankoff, D, RNA secondary structures and their prediction, Bull Biol, 46, 591-621, (1984) · Zbl 0548.92007 |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.