
Information-theoretic uncertainty of SCFG-modeled folding space of the non-coding RNA. (English) Zbl 1406.92466

Summary: RNA secondary structure ensembles define probability distributions over the alternative equilibrium secondary structures of an RNA sequence. Shannon’s entropy is a measure of the amount of diversity present in any ensemble. In this work, Shannon’s entropy of the stochastic context-free grammar (SCFG) ensemble on an RNA sequence is derived and implemented in polynomial time for both structurally ambiguous and unambiguous grammars. MicroRNA sequences generally have low folding entropy, as previously discovered. Surprisingly, signs of significantly high folding entropy were observed in certain ncRNA families. More effective models coupled with targeted randomization tests can lead to better insight into the folding features of these families. Availability: http://www.plantbio.uga.edu/~russell/index.php?s=1&n=5&r=0.
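
As an illustration of how such an entropy can be obtained in polynomial time, the following is a minimal sketch, not the authors' implementation. It assumes a toy structurally unambiguous grammar S → . S | ( S ) S | ε with hypothetical rule probabilities (`P_UNPAIRED`, `P_PAIR`, `P_END`), uniform emissions, and no minimum hairpin loop length. Each chart cell of an inside-style recursion carries the pair (Z, R), where Z is the sum of derivation weights w and R is the sum of w log w; the entropy of the ensemble conditioned on the sequence is then log Z − R/Z. The function name `folding_entropy` and all parameter values are assumptions made for this example.

```python
import math

# Hypothetical toy parameters (not taken from the paper).
P_UNPAIRED, P_PAIR, P_END = 0.5, 0.3, 0.2
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}


def elem(w):
    """Lift a weight w to the pair (w, w * log w)."""
    return (w, w * math.log(w)) if w > 0 else (0.0, 0.0)


def add(a, b):
    """Sum over alternative derivations."""
    return (a[0] + b[0], a[1] + b[1])


def mul(a, b):
    """Combine independent sub-derivations: (z1*z2, z1*r2 + r1*z2)."""
    return (a[0] * b[0], a[0] * b[1] + a[1] * b[0])


def folding_entropy(seq):
    """Entropy (in nats) of the toy SCFG structure ensemble on seq."""
    n = len(seq)
    # inside[i][j] holds (Z, R) for the half-open interval seq[i:j].
    inside = [[(0.0, 0.0)] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        inside[i][i] = elem(P_END)  # S -> epsilon
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # S -> . S : position i unpaired (uniform emission 1/4).
            total = mul(elem(P_UNPAIRED * 0.25), inside[i + 1][j])
            # S -> ( S ) S : position i paired with position k.
            for k in range(i + 1, j):
                if (seq[i], seq[k]) in CANONICAL:
                    pair = elem(P_PAIR / 6)  # uniform over 6 canonical pairs
                    inner_outer = mul(inside[i + 1][k], inside[k + 1][j])
                    total = add(total, mul(pair, inner_outer))
            inside[i][j] = total
    z, r = inside[0][n]
    return math.log(z) - r / z


if __name__ == "__main__":
    print(folding_entropy("GGGAAACCC") / math.log(2), "bits")
```

The recursion fills O(n²) cells with O(n) work each, so it runs in O(n³) time. Because the toy grammar assigns exactly one derivation to each structure, the derivation entropy computed here equals the structure entropy; for a structurally ambiguous grammar the two differ, and this sketch would give only the derivation entropy.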

MSC:

92D20 Protein sequences, DNA sequences
94A17 Measures of information, entropy
94A15 Information theory (general)

Software:

Mfold; GenRGenS

References:

[1] Adami, C.; Ofria, C.; Collier, T. C., Evolution of biological complexity, Proc. Natl. Acad. Sci., 97, 4463-4468 (2000)
[2] Altschul, S. F.; Erickson, B. W., Significance of nucleotide sequence alignments: a method for random sequence permutation that preserves dinucleotide and codon usage, Mol. Biol. Evol., 2, 526-538 (1985)
[3] Barash, D.; Sikorski, J.; Perry, E. B.; Nevo, E.; Nudler, E., Adaptive mutations in RNA-based regulatory mechanisms: computational and experimental investigations, Israel J. Ecol. Evol., 52 (2006)
[4] Barrandon, C.; Spiluttini, B.; Bensaude, O., Non-coding RNAs regulating the transcriptional machinery, Biol. Cell, 100, 83-95 (2008)
[5] Batey, R. T.; Rambo, R. P.; Doudna, J. A., Tertiary motifs in RNA structure and folding, Angew. Chem., Int. Ed. Engl., 38, 2326-2343 (1999)
[6] Bernauer, J.; Huang, X.; Sim, A. Y.; Levitt, M., Fully differentiable coarse-grained and all-atom knowledge-based potentials for RNA structure evaluation, RNA, 17, 1066-1075 (2011)
[7] Bocobza, S.; Adato, A.; Mandel, T.; Shapira, M.; Nudler, E.; Aharoni, A., Riboswitch-dependent gene regulation and its evolution in the plant kingdom, Genes Dev., 21, 2874-2879 (2007)
[8] Brannvall, M.; Mattsson, J. G.; Svard, S. G.; Kirsebom, L. A., RNase P RNA structure and cleavage reflect the primary structure of tRNA genes, J. Mol. Biol., 283, 771-783 (1998)
[9] Cech, T. R.; Damberger, S. H.; Gutell, R. R., Representation of the secondary and tertiary structure of group I introns, Nat. Struct. Biol., 1, 273-280 (1994)
[10] Chan, C. Y.; Ding, Y., Boltzmann ensemble features of RNA secondary structures: a comparative analysis of biological RNA sequences and random shuffles, J. Math. Biol., 56, 93-105 (2008) · Zbl 1143.92008
[11] Clote, P.; Ferre, F.; Kranakis, E.; Krizanc, D., Structural RNA has lower folding energy than random RNA of the same dinucleotide frequency, RNA, 11, 578-591 (2005)
[12] Cover, T. M.; Thomas, J. A., Elements of Information Theory (2006), Wiley-Interscience · Zbl 1140.94001
[13] D’Haeseleer, P., What are DNA sequence motifs?, Nat. Biotechnol., 24, 423-425 (2006)
[14] Ding, Y.; Lawrence, C. E., A statistical sampling algorithm for RNA secondary structure prediction, Nucleic Acids Res., 31, 7280-7301 (2003)
[15] Do, C. B.; Woods, D. A.; Batzoglou, S., CONTRAfold: RNA secondary structure prediction without physics-based models, Bioinformatics, 22, e90-e98 (2006)
[16] Dowell, R. D.; Eddy, S. R., Evaluation of several lightweight stochastic context-free grammars for RNA secondary structure prediction, BMC Bioinf., 5, 71 (2004)
[17] Du, X.; Wang, E. D., Tertiary structure base pairs between D- and TψC-loops of Escherichia coli tRNA(Leu) play important roles in both aminoacylation and editing, Nucleic Acids Res., 31, 2865-2872 (2003)
[18] Durbin, R., Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids (1998), Cambridge University Press · Zbl 0929.92010
[19] Gardner, P. P.; Daub, J.; Tate, J. G.; Nawrocki, E. P.; Kolbe, D. L.; Lindgreen, S.; Wilkinson, A. C.; Finn, R. D.; Griffiths-Jones, S.; Eddy, S. R.; Bateman, A., Rfam: updates to the RNA families database, Nucleic Acids Res., 37, D136-D140 (2009)
[20] Gardner, P. P.; Wilm, A.; Washietl, S., A benchmark of multiple sequence alignment programs upon structural RNAs, Nucleic Acids Res., 33, 2433-2439 (2005)
[21] Gilbert, S. D.; Love, C. E.; Edwards, A. L.; Batey, R. T., Mutational analysis of the purine riboswitch aptamer domain, Biochemistry, 46 (2007)
[22] Griffiths-Jones, S.; Moxon, S.; Marshall, M.; Khanna, A.; Eddy, S. R.; Bateman, A., Rfam: annotating non-coding RNAs in complete genomes, Nucleic Acids Res., 33, D121-D124 (2005)
[23] Guerrier-Takada, C.; Altman, S., A physical assay for and kinetic analysis of the interactions between M1 RNA and tRNA precursor substrates, Biochemistry, 32, 7152-7161 (1993)
[24] Hall, M. N.; Gabay, J.; Debarbouille, M.; Schwartz, M., A role for mRNA secondary structure in the control of translation initiation, Nature, 295, 616-618 (1982)
[25] Hofacker, I. L., Vienna RNA secondary structure server, Nucleic Acids Res., 31, 3429-3431 (2003)
[26] Huynen, M.; Gutell, R.; Konings, D., Assessing the reliability of RNA folding using statistical mechanics, J. Mol. Biol., 267, 1104-1112 (1997)
[27] Kazantsev, A. V.; Pace, N. R., Bacterial RNase P: a new view of an ancient enzyme, Nat. Rev. Microbiol., 4, 729-740 (2006)
[28] Knudsen, B.; Hein, J., RNA secondary structure prediction using stochastic context-free grammars and evolutionary history, Bioinformatics, 15, 446-454 (1999)
[29] Knudsen, B.; Hein, J., Pfold: RNA secondary structure prediction using stochastic context-free grammars, Nucleic Acids Res., 31, 3423-3428 (2003)
[30] Lu, Y.; Turner, R. J.; Switzer, R. L., Function of RNA secondary structures in transcriptional attenuation of the Bacillus subtilis pyr operon, Proc. Natl. Acad. Sci., 93, 14462-14467 (1996)
[31] Mathews, D. H., Using an RNA secondary structure partition function to determine confidence in base pairs predicted by free energy minimization, RNA, 10, 1178-1190 (2004)
[32] McCaskill, J. S., The equilibrium partition function and base pair binding probabilities for RNA secondary structure, Biopolymers, 29, 1105-1119 (1990)
[33] Miklos, I.; Meyer, I. M.; Nagy, B., Moments of the Boltzmann distribution for RNA secondary structures, Bull. Math. Biol., 67, 1031-1047 (2005) · Zbl 1334.92312
[34] Morris, K. V., RNA and the Regulation of Gene Expression: A Hidden Layer of Complexity (2008), Caister Academic Press
[35] Morris, K. V., Non-coding RNAs and Epigenetic Regulation of Gene Expression: Drivers of Natural Selection (2012), Caister Academic Press
[36] Niranjanakumari, S.; Stams, T.; Crary, S. M.; Christianson, D. W.; Fierke, C. A., Protein component of the ribozyme ribonuclease P alters substrate recognition by directly contacting precursor tRNA, Proc. Natl. Acad. Sci. U.S.A., 95, 15212-15217 (1998)
[37] Ponty, Y.; Termier, M.; Denise, A., GenRGenS: software for generating random genomic sequences and structures, Bioinformatics, 22, 1534-1535 (2006)
[38] Quarta, G.; Kim, N.; Izzo, J. A.; Schlick, T., Analysis of riboswitch structure and function by an energy landscape framework, J. Mol. Biol., 393, 993-1003 (2009)
[39] Repoila, F.; Darfeuille, F., Small regulatory non-coding RNAs in bacteria: physiology and mechanistic aspects, Biol. Cell, 101, 117-131 (2009)
[40] Scarabino, D.; Crisari, A.; Lorenzini, S.; Williams, K.; Tocchini-Valentini, G. P., tRNA prefers to kiss, EMBO J., 18, 4571-4578 (1999)
[41] Schneider, T.; Stephens, R., Sequence logos: a new way to display consensus sequences, Nucleic Acids Res., 18, 6097-6100 (1990)
[42] Shannon, C., A mathematical theory of communication, Bell Syst. Tech. J., 27, 379-423 (1948) · Zbl 1154.94303
[43] Shaw, T. I.; Manzour, A.; Wang, Y.; Malmberg, R. L.; Cai, L., Analyzing modular RNA structure reveals low global structural entropy in microRNA sequences, J. Bioinf. Comput. Biol., 9, 283-298 (2011)
[44] Simmonds, P.; Karakasiliotis, I.; Bailey, D.; Chaudhry, Y.; Evans, D. J.; Goodfellow, I. G., Bioinformatic and functional analysis of RNA secondary structure elements among different genera of human and animal caliciviruses, Nucleic Acids Res., 36, 2530-2546 (2008)
[45] Taft, R. J.; Pang, K. C.; Mercer, T. R.; Dinger, M.; Mattick, J. S., Non-coding RNAs: regulators of disease, J. Pathol., 220, 13-126 (2010)
[46] Tinoco, I.; Bustamante, C., How RNA folds, J. Mol. Biol., 293, 271-281 (1999)
[47] Vitreschak, A. G.; Rodionov, D. A.; Mironov, A. A.; Gelfand, M. S., Riboswitches: the oldest mechanism for the regulation of gene expression, Trends Genet., 20, 44-50 (2004)
[48] Wang, Y.; Manzour, A.; Shareghi, P.; Shaw, T. I.; Li, Y. W.; Malmberg, R. L.; Cai, L., Stable stem enabled Shannon entropies distinguish non-coding RNAs from random backgrounds, BMC Bioinf., 13, 5, S1 (2012)
[49] Westhof, E.; Masquida, B.; Jossinet, F., Predicting and modeling RNA architecture, CSH Perspect., 3 (2011)
[50] Yockey, H. P., Information Theory, Evolution, and the Origin of Life (2005), Cambridge University Press · Zbl 1160.92331
[51] Zuker, M., Mfold web server for nucleic acid folding and hybridization prediction, Nucleic Acids Res., 31, 3406-3415 (2003)
[52] Zuker, M.; Stiegler, P., Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information, Nucleic Acids Res., 9, 133-148 (1981)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.