A model for sequential evolution of ligands by exponential enrichment (SELEX) data. (English) Zbl 1254.92025

Summary: A Systematic Evolution of Ligands by EXponential enrichment (SELEX) experiment begins in round one with a random pool of oligonucleotides in an equilibrium solution with a target. Over a few rounds, oligonucleotides having a high affinity for the target are selected. Data from a high-throughput SELEX experiment consist of lists of thousands of oligonucleotides sampled after each round. Thus far, SELEX experiments have been very good at suggesting the highest-affinity oligonucleotides, but modeling lower-affinity recognition site variants has been difficult. Furthermore, an alignment step has always been used prior to analyzing SELEX data. We present a novel model, based on a biochemical parametrization of SELEX, which allows us to use data from all rounds to estimate the affinities of the oligonucleotides. Most notably, our model also aligns the oligonucleotides. We use our model to analyze a SELEX experiment containing double-stranded DNA oligonucleotides and the transcription factor Bicoid as the target. Our SELEX model outperformed other published methods for predicting putative binding sites for Bicoid, as indicated by the results of an in vivo ChIP-chip experiment.


92C40 Biochemistry, molecular biology
62P10 Applications of statistics to biology and medical sciences; meta analysis
65C20 Probabilistic models, generic numerical methods in probability and statistics



