Recent advances in the computational discovery of transcription factor binding sites. (English) Zbl 06920448
Summary: The discovery of gene regulatory elements requires synergy between computational and experimental techniques to reveal the regulatory mechanisms that drive gene expression in response to external cues and signals. Drawing on the large and steadily growing body of high-throughput experimental data, researchers have attempted to decipher the patterns hidden in genomic sequences. These patterns, called motifs, are potential binding sites for transcription factors, which are hypothesized to be the main regulators of the transcription process. Precise detection of these elements is therefore required, and a large number of computational approaches have been developed to support the de novo identification of transcription factor binding sites (TFBSs). Although novel approaches are continuously proposed and almost all have reported some success in yeast and other lower organisms, the problem remains a challenge in higher organisms. In this paper, we therefore review recent developments in computational methods for TFBS prediction. We start with a brief review of the basic approaches to binding site representation and promoter identification, then discuss techniques to locate physical TFBSs, to identify functional binding sites using orthologous information, and to infer functional TFBSs within a context defined by additional prior knowledge. Finally, we briefly explore opportunities for extending these approaches towards the computational identification of transcriptional regulatory networks.
00 General and overarching topics; collections