The potential of family-free genome comparison. (English) Zbl 1462.92037

Chauve, Cedric (ed.) et al., Models and algorithms for genome evolution. Selected contributions based on the presentations at the MAGE conference, Montréal, Canada, August 23–26, 2013. London: Springer. Comput. Biol. 19, 287-307 (2013).
Summary: Many methods in computational comparative genomics require gene family assignments as a prerequisite. While the biological concept of gene families is well established, their computational prediction remains unreliable. This paper continues a new line of research in which family assignments are not presumed. We study the potential of several family-free approaches in detecting conserved structures, genome rearrangements and in reconstructing ancestral gene orders.
For the entire collection see [Zbl 1274.92002].


92D10 Genetics and epigenetics
Full Text: DOI Link


[1] Angibaud, S., Fertin, G., Rusu, I., Thévenin, A., Vialette, S.: Efficient tools for computing the number of breakpoints and the number of adjacencies between two genomes with duplicate genes. J. Comput. Biol. 15(8), 1093-1115 (2008)
[2] Angibaud, S., Fertin, G., Rusu, I., Thévenin, A., Vialette, S.: On the approximability of comparing genomes with duplicates. J. Graph Algorithms Appl. 13(1), 19-53 (2009) · Zbl 1170.68049
[3] Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., Harris, M.A., Hill, D.P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J.C., Richardson, J.E., Ringwald, M., Rubin, G.M., Sherlock, G.: Gene ontology: tool for the unification of biology. the gene ontology consortium. Nat. Genet. 25(1), 25-29 (2000)
[4] Bergeron, A., Stoye, J.: On the similarity of sets of permutations and its applications to genome comparison. J. Comput. Biol. 13(7), 1340-1354 (2006)
[5] Bergeron, A., Corteel, S., Raffinot, M.: The algorithmic of gene teams. In: Proceedings of WABI 2002. LNCS, vol. 2452, pp. 464-476 (2002) · Zbl 1016.68618
[6] Bergeron, A., Mixtacki, J., Stoye, J.: On sorting by translocations. J. Comput. Biol. 13(2), 567-578 (2006) · Zbl 1119.92312
[7] Bergeron, A., Mixtacki, J., Stoye, J.: A unifying view of genome rearrangements. In: Proceedings of WABI 2006. LNBI, vol. 4175, pp. 163-173 (2006)
[8] Bernt, M., Merkle, D., Middendorf, M.: Solving the preserving reversal median problem. IEEE/ACM Trans. Comput. Biol. Bioinform. 5, 332-347 (2008)
[9] Blin, G., Chauve, C., Fertin, G.: The breakpoint distance for signed sequences. In: Proceedings of CompBioNets 2004. Texts in Algorithmics, vol. 3, pp. 3-16 (2004)
[10] Blin, G., Chateau, A., Chauve, C., Gingras, Y.: Inferring positional homologs with common intervals of sequences. In: Proceedings of RECOMB-CG 2006, pp. 24-38. Springer, Berlin (2006)
[11] Blin, G., Chauve, C., Fertin, G., Rizzi, R., Vialette, S.: Comparing genomes with duplications: a computational complexity point of view. IEEE/ACM Trans. Comput. Biol. Bioinform. 4(4), 523-534 (2007)
[12] Böcker, S., Jahn, K., Mixtacki, J., Stoye, J.: Computation of median gene clusters. J. Comput. Biol. 16(8), 1085-1099 (2009)
[13] Bourque, G., Pevzner, P.A.: Genome-scale evolution: reconstructing gene orders in the ancestral species. Genome Res. 12(1), 26-36 (2002)
[14] Braga, M.D.V., Willing, E., Stoye, J.: Double cut and join with insertions and deletions. J. Comput. Biol. 18(9), 1167-1184 (2011)
[15] Caprara, A.: The reversal median problem. INFORMS J. Comput. 15(1), 93-113 (2003) · Zbl 1238.90099
[16] Chauve, C., Tannier, E.: A methodological framework for the reconstruction of contiguous regions of ancestral genomes and its application to mammalian genomes. PLoS Comput. Biol. 4(11), e1000234 (2008)
[17] Chauve, C., El-Mabrouk, N., Guéguen, L., Semeria, M., Tannier, E.: Duplication, rearrangement and reconciliation: a follow-up 13 years later. In: Chauve, C. et al. (eds.) Models and Algorithms for Genome Evolution. Computational Biology, vol. 19. Springer, Berlin (2013). In this volume · Zbl 1274.92002
[18] Csurös, M.: Count: evolutionary analysis of phylogenetic profiles with parsimony and likelihood. Bioinformatics 26(15), 1910-1912 (2010)
[19] Darling, A.E., Mau, B., Perna, N.T.: ProgressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS ONE 5(6), e11147 (2010)
[20] Dewey, C.N.: Positional orthology: putting genomic evolutionary relationships into context. Brief. Bioinform. 12(5), 401-412 (2011)
[21] Didier, G., Schmidt, T., Stoye, J., Tsur, D.: Character sets of strings. J. Discrete Algorithms 5(2), 330-340 (2007) · Zbl 1125.68134
[22] Doerr, D., Thévenin, A., Stoye, J.: Gene family assignment-free comparative genomics. BMC Bioinform. 13(Suppl 19), S3 (2012)
[23] Durand, D., Sankoff, D.: Tests for gene clustering. J. Comput. Biol. 10, 453-482 (2003)
[24] Earnest-DeYoung, J.V., Lerat, E., Moret, B.M.E.: Reversing gene erosion—reconstructing ancestral bacterial genomes from gene-content and order data. In: Proceedings of WABI 2004. LNCS, vol. 3240, pp. 1-13 (2004)
[25] El-Mabrouk, N.: Sorting signed permutations by reversals and insertions/deletions of contiguous segments. J. Discrete Algorithms 1(1), 105-122 (2001)
[26] Feijão, P., Meidanis, J.: SCJ: a breakpoint-like distance that simplifies several rearrangement problems. IEEE/ACM Trans. Comput. Biol. Bioinform. 8(5), 1318-1329 (2011)
[27] Fertin, G., Labarre, A., Rusu, I., Tannier, E., Vialette, S.: Combinatorics of Genome Rearrangements. MIT Press, Cambridge (2009) · Zbl 1170.92022
[28] Frech, C., Chen, N.: Genome-wide comparative gene family classification. PLoS ONE 5(10), e13409 (2010)
[29] Fu, Z., Chen, X., Vacic, V., Nan, P., Zhong, Y., Jiang, T.: MSOAR: a high-throughput ortholog assignment system based on genome rearrangement. J. Comput. Biol. 14(9), 1160-1175 (2007)
[30] Hannenhalli, S., Pevzner, P.A.: Transforming cabbage into turnip: polynomial algorithm for sorting signed permutations by reversals. J. ACM 46(1), 1-27 (1999) · Zbl 1064.92510
[31] He, X., Goldwasser, M.H.: Identifying conserved gene clusters in the presence of homology families. J. Comput. Biol. 12(6), 638-656 (2005)
[32] Heber, S., Stoye, J.: Algorithms for finding gene clusters. In: Proceedings of WABI 2001. LNCS, vol. 2149, pp. 252-263 (2001) · Zbl 1129.92305
[33] Heber, S., Mayr, R., Stoye, J.: Common intervals of multiple permutations. Algorithmica 60(2), 175-206 (2011) · Zbl 1215.68163
[34] Jahn, K.: Efficient computation of approximate gene clusters based on reference occurrences. J. Comput. Biol. 18(9), 1255-1274 (2011)
[35] Kuhn, H.W.: The Hungarian method for the assignment problem. Nav. Res. Logist. Q. 2(1-2), 83-97 (2006)
[36] Li, L., Stoeckert, C.J., Roos, D.S.: OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13(9), 2178-2189 (2003)
[37] Ma, J., Ratan, A., Raney, B.J., Suh, B.B., Zhang, L., Miller, W., Haussler, D.: DUPCAR: reconstructing contiguous ancestral regions with duplications. J. Comput. Biol. 15(8), 1007-1027 (2008)
[38] Manuch, J., Patterson, M., Wittler, R., Chauve, C., Tannier, E.: Linearization of ancestral multichromosomal genomes. BMC Bioinform. 13(Suppl 19), S11 (2012)
[39] Milinkovitch, M.C., Helaers, R., Depiereux, E., Tzika, A.C., Gabaldon, T.: 2× genomes—depth does matter. Genome Biol. 11, R6 (2010)
[40] Ostlund, G., Schmitt, T., Forslund, K., Köstler, T., Messina, D.N., Roopra, S., Frings, O., Sonnhammer, E.L.L.: InParanoid 7: new algorithms and tools for eukaryotic orthology analysis. Nucleic Acids Res. 38(Database issue), D196-D203 (2010)
[41] Pe’er, I., Shamir, R.: The median problems for breakpoints are NP-complete. Electron. Colloq. Comput. Complex. 71, 5 (1998)
[42] Powell, S., Szklarczyk, D., Trachana, K., Roth, A., Kuhn, M., Muller, J., Arnold, R., Rattei, T., Letunic, I., Doerks, T., Jensen, L.J., von Mering, C., Bork, P.: eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges. Nucleic Acids Res. 40(Database issue), D284-D289 (2012)
[43] Rahmann, S., Klau, G.W.: Integer linear programs for discovering approximate gene clusters. In: Proceedings of WABI 2006. LNBI, vol. 4175, pp. 298-309 (2006)
[44] Sankoff, D.: Edit distances for genome comparisons based on non-local operations. In: Proceedings of CPM 1992. LNCS, vol. 644, pp. 121-135 (1992)
[45] Sankoff, D.: Genome rearrangement with gene families. Bioinformatics 15(11), 909-917 (1999)
[46] Sankoff, D., Blanchette, M.: The median problem for breakpoints in comparative genomics. In: Proceedings of COCOON 1997. LNCS, vol. 1276, pp. 251-263 (1997) · Zbl 0889.92023
[47] Sankoff, D., Blanchette, M.: Multiple genome rearrangement and breakpoint phylogeny. J. Comput. Biol. 5, 555-570 (1998)
[48] Sankoff, D., El-Mabrouk, N.: Duplication, rearrangement and reconciliation. In: Sankoff, D., Nadeau, J.H. (eds.) Comparative Genomics: Empirical and Analytical Approaches to Gene Order Dynamics, Map Alignment and the Evolution of Gene Families. Computational Biology Series, vol. 1, pp. 537-550. Kluwer Academic, Dordrecht (2000) · Zbl 1138.92362
[49] Sankoff, D., Cedergren, R., Abel, Y.: Genomic divergence through gene rearrangement. In: Doolittle, R.F. (ed.) Molecular Evolution: Computer Analysis of Protein and Nucleic Acid Sequences. Meth. Enzymol., vol. 183, Chap. 26, pp. 428-438. Academic Press, San Diego (1990)
[50] Schmidt, T., Stoye, J.: Quadratic time algorithms for finding common intervals in two and more sequences. In: Proceedings of CPM 2004. LNCS, vol. 3109, pp. 347-358 (2004) · Zbl 1104.92025
[51] Shi, G., Peng, M.C., Jiang, T.: MultiMSOAR 2.0: an accurate tool to identify ortholog groups among multiple genomes. PLoS ONE 6(6), e20892 (2011)
[52] Stoye, J., Wittler, R.: A unified approach for reconstructing ancient gene clusters. IEEE/ACM Trans. Comput. Biol. Bioinform. 6(3), 387-400 (2009)
[53] Tang, J., Moret, B.M., Cui, L., Depamphilis, C.W.: Phylogenetic reconstruction from arbitrary gene-order data. In: Proceedings of BIBE 2004, pp. 592-599. IEEE, New York (2004)
[54] Tannier, E., Zheng, C., Sankoff, D.: Multichromosomal median and halving problems under different genomic distances. BMC Bioinform. 10, 120 (2009)
[55] Tatusov, R.L., Fedorova, N.D., Jackson, J.D., Jacobs, A.R., Kiryutin, B., Koonin, E.V., Krylov, D.M., Mazumder, R., Mekhedov, S.L., Nikolskaya, A.N., Rao, B.S., Smirnov, S., Sverdlov, A.V., Vasudevan, S., Wolf, Y.I., Yin, J.J., Natale, D.A.: The COG database: an updated version includes eukaryotes. BMC Bioinform. 4, 41 (2003)
[56] Uno, T., Yagiura, M.: Fast algorithms to enumerate all common intervals of two permutations. Algorithmica 26(2), 290-309 (2000) · Zbl 0949.68168
[57] Wapinski, I., Pfeffer, A., Friedman, N., Regev, A.: Automatic genome-wide reconstruction of phylogenetic gene trees. Bioinformatics 23(13), i549-i558 (2007)
[58] Wapinski, I., Pfeffer, A., Friedman, N., Regev, A.: Natural history and evolutionary principles of gene duplication in fungi. Nature 449(7158), 54-61 (2007)
[59] Waterhouse, R.M., Zdobnov, E.M., Tegenfeldt, F., Li, J., Kriventseva, E.V.: OrthoDB: the hierarchical catalog of eukaryotic orthologs in 2011. Nucleic Acids Res. 39(Database issue), D283-D288 (2011)
[60] Watterson, G., Ewens, W.J., Hall, T., Morgan, A.: The chromosome inversion problem. J. Theor. Biol. 99(1), 1-7 (1982)
[61] Xu, A.W., Moret, B.M.E.: GASTS: parsimony scoring under rearrangements. In: Proceedings of WABI 2011. LNBI, vol. 6833, pp. 351-363 (2011)
[62] Xu, X., Sankoff, D.: Tests for gene clusters satisfying the generalized adjacency criterion. In: Proceedings of BSB 2008. LNBI, vol. 5167, pp. 152-160 (2008)
[63] Yancopoulos, S., Attie, O., Friedberg, R.: Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics 21(16), 3340-3346 (2005)
[64] Yang, Z., Sankoff, D.: Natural parameter values for generalized gene adjacency. In: Proceedings of RECOMB-CG 2009. LNBI, vol. 5817, pp. 13-23 (2009)
[65] Zhang, M., Leong, H.W.: Identifying positional homologs as bidirectional best hits of sequence and gene context similarity. In: Proceedings of ISB 2011, pp. 117-122. IEEE, New York (2011)
[66] Zhu, B.: Approximability and fixed-parameter tractability for the exemplar genomic distance problems. In: Proc. of Theory and Applications of Models of Computation. LNCS, vol. 5532, pp. 71-80 (2009) · Zbl 1241.68073
[67] Zhu, Q., Adam, Z., Choi, V., Sankoff, D.: Generalized gene adjacencies, graph bandwidth, and clusters in yeast evolution. IEEE/ACM Trans. Comput. Biol. Bioinform
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.