A complexity and approximation framework for the maximization scaffolding problem.(English)Zbl 1328.68085

Summary: We explore in this paper some complexity issues inspired by the contig scaffolding problem in bioinformatics. We focus on the following problem: given an undirected graph with no loop, and a perfect matching on this graph, find a set of cycles and paths covering every vertex of the graph, with edges alternatively in the matching and outside the matching, and satisfying a given constraint on the numbers of cycles and paths. We show that this problem is $$\mathcal{NP}$$-complete, even in planar bipartite graphs. Moreover, we show that there is no subexponential-time algorithm for several related problems unless the exponential-time hypothesis fails. Lastly, we also design two polynomial-time approximation algorithms for complete graphs.

MSC:

 68Q25 Analysis of algorithms and problem complexity 05C85 Graph algorithms (graph-theoretic aspects) 68Q17 Computational difficulty of problems (lower bounds, completeness, difficulty of approximation, etc.) 68W25 Approximation algorithms

Software:

Ragout; SOPRA; GRASS; SCARPA
Full Text:

References:

 [1] Burton, J. N.; Adey, A.; Patwardhan, R. P.; Qiu, R.; Kitzman, J. O.; Shendure, J., Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions, Nat. Biotechnol., 1119-1125, (November 2013) [2] Chateau, A.; Giroudeau, R., Complexity and polynomial-time approximation algorithms around the scaffolding problem, (Horia Dediu, A.; Martín-Vide, C.; Truthe, B., AlCoB, Lecture Notes in Computer Science, vol. 8542, (2014), Springer), 47-58 [3] Chateau, A.; Giroudeau, R., On the implementation of polynomial-time approximation algorithms for scaffold problems, IEEE Trans. Comput. Biol. Bioinform., (2015), Special issue of ALCOB 2014, submitted for publication · Zbl 1328.68085 [4] Chauve, C.; Patterson, M.; Rajaraman, A., Hypergraph covering problems motivated by genome assembly questions, (IWOCA 2013, (2013)), 428-432 · Zbl 1408.05094 [5] Chiba, S.; Fujita, S., Covering vertices by a specified number of disjoint cycles, edges and isolated vertices, Discrete Math., 313, 3, 269-277, (2013) · Zbl 1257.05126 [6] Dayarian, A.; Michael, T. P.; Sengupta, A. M., SOPRA: scaffolding algorithm for paired reads via statistical optimization, BMC Bioinformatics, 11, 345, (2010) [7] Donmez, N.; Brudno, M., SCARPA: scaffolding reads with practical algorithms, Bioinformatics, 29, 4, 428-434, (2013) [8] Gabow, H. N., An efficient implementation of edmonds’ algorithm for maximum matching on graphs, J. ACM, 23, 2, 221-234, (1976) · Zbl 0327.05121 [9] Gao, S.; Sung, W.; Nagarajan, N., Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences, J. Comput. Biol., 18, 11, 1681-1691, (2011) [10] Garey, M. R.; Johnson, D. S., Computers and intractability: A guide to the theory of NP-completeness, (1979), W. H. Freeman & Co. New York, NY, USA · Zbl 0411.68039 [11] Garey, M. R.; Johnson, D. S.; Stockmeyer, L., Some simplified NP-complete problems, (Proceedings of the Sixth Annual ACM Symposium on Theory of Computing, STOC ’74, (1974), ACM New York, NY, USA), 47-63 [12] Gritsenko, A. A.; Nijkamp, J. F.; Reinders, M. J.T.; de Ridder, D., GRASS: a generic algorithm for scaffolding next-generation sequencing assemblies, Bioinformatics, 1429-1437, (2012) [13] Husemann, P.; Stoye, J., Phylogenetic comparative assembly, Algorithms Mol. Biol., 5, 3, (2010) [14] Huson, D. H.; Reinert, K.; Myers, E. W., The greedy path-merging algorithm for contig scaffolding, J. ACM, 49, 5, 603-615, (September 2002) · Zbl 1326.92052 [15] Impagliazzo, R.; Paturi, R., On the complexity of k-SAT, J. Comput. System Sci., 62, 2, 367-375, (2001) · Zbl 0990.68079 [16] Impagliazzo, R.; Paturi, R.; Zane, F., Which problems have strongly exponential complexity?, J. Comput. System Sci., 63, 4, 512-530, (2001) · Zbl 1006.68052 [17] Kolmogorov, M.; Raney, B. J.; Paten, B.; Pham, S. K., Ragout - a reference-assisted assembly tool for bacterial genomes, Bioinformatics, 30, 12, 302-309, (2014) [18] Krivelevich, M.; Nutov, Z.; Salavatipour, M. R.; Yuster, J. V.; Yuster, R., Approximation algorithms and hardness results for cycle packing problems, ACM Trans. Algorithms, 3, 4, (November 2007) [19] Lenstra, J. K.; Shmoys, D. B., Computing near-optimal schedules, (Scheduling Theory and Its Applications, (1995), John Wiley & Sons), Chapter 1 [20] Lokshtanov, D.; Marx, D.; Saurabh, S., Lower bounds based on the exponential time hypothesis, Bull. Eur. Assoc. Theor. Comput. Sci. EATCS, 105, 41-72, (2011) · Zbl 1258.68068 [21] Moran, S.; Newman, I.; Wolfstahl, Y., Approximation algorithms for covering a graph by vertex-disjoint paths of maximum total weight, Networks, 20, 1, 55-64, (1990) · Zbl 0689.90072 [22] Shmoys, D. B.; Lenstra, J. K.; Rinnooy Kan, A. H.G.; Lawler, E. L., The traveling salesman problem: A guided tour of combinatorial optimization, (1985), John Wiley & Sons · Zbl 0562.00014 [23] Steiner, G., On the k-path partition of graphs, Theoret. Comput. Sci., 290, 2147-2155, (2003) · Zbl 1044.68134 [24] Tutte, W. T., A short proof of the factor theorem for finite graphs, Canad. J. Math., 6, 347-352, (1954) · Zbl 0055.17102 [25] van Rooij, J. M.M.; van Kooten Niekerk, M. E.; Bodlaender, H. L., Partition into triangles on bounded degree graphs, Theory Comput. Syst., 52, 4, 687-718, (2013) · Zbl 1286.68214 [26] Vazirani, V. V., Approximation algorithms, (2001), Springer-Verlag New York, Inc. New York, NY, USA · Zbl 1138.90417 [27] Woeginger, G. J., Exact algorithms for NP-hard problems: a survey, (Combinatorial Optimization, (2001)), 185-208
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.