
PanCake: a data structure for pangenomes. (English) Zbl 1281.92047

Beißbarth, Tim (ed.) et al., German conference on bioinformatics 2013, GCB’13, Göttingen, Germany, September 10–13, 2013. Selected papers based on the presentations at the conference. Wadern: Schloss Dagstuhl – Leibniz Zentrum für Informatik (ISBN 978-3-939897-59-0). OASIcs – OpenAccess Series in Informatics 34, 35-45, electronic only (2013).
Summary: We present a pangenome data structure (“PanCake”) for sets of related genomes, based on bundling similar sequence regions into shared features, which are derived from genome-wide pairwise sequence alignments. We discuss the design of the data structure, basic operations on it, and methods to predict core genomes and singleton regions. In contrast to many other pangenome analysis tools, such as EDGAR or PGAT, PanCake is independent of gene annotations. Nevertheless, a comparison of the identified core and singleton regions shows good agreement. The PanCake data structure requires significantly less space than the sum of the individual sequence files.
For the entire collection see [Zbl 1279.92004].
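
The idea in the summary, bundling aligned regions into shared features and reading core and singleton regions off them, can be illustrated with a minimal sketch. This is not PanCake's implementation or API: the function names, the interval representation `(genome, start, end)` with half-open coordinates, and the use of union-find are all assumptions made for illustration, and only identical intervals are merged (overlapping intervals are not split or refined).

```python
from collections import defaultdict


def bundle_features(alignments):
    """Group aligned intervals into shared features (hypothetical helper).

    alignments: iterable of pairs of intervals, each interval being a
    (genome, start, end) tuple with half-open coordinates.
    Intervals linked directly or transitively end up in one feature.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in alignments:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    features = defaultdict(set)
    for interval in list(parent):
        features[find(interval)].add(interval)
    return list(features.values())


def core_features(features, genome_names):
    """Features that contain at least one interval from every genome."""
    wanted = set(genome_names)
    return [f for f in features if {g for g, _, _ in f} == wanted]


def singleton_regions(features, genome_lengths):
    """Per genome, the stretches not covered by any shared feature."""
    covered = defaultdict(list)
    for feature in features:
        for genome, start, end in feature:
            covered[genome].append((start, end))

    result = {}
    for genome, length in genome_lengths.items():
        gaps, pos = [], 0
        for start, end in sorted(covered[genome]):
            if start > pos:
                gaps.append((pos, start))
            pos = max(pos, end)
        if pos < length:
            gaps.append((pos, length))
        result[genome] = gaps
    return result
```

In this sketch no gene annotation is consulted: core and singleton regions follow purely from which genomes share an aligned interval, which mirrors the annotation-independent approach the summary describes.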

MSC:

92D10 Genetics and epigenetics
68P05 Data structures

Software:

PanCake; PGAT; EDGAR