SHARCGS
swMATH ID: 29573
Software Authors: Dohm, J. C.; Lottaz, C.; Borodina, T.
Description: SHARCGS, a fast and highly accurate short-read assembly algorithm for de novo genomic sequencing. The latest revolution in the DNA sequencing field has been brought about by the development of automated sequencers that are capable of generating giga base pair data sets quickly and at low cost. Applications of such technologies seem to be limited to resequencing and transcript discovery, due to the shortness of the generated reads. In order to extend the fields of application to de novo sequencing, we developed the SHARCGS algorithm to assemble short-read (25–40-mer) data with high accuracy and speed. The efficiency of SHARCGS was tested on BAC inserts from three eukaryotic species, on two yeast chromosomes, and on two bacterial genomes (Haemophilus influenzae, Escherichia coli). We show that 30-mer-based BAC assemblies have N50 sizes >20 kbp for Drosophila and Arabidopsis and >4 kbp for human in simulations taking missing reads and wrong base calls into account. We assembled 949,974 contigs with length >50 bp, and only one single contig could not be aligned error-free against the reference sequences. We generated 36-mer reads for the genome of Helicobacter acinonychis on the Illumina 1G sequencing instrument and assembled 937 contigs covering 98
Homepage: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2045152/
Related Software: Velvet; ALLPATHS; SSAKE; ARACHNE; BLAT; QSRA; ABySS; IDBA; PE-Assembler; IDBA-UD; SOAPdenovo; SPAdes; GAGE; SpliceTrap; DWE; CLIPZ; mirTools; miRExpress; TargetSpy; miRNAkey
Cited in: 2 Publications
Cited by 6 Authors
1 Aransay, Ana M.
1 Hackenberg, Michael
1 Jean, Géraldine
1 Radulescu, Andreea
1 Rodríguez-Ezpeleta, Naiara
1 Rusu, Irena
Cited in 0 Serials
Cited in 3 Fields
2 Biology and other natural sciences (92-XX)
1 General and overarching topics; collections (00-XX)
1 Computer science (68-XX)
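
Note on the N50 sizes quoted in the description: N50 is commonly defined as the largest contig length L such that contigs of length L or longer together cover at least half of the total assembled length. The following is a minimal sketch of that common definition, not code from SHARCGS; the function name `n50` and the example lengths are hypothetical, and the original authors may have computed the statistic with slightly different conventions.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Assumes the common definition: walk the contigs from longest to
    shortest and return the length at which the running total first
    reaches half of the summed assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("N50 is undefined for an empty assembly")

    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths, for illustration only.
example = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example))
```

With this definition, a claim such as "N50 > 20 kbp" means that at least half of the assembled bases lie in contigs longer than 20 kbp.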