FragGeneScan

swMATH ID: 17159

Software Authors: Rho M, Tang H, Ye Y

Description: FragGeneScan: predicting genes in short and error-prone reads. The advances of next-generation sequencing technology have facilitated metagenomics research that attempts to determine directly the whole collection of genetic material within an environmental sample (i.e. the metagenome). Identification of genes directly from short reads has become an important yet challenging problem in annotating metagenomes, since the assembly of metagenomes is often not available. Gene predictors developed for whole genomes (e.g. Glimmer) and recently developed for metagenomic sequences (e.g. MetaGene) show a significant decrease in performance as the sequencing error rates increase, or as reads get shorter. We have developed a novel gene prediction method FragGeneScan, which combines sequencing error models and codon usages in a hidden Markov model to improve the prediction of protein-coding region in short reads. The performance of FragGeneScan was comparable to Glimmer and MetaGene for complete genomes. But for short reads, FragGeneScan consistently outperformed MetaGene (accuracy improved ∼62

Homepage: http://nar.oxfordjournals.org/content/38/20/e191.short

Related Software: ClustalW; SPAdes; JBROWSE; InterProScan; MAKER; EasyGene; PCAP; BioMart; AVID; BLAT; MUSCLE; DIALIGN; MAFFT; Rfam; R; RAST; TopHat; GTDB-Tk; CheckM; MetaWRAP

Cited in: 5 Publications

Cited by 7 Authors
1 Axelson-Fisk, Marina
1 Chen, Bo
1 Ji, Ping
1 Keith, Jonathan M.
1 Lin, Yu
1 Mallawaarachchi, Vijini
1 Picardi, Ernesto

Cited in 3 Serials
2 Methods in Molecular Biology
1 Computational Biology
1 Journal of Theoretical Biology

Cited in 4 Fields
5 Biology and other natural sciences (92-XX)
2 General and overarching topics; collections (00-XX)
1 Probability theory and stochastic processes (60-XX)
1 Operations research, mathematical programming (90-XX)