STAR
swMATH ID: 2515
Software Authors: Alexander Dobin, Carrie A. Davis, Felix Schlesinger, Jorg Drenkow, Chris Zaleski, Sonali Jha, Philippe Batut, Mark Chaisson, Thomas R. Gingeras
Description: STAR: ultrafast universal RNA-seq aligner.
Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases.
Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80–90
Homepage: http://bioinformatics.oxfordjournals.org/content/29/1/15.short
Related Software: R; TopHat; edgeR; Bioconductor; Soap; FastQC; KEGG; Galaxy; HTSeq; HISAT; GATK; FactoMineR; MapSplice; RSEM; Samtools; Salmon; Bowtie 2; systemPipeR; Bismark; Gviz
Cited in: 13 Publications

Cited by 42 Authors
1 Acuña, Vicente
1 Akalin, Altuna
1 Carugo, Oliviero
1 Davis, Sean L.
1 Eisenhaber, Frank
1 Faisal, Shahla
1 Feng, Ziding
1 Fernandes, Francisco E. jun.
1 Francisco, Alexandre P.
1 Franke, Verdan
1 Fu, Rong
1 Gazdar, Adi F.
1 Grossi, Roberto
1 Hanash, Samir M.
1 Italiano, Giuseppe Francesco
1 Kannan, Sreeram
1 Lin, Jimmy
1 Ma, Weiping
1 Mao, Shunfu
1 Mathé, Ewy
1 Mohajer, Soheil
1 Picardi, Ernesto
1 Ramachandran, Kannan
1 Rizzi, Romeo
1 Ronen, Jonathan
1 Sacomoto, Gustavo
1 Sagot, Marie-France
1 Shen, Carol
1 Shen, Tony
1 Shomroni, Orr
1 Sinaimeri, Blerina
1 Taguchi, Ayumu
1 Teixeira, Andreia Sofia
1 Tse, David N. C.
1 Tutz, Gerhard E.
1 Uyar, Bora
1 Wang, Pei
1 Wolff, Alexander
1 Wong, Chee-Hong
1 Zhang, Lingxiang
1 Zhong, Hua
1 Zhou, Qinghua

Cited in 7 Serials
3 Methods in Molecular Biology
1 Biometrics
1 Algorithmica
1 International Journal of Foundations of Computer Science
1 Statistical Papers
1 Statistical Applications in Genetics and Molecular Biology
1 Chapman & Hall/CRC Computational Biology Series

Cited in 7 Fields
12 Biology and other natural sciences (92-XX)
6 Statistics (62-XX)
3 Computer science (68-XX)
1 General and overarching topics; collections (00-XX)
1 Combinatorics (05-XX)
1 Numerical analysis (65-XX)
1 Game theory, economics, finance, and other social and behavioral sciences (91-XX)