swMATH ID: 12005
Software Authors: H. Huang, S. Tata, R. J. Prill
Description: BlueSNP: R package for highly scalable genome-wide association studies using Hadoop clusters. Computational workloads for genome-wide association studies (GWAS) are growing in scale and complexity, outpacing the capabilities of single-threaded software designed for personal computers. The BlueSNP R package implements GWAS statistical tests in the R programming language and executes the calculations across computer clusters configured with Apache Hadoop, a de facto standard framework for distributed data processing using the MapReduce formalism. BlueSNP makes computationally intensive analyses, such as estimating empirical p-values via data permutation and searching for expression quantitative trait loci over thousands of genes, feasible for large genotype-phenotype datasets. Availability and implementation: http://github.com/ibm-bioinformatics/bluesnp
Homepage: http://www.ncbi.nlm.nih.gov/pubmed/23202745
Dependencies: R
Related Software: Crossbow; Probalign; k-means++; Jnomics; T-coffee; Kalign; MUSCLE; CloudAligner; CloudBurst; Seal; FX; Eoulsan; Hadoop-BAM; SeqWare; GATK; MrsRF; SOAP3; ClustalW; DIALIGN; ProbCons; CloudBLAST
