Frozen robust multiarray analysis (fRMA). (English) Zbl 1437.62556

Summary: Robust multiarray analysis (RMA) is the most widely used preprocessing algorithm for Affymetrix and Nimblegen gene expression microarrays. RMA performs background correction, normalization, and summarization in a modular way. The last two steps require multiple arrays to be analyzed simultaneously. This ability to borrow information across samples gives RMA several advantages. For example, the summarization step fits a parametric model that accounts for probe effects, which are assumed to be fixed across arrays, and this improves outlier detection. Residuals obtained from the fitted model permit the creation of useful quality metrics. However, the dependence on multiple arrays has two drawbacks: (1) RMA cannot be used in clinical settings where samples must be processed individually or in small batches, and (2) data sets preprocessed separately are not comparable. We propose a preprocessing algorithm, frozen RMA (fRMA), which allows one to analyze microarrays individually or in small batches and then combine the data for analysis. This is accomplished by using information from the large publicly available microarray databases. In particular, estimates of probe-specific effects and variances are precomputed and frozen. Then, for new data sets, these frozen estimates are used together with information from the new arrays to normalize and summarize the data. We find that fRMA is comparable to RMA when the data are analyzed as a single batch and outperforms RMA when analyzing multiple batches. The methods described here are implemented in the R package fRMA and are currently available for download from the software section of http://rafalab.jhsph.edu.
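
The idea of freezing parameters so that a single array can be processed on its own can be illustrated with a minimal Python sketch. It is not the fRMA package's implementation. It assumes that normalization maps each array onto a frozen reference distribution by rank (a quantile-style normalization, which the summary does not specify), and it replaces the variance-weighted robust summarization with a plain median. All names and numbers (`reference_sorted`, `probe_effects`, `probe_index`, the gene labels) are hypothetical values chosen for illustration.

```python
from statistics import median

# Hypothetical frozen parameters, standing in for estimates precomputed
# from a large public database. Values are invented for illustration.
reference_sorted = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0]   # frozen reference distribution (log2 scale)
probe_effects = {"geneA": [0.5, -0.5, 0.0],           # frozen probe-specific effects
                 "geneB": [1.0, 0.0, -1.0]}
probe_index = {"geneA": [0, 1, 2],                    # which probes belong to which probe set
               "geneB": [3, 4, 5]}


def normalize_to_reference(values, reference):
    """Replace each value by the reference value of the same rank."""
    if len(values) != len(reference):
        raise ValueError("array and reference must have the same length")
    order = sorted(range(len(values)), key=lambda i: values[i])
    normalized = [0.0] * len(values)
    for rank, idx in enumerate(order):
        normalized[idx] = reference[rank]
    return normalized


def summarize_probeset(log_intensities, frozen_effects):
    """Remove frozen probe effects, then take the median as a robust summary.

    Assumption: the median is a simplified stand-in for a robust,
    variance-weighted estimate.
    """
    corrected = [y - phi for y, phi in zip(log_intensities, frozen_effects)]
    return median(corrected)


def preprocess_single_array(log_intensities):
    """Normalize and summarize one array using only frozen parameters."""
    normalized = normalize_to_reference(log_intensities, reference_sorted)
    return {
        gene: summarize_probeset([normalized[i] for i in idx], probe_effects[gene])
        for gene, idx in probe_index.items()
    }


# One new array, processed without reference to any other new sample.
new_array = [7.2, 6.1, 6.9, 10.4, 9.3, 8.8]
expression = preprocess_single_array(new_array)
print(expression)
```

Because every parameter the function needs is fixed in advance, two arrays processed at different times end up on the same scale, which is the property the summary credits for making separately preprocessed batches comparable.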

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
