Simultaneous rank tests for detecting differentially expressed genes. (English) Zbl 1510.62440

Summary: Rank tests are known to be robust to outliers and to violations of distributional assumptions. Two major issues besetting microarray data are violation of the normality assumption and contamination by outliers. In this article, we formulate the normal theory simultaneous tests and their aligned rank transformation (ART) analog for detecting differentially expressed genes. These tests are based on the least-squares estimates of the effects when data follow a linear model. Application of the two methods is then demonstrated on a real data set. To compare the performance of the aligned rank transform method with that of the corresponding normal theory method, data were simulated according to the characteristics of a real gene expression data set. These simulated data are then used to compare the two methods with respect to their sensitivity to the distributional assumption and to outliers, in terms of control of the family-wise Type I error rate, power, and false discovery rate. It is demonstrated that the ART generally possesses the robustness of validity property even for microarray data with a small number of replications. Although these methods can be applied to more general designs, in this article the simulation study is carried out for a dye-swap design since this design is broadly used in cDNA microarray experiments.
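
The aligned rank transformation idea mentioned in the summary can be illustrated with a minimal sketch for a single gene. This is not the authors' procedure: the function names (`aligned_ranks`, `ls_effect`), the toy data, the +1/-1 coding, and the choice of intercept and dye as nuisance effects with treatment as the effect of interest are all assumptions made for illustration. The sketch also assumes continuous data with no ties. It aligns the observations by subtracting least-squares estimates of the nuisance effects, ranks the aligned values, and then computes the least-squares treatment effect on the ranks.

```python
import numpy as np

def aligned_ranks(y, nuisance_design):
    """Subtract least-squares nuisance effects from y, then rank the residuals.

    Assumes no ties among the aligned values (hypothetical helper).
    """
    y = np.asarray(y, dtype=float)
    coef, *_ = np.linalg.lstsq(nuisance_design, y, rcond=None)
    aligned = y - nuisance_design @ coef
    order = np.argsort(aligned, kind="stable")
    ranks = np.empty(len(y), dtype=float)
    ranks[order] = np.arange(1, len(y) + 1)  # rank 1 = smallest aligned value
    return ranks

def ls_effect(response, full_design, column):
    """Least-squares estimate of one coefficient in the full linear model."""
    coef, *_ = np.linalg.lstsq(full_design, response, rcond=None)
    return coef[column]

# Toy single-gene data (made up): dye and treatment coded as +1 / -1.
dye = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
trt = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
noise = np.array([0.1, -0.2, 0.05, 0.0, -0.1, 0.15, -0.05, 0.2])
y = 5.0 + 0.5 * dye + 2.0 * trt + noise

ones = np.ones_like(y)
nuisance = np.column_stack([ones, dye])   # effects removed by alignment
full = np.column_stack([ones, dye, trt])  # model fitted to the ranks

ranks = aligned_ranks(y, nuisance)
effect = ls_effect(ranks, full, column=2)  # treatment effect on the rank scale
print(ranks, effect)
```

In a simultaneous-testing setting, a statistic of this kind would be computed for every gene and compared against a critical value chosen to control the family-wise error rate; that step is omitted here.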

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62G09 Nonparametric statistical resampling methods
62G10 Nonparametric hypothesis testing
62G35 Nonparametric robustness
62J15 Paired and multiple comparisons; multiple testing

Software:

bootstrap; sma
