A multivariate empirical Bayes statistic for replicated microarray time course data. (English) Zbl 1106.62008

Summary: We derive one- and two-sample multivariate empirical Bayes statistics (the \(MB\)-statistics) to rank genes in order of interest from longitudinal replicated developmental microarray time course experiments. We first use conjugate priors to develop our one-sample multivariate empirical Bayes framework for the null hypothesis that the expected temporal profile stays at 0. This leads to our one-sample \(MB\)-statistic and a one-sample \(\widetilde T^2\)-statistic, a variant of the one-sample Hotelling \(T^2\)-statistic.
Both the \(MB\)-statistic and \(\widetilde T^2\)-statistic can be used to rank genes in the order of evidence of nonzero mean, incorporating the correlation structure across time points, moderation and replication. We also derive the corresponding \(MB\)-statistics and \(\widetilde T^2\)-statistics for the one-sample problem where the null hypothesis states that the expected temporal profile is constant, and for the two-sample problem where the null hypothesis is that two expected temporal profiles are the same.
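As a rough illustration of what "moderation" means for the one-sample \(\widetilde T^2\)-statistic, the sketch below computes a Hotelling-type statistic in which the gene-specific sample covariance is shrunk toward a common prior matrix. It assumes the moderated covariance is a weighted average of the two; the function name `moderated_t2` and the arguments `nu` (prior degrees of freedom) and `Lam` (prior covariance matrix) are hypothetical and are passed in directly here, whereas in an empirical Bayes analysis they would be estimated from the data on all genes.

```python
import numpy as np

def moderated_t2(X, nu, Lam):
    """Moderated one-sample Hotelling-type statistic for a single gene.

    X   : (n, k) array, n replicate time courses over k time points (n >= 2)
    nu  : assumed prior degrees of freedom (hypothetical input)
    Lam : assumed (k, k) prior covariance matrix (hypothetical input)
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    xbar = X.mean(axis=0)                        # mean temporal profile
    S = np.atleast_2d(np.cov(X, rowvar=False))   # sample covariance across time points
    # Shrink the gene-specific covariance toward the common prior matrix.
    S_tilde = ((n - 1) * S + nu * np.asarray(Lam, dtype=float)) / (n - 1 + nu)
    # Quadratic form n * xbar' S_tilde^{-1} xbar, testing "mean profile is 0".
    return float(n * xbar @ np.linalg.solve(S_tilde, xbar))

# Illustrative ranking on simulated data: 100 genes, 3 replicates, 5 time points.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 3, 5))
data[0] += 3.0                                   # gene 0 gets a nonzero mean profile
scores = np.array([moderated_t2(g, nu=5.0, Lam=np.eye(5)) for g in data])
ranking = np.argsort(-scores)                    # genes ordered by decreasing evidence
```

With `nu = 0` the statistic reduces to the ordinary one-sample Hotelling \(T^2\), which requires more replicates than time points; a positive `nu` keeps the moderated covariance invertible even when replicates are few, as is typical for microarray time courses.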

MSC:

62C12 Empirical decision procedures; empirical Bayes procedures
62P10 Applications of statistics to biology and medical sciences; meta analysis
92D10 Genetics and epigenetics
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)

Software:

sma

References:

[1] Aitchison, J. and Dunsmore, I. R. (1975). Statistical Prediction Analysis. Cambridge Univ. Press. · Zbl 0327.62043
[2] Baldi, P. and Long, A. D. (2001). A Bayesian framework for the analysis of microarray expression data: Regularized \(t\)-test and statistical inferences of gene changes. Bioinformatics 17 509–519.
[3] Bar-Joseph, Z., Gerber, G., Simon, I., Gifford, D. K. and Jaakkola, T. S. (2003). Comparing the continuous representation of time-series expression profiles to identify differentially expressed genes. Proc. Natl. Acad. Sci. USA 100 10,146–10,151. · Zbl 1130.62368 · doi:10.1073/pnas.1732547100
[4] Bickel, P. J. and Doksum, K. A. (2001). Mathematical Statistics: Basic Ideas and Selected Topics, 2nd ed. 1. Prentice Hall, Upper Saddle River, NJ. · Zbl 0403.62001
[5] Bolstad, B., Irizarry, R., Åstrand, M. and Speed, T. (2003). A comparison of normalization methods for high density oligonucleotide array data based on bias and variance. Bioinformatics 19 185–193.
[6] Broberg, P. (2003). Statistical methods for ranking differentially expressed genes. Genome Biology 4 R41.
[7] Cho, R., Campbell, M., Winzeler, E., Steinmetz, L., Conway, A., Wodicka, L., Wolfsberg, T., Gabrielian, A., Landsman, D., Lockhart, D. and Davis, R. (1998). A genome-wide transcriptional analysis of the mitotic cell cycle. Molecular Cell 2 65–73.
[8] Cho, R., Huang, M., Campbell, M., Dong, H., Steinmetz, L., Sapinoso, L., Hampton, G., Elledge, S., Davis, R. and Lockhart, D. (2001). Transcriptional regulation and function during the human cell cycle. Nature Genetics 27 48–54.
[9] Chu, S., DeRisi, J., Eisen, M., Mulholland, J., Botstein, D., Brown, P. O. and Herskowitz, I. (1998). The transcriptional program of sporulation in budding yeast. Science 282 699–705.
[10] Diggle, P. J. (1990). Time Series: A Biostatistical Introduction. Oxford Univ. Press, New York. · Zbl 0727.62083
[11] Diggle, P. J., Heagerty, P., Liang, K.-Y. and Zeger, S. L. (2002). Analysis of Longitudinal Data, 2nd ed. Oxford Univ. Press, New York. · Zbl 1031.62002
[12] Dudoit, S., Fridlyand, J. and Speed, T. (2002). Comparison of discrimination methods for the classification of tumors using gene expression data. J. Amer. Statist. Assoc. 97 77–87. · Zbl 1073.62576 · doi:10.1198/016214502753479248
[13] Dudoit, S., Yang, Y. H., Callow, M. and Speed, T. (2002). Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statist. Sinica 12 111–139. · Zbl 1004.62088
[14] Efron, B., Tibshirani, R., Storey, J. D. and Tusher, V. (2001). Empirical Bayes analysis of a microarray experiment. J. Amer. Statist. Assoc. 96 1151–1160. · Zbl 1073.62511 · doi:10.1198/016214501753382129
[15] Guo, X., Qi, H., Verfaillie, C. M. and Pan, W. (2003). Statistical significance analysis of longitudinal gene expression data. Bioinformatics 19 1628–1635.
[16] Gupta, A. and Nagar, D. (2000). Matrix Variate Distributions. Chapman and Hall/CRC, Boca Raton, FL. · Zbl 0935.62064
[17] Hong, F. and Li, H. (2006). Functional hierarchical models for identifying genes with different time-course expression profiles. Biometrics 62 534–544. · Zbl 1097.62127 · doi:10.1111/j.1541-0420.2005.00505.x
[18] Irizarry, R. A., Bolstad, B. M., Collin, F., Cope, L. M., Hobbs, B. and Speed, T. P. (2003). Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res. 31 e15.
[19] Kendziorski, C., Newton, M., Lan, H. and Gould, M. (2003). On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles. Statistics in Medicine 22 3899–3914.
[20] Lönnstedt, I. and Speed, T. P. (2002). Replicated microarray data. Statist. Sinica 12 31–46. · Zbl 1004.62086
[21] Mardia, K., Kent, J. and Bibby, J. (1979). Multivariate Analysis. Academic Press, New York. · Zbl 0432.62029
[22] Park, T., Yi, S.-G., Lee, S., Lee, S. Y., Yoo, D.-H., Ahn, J.-I. and Lee, Y.-S. (2003). Statistical tests for identifying differentially expressed genes in time-course microarray experiments. Bioinformatics 19 694–703.
[23] Reiner, A., Yekutieli, D. and Benjamini, Y. (2003). Identifying differentially expressed genes using false discovery rate controlling procedures. Bioinformatics 19 368–375.
[24] Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat. Appl. Genet. Mol. Biol. 3 article 3. · Zbl 1038.62110 · doi:10.2202/1544-6115.1027
[25] Spellman, P. T., Sherlock, G., Zhang, M. Q., Iyer, V. R., Anders, K., Eisen, M. B., Brown, P. O., Botstein, D. and Futcher, B. (1998). Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 9 3273–3297.
[26] Storch, K.-F., Lipan, O., Leykin, I., Viswanathan, N., Davis, F. C., Wong, W. H. and Weitz, C. J. (2002). Extensive and divergent circadian gene expression in liver and heart. Nature 417 78–83.
[27] Storey, J., Xiao, W., Leek, J. T., Tompkins, R. G. and Davis, R. W. (2005). Significance analysis of time course microarray experiments. Proc. Natl. Acad. Sci. USA 102 12,837–12,842.
[28] Tai, Y. C. (2005). Multivariate empirical Bayes models for replicated microarray time course data. Ph.D. dissertation, Div. Biostatistics, Univ. California, Berkeley.
[29] Tai, Y. C. and Speed, T. P. (2005). Statistical analysis of microarray time course data. In DNA Microarrays (U. Nuber, ed.) Chapter 20. Chapman and Hall/CRC, New York.
[30] Tai, Y. C. and Speed, T. P. (2005). Longitudinal microarray time course \(MB\)-statistic for multiple biological conditions. Dept. Statistics, Univ. California, Berkeley. In preparation.
[31] Tai, Y. C. and Speed, T. P. (2005). Cross-sectional microarray time course \(MB\)-statistic. Dept. Statistics, Univ. California, Berkeley. In preparation.
[32] Tamayo, P., Slonim, D., Mesirov, J., Zhu, Q., Kitareewan, S., Dmitrovsky, E., Lander, E. S. and Golub, T. R. (1999). Interpreting patterns of gene expression with self-organizing maps: Methods and application to hematopoietic differentiation. Proc. Natl. Acad. Sci. USA 96 2907–2912.
[33] Tusher, V. G., Tibshirani, R. and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. USA 98 5116–5121. · Zbl 1012.92014 · doi:10.1073/pnas.091062498
[34] Wen, X., Fuhrman, S., Michaels, G. S., Carr, D. B., Smith, S., Barker, J. L. and Somogyi, R. (1998). Large-scale temporal gene expression mapping of central nervous system development. Proc. Natl. Acad. Sci. USA 95 334–339.
[35] Wildermuth, M. C., Tai, Y. C., Dewdney, J., Denoux, C., Hather, G., Speed, T. P. and Ausubel, F. M. (2006). Application of \(\widetilde T^2\) statistic to temporal global Arabidopsis expression data reveals known and novel salicylate-impacted processes.
[36] Yuan, M. and Kendziorski, C. (2006). Hidden Markov models for microarray time course data in multiple biological conditions (with discussion). J. Amer. Statist. Assoc. 101 1323–1340. · Zbl 1171.62359 · doi:10.1198/016214505000000394
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.