Bayesian testing of many hypotheses \(\times \) many genes: a study of sleep apnea. (English) Zbl 1196.62140

Summary: Substantial statistical research has recently been devoted to the analysis of large-scale microarray experiments which provide a measure of the simultaneous expression of thousands of genes in a particular condition. A typical goal is the comparison of gene expressions between two conditions (e.g., diseased vs. nondiseased) to detect genes which show differential expressions. Classical hypothesis testing procedures have been applied to this problem and more recent work has employed sophisticated models that allow for the sharing of information across genes. However, many recent gene expression studies have an experimental design with several conditions that require an even more involved hypothesis testing approach.
We use a hierarchical Bayesian model to address the situation where there are many hypotheses that must be simultaneously tested for each gene. In addition to having many hypotheses within each gene, our analysis also addresses the more typical multiple comparison issue of testing many genes simultaneously. We illustrate our approach with an application to a study of genes involved in obstructive sleep apnea in humans.

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62F15 Bayesian inference
62J15 Paired and multiple comparisons; multiple testing
92C50 Medical applications (general)

Software:

gcrma

References:

[1] Alizadeh, A., Eisen, M., Davis, R., Ma, C., Lossos, I., Rosenwald, A., Boldrick, J., Sabet, H., Tran, T., Yu, X., Powell, J., Yang, L., Marti, G., Moore, T., Hudson, J., Lu, L., Lewis, D., Tibshirani, R., Sherlock, G., Chan, W., Greiner, T., Weisenburger, D., Armitage, J., Warnke, R., Levy, R., Wilson, W., Grever, M., Byrd, J., Botstein, D., Brown, P. and Staudt, L. (2000). Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403 503-511.
[2] Bickel, P. and Doksum, K. (2007). Mathematical Statistics: Basic Ideas and Selected Topics 1, 2nd ed. Prentice Hall, Upper Saddle River, NJ. · Zbl 0403.62001
[3] Dempster, A., Laird, N. and Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. Ser. B 39 1-38. · Zbl 0364.62022
[4] Dudoit, S., Gentleman, R. C. and Quackenbush, J. (2003). Open source software for the analysis of microarray data. BioTechniques 34 S45-S51.
[5] Flury, B. K. and Riedwyl, H. (1986). Standard distance in univariate and multivariate analysis. Amer. Statist. 40 214-215.
[6] Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6 721-741. · Zbl 0573.62030 · doi:10.1109/TPAMI.1984.4767596
[7] Gottardo, R., Raftery, A. E., Yeung, K. Y. and Bumgarner, R. E. (2006). Bayesian robust inference for differential gene expression in microarrays with multiple samples. Biometrics 62 10-18. · Zbl 1099.62128 · doi:10.1111/j.1541-0420.2005.00397.x
[8] Irizarry, R. A., Hobbs, B., Collin, F., Beazer-Barclay, Y. D., Antonellis, K. J., Scherf, U. and Speed, T. P. (2003). Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4 249-264. · Zbl 1141.62348 · doi:10.1093/biostatistics/4.2.249
[9] Jensen, S. T., Erkan, I., Arnadottir, E. S. and Small, D. S. (2009). Supplement to “Bayesian testing of many hypotheses \(\times\) many genes: A study of sleep apnea.” DOI: 10.1214/09-AOAS241SUPP.
[10] Kendziorski, C. M., Newton, M. A., Lan, H. and Gould, M. N. (2003). On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression pr. Stat. Med. 22 3899-3914.
[11] Ma, P., Castillo-Davis, C., Zhong, W. and Liu, J. S. (2006). A data-driven clustering method for time course gene expression data. Nucleic Acids Research 34 1261-1269.
[12] Medvedovic, M. and Sivaganesan, S. (2002). Bayesian infinite mixture models based clustering of gene expression profiles. Bioinformatics 18 1194-1206.
[13] Newton, M., Kendziorski, C. M., Richmond, C. S., Blattner, F. R. and Tsui, K. W. (2001). On differential variability of expression ratios: Improving statistical inference about gene expression changes from microarray data. J. Comput. Biol. 8 37-52.
[14] Newton, M., Noueiry, A., Sarkar, D. and Ahlquist, P. (2004). Detecting differential gene expression with a semiparametric hierarchical mixture method. Biostatistics 5 155-176. · Zbl 1096.62124 · doi:10.1093/biostatistics/5.2.155
[15] Pack, A. I. (2006). Advances in sleep-disordered breathing. Am. J. Respir. Crit. Care Med. 173 7-15.
[16] Segal, E., Shapira, M., Regev, A., Pe’er, D., Botstein, D., Koller, D. and Friedman, N. (2003). Module networks: Identifying regulatory modules and their condition-specific regulators from gene expression data. Nature Genetics 34 166-176.
[17] Smyth, G. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat. Appl. Genet. Mol. Biol. 3 3. · Zbl 1038.62110 · doi:10.2202/1544-6115.1027
[18] Storey, J. D. and Tibshirani, R. (2003). Statistical significance for genomewide studies. Proc. Natl. Acad. Sci. 100 9440-9445. · Zbl 1130.62385 · doi:10.1073/pnas.1530509100
[19] Wu, Z., Irizarry, R. A., Gentleman, R., Murillo, F. M. and Spencer, F. (2004). A model based background adjustment for oligonucleotide expression arrays. Technical Report Paper 1, Dept. Biostatistics, Johns Hopkins Univ. · Zbl 1055.62129
[20] Yuan, M. and Kendziorski, C. (2006). A unified approach for simultaneous gene clustering and differential expression identification. Biometrics 62 1089-1098. · Zbl 1114.62130 · doi:10.1111/j.1541-0420.2006.00611.x