
EBADIMEX: an empirical Bayes approach to detect joint differential expression and methylation and to classify samples. (English) Zbl 1445.92115

Summary: DNA methylation and gene expression are interdependent and both implicated in cancer development and progression, with many individual biomarkers discovered. A joint analysis of the two data types can potentially lead to biological insights that are not discoverable with separate analyses. To optimally leverage the joint data for identifying perturbed genes and classifying clinical cancer samples, it is important to accurately model the interactions between the two data types. Here, we present EBADIMEX for jointly identifying differential expression and methylation and classifying samples. The moderated t-test widely used with empirical Bayes priors in current differential expression methods is generalised to a multivariate setting by developing: (1) a moderated Welch t-test for equality of means with unequal variances; (2) a moderated F-test for equality of variances; and (3) a multivariate test for equality of means with equal variances. This leads to parametric models with prior distributions for the parameters, which allow fast evaluation and robust analysis of small data sets. EBADIMEX is demonstrated on simulated data as well as a large breast cancer (BRCA) cohort from TCGA. We show that the use of empirical Bayes priors and moderated tests works particularly well on small data sets.
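
To make the idea of a moderated test concrete, the following is a minimal sketch, not the EBADIMEX implementation. It uses the standard empirical Bayes variance shrinkage from the moderated t-test (a weighted average of the sample variance and a prior variance) and plugs the shrunken variances into a Welch-type statistic. The function names, the prior parameters `s0_2` and `d0`, and the choice to add `d0` to each group's degrees of freedom in the Welch-Satterthwaite formula are assumptions made for illustration; the paper's own derivation may differ.

```python
import math


def moderated_variance(s2, d, s0_2, d0):
    """Shrink a sample variance s2 (d degrees of freedom) toward a prior
    variance s0_2 that carries d0 prior degrees of freedom."""
    return (d0 * s0_2 + d * s2) / (d0 + d)


def sample_mean_var(xs):
    """Return the mean and the unbiased sample variance of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean, var


def moderated_welch_t(x, y, s0_2, d0):
    """Welch-type statistic with each group's variance moderated separately.

    Returns (t, df), where df is a Welch-Satterthwaite approximation that
    uses the augmented degrees of freedom d0 + n - 1 per group (an
    illustrative assumption, not taken from the paper).
    """
    n1, n2 = len(x), len(y)
    m1, v1 = sample_mean_var(x)
    m2, v2 = sample_mean_var(y)
    v1m = moderated_variance(v1, n1 - 1, s0_2, d0)
    v2m = moderated_variance(v2, n2 - 1, s0_2, d0)
    se2 = v1m / n1 + v2m / n2
    t = (m1 - m2) / math.sqrt(se2)
    df = se2 ** 2 / ((v1m / n1) ** 2 / (d0 + n1 - 1)
                     + (v2m / n2) ** 2 / (d0 + n2 - 1))
    return t, df


if __name__ == "__main__":
    normal = [5.1, 4.8, 5.3]
    tumour = [6.9, 5.2, 8.4]
    # With d0 = 0 the prior has no weight and this is the ordinary Welch test.
    print(moderated_welch_t(normal, tumour, s0_2=1.0, d0=0))
    # A positive d0 pulls both group variances toward the prior value.
    print(moderated_welch_t(normal, tumour, s0_2=1.0, d0=4))
```

In practice the prior parameters would be estimated from all genes jointly, which is what makes the approach empirical Bayes; here they are simply passed in as fixed numbers.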

MSC:

92C40 Biochemistry, molecular biology
92C50 Medical applications (general)
62P10 Applications of statistics to biology and medical sciences; meta analysis
62F15 Bayesian inference

References:

[1] Aryee, M. J., A. E. Jaffe, H. Corrada-Bravo, C. Ladd-Acosta, A. P. Feinberg, K. D. Hansen and R. A. Irizarry (2014): “Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays,” Bioinformatics, 30, 1363-1369.; Aryee, M. J.; Jaffe, A. E.; Corrada-Bravo, H.; Ladd-Acosta, C.; Feinberg, A. P.; Hansen, K. D.; Irizarry, R. A., Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays, Bioinformatics, 30, 1363-1369 (2014)
[2] Bailer-Jones, C. and K. Smith (2011): Combining probabilities. Data Processing and Analysis Consortium (DPAS), GAIA-C8-TN-MPIA-CBJ-053.; Bailer-Jones, C.; Smith, K., Combining probabilities. Data Processing and Analysis Consortium (DPAS) (2011)
[3] Bibikova, M., B. Barnes, C. Tsan, V. Ho, B. Klotzle, J. M. Le, D. Delano, L. Zhang, G. P. Schroth, K. L. Gunderson, J. B. Fan and R. Shen (2011): “High density DNA methylation array with single CpG site resolution,” Genomics, 98, 288-295.; Bibikova, M.; Barnes, B.; Tsan, C.; Ho, V.; Klotzle, B.; Le, J. M.; Delano, D.; Zhang, L.; Schroth, G. P.; Gunderson, K. L.; Fan, J. B.; Shen, R., High density DNA methylation array with single CpG site resolution, Genomics, 98, 288-295 (2011)
[4] Breiman, L., A. Cutler, A. Liaw and M. Wiener (2006): “randomForest: Breiman and Cutler’s random forests for classification and regression.”; Breiman, L.; Cutler, A.; Liaw, A.; Wiener, M., randomForest: Breiman and Cutler’s random forests for classification and regression (2006)
[5] Brenet, F., M. Moh, P. Funk, E. Feierstein, A. J. Viale, N. D. Socci and J. M. Scandura (2011): “DNA methylation of the first exon is tightly linked to transcriptional silencing,” PloS One, 6, e14524.; Brenet, F.; Moh, M.; Funk, P.; Feierstein, E.; Viale, A. J.; Socci, N. D.; Scandura, J. M., DNA methylation of the first exon is tightly linked to transcriptional silencing, PloS One, 6, e14524 (2011)
[6] Bullard, J. H., E. Purdom, K. D. Hansen and S. Dudoit (2010): “Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments,” BMC Bioinformatics, 11, 94.; Bullard, J. H.; Purdom, E.; Hansen, K. D.; Dudoit, S., Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments, BMC Bioinformatics, 11, 94 (2010)
[7] Dedeurwaerder, S., M. Defrance, E. Calonne, H. Denis, C. Sotiriou and F. Fuks (2011): “Evaluation of the Infinium Methylation 450k Technology,” Epigenomics, 3, 771-784.; Dedeurwaerder, S.; Defrance, M.; Calonne, E.; Denis, H.; Sotiriou, C.; Fuks, F., Evaluation of the Infinium Methylation 450k Technology, Epigenomics, 3, 771-784 (2011)
[8] Demissie, M., B. Mascialino, S. Calza and Y. Pawitan (2008): “Unequal group variances in microarray data analyses,” Bioinformatics, 24, 1168-1174.; Demissie, M.; Mascialino, B.; Calza, S.; Pawitan, Y., Unequal group variances in microarray data analyses, Bioinformatics, 24, 1168-1174 (2008)
[9] Ding, J., M. K. McConechy, H. M. Horlings, G. Ha, F. C. Chan, T. Funnell, S. C. Mullaly, J. Reimand, A. Bashashati, G. D. Bader, D. Huntsman, S. Aparicio, A. Condon and S. P. Shah (2015): “Systematic analysis of somatic mutations impacting gene expression in 12 tumour types,” Nat. Commun., 6, 8554.; Ding, J.; McConechy, M. K.; Horlings, H. M.; Ha, G.; Chan, F. C.; Funnell, T.; Mullaly, S. C.; Reimand, J.; Bashashati, A.; Bader, G. D.; Huntsman, D.; Aparicio, S.; Condon, A.; Shah, S. P., Systematic analysis of somatic mutations impacting gene expression in 12 tumour types, Nat. Commun., 6, 8554 (2015)
[10] Dixon, W. J. and J. W. Tukey (1968): “Approximate behavior of the distribution of Winsorized t (trimming/winsorization 2),” Technometrics, 10, 83-98.; Dixon, W. J.; Tukey, J. W., Approximate behavior of the distribution of Winsorized t (trimming/winsorization 2), Technometrics, 10, 83-98 (1968)
[11] Du, P., X. Zhang, C.-C. Huang, N. Jafari, W. A. Kibbe, L. Hou and S. M. Lin (2010): “Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis,” BMC Bioinformatics, 11, 587.; Du, P.; Zhang, X.; Huang, C.-C.; Jafari, N.; Kibbe, W. A.; Hou, L.; Lin, S. M., Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis, BMC Bioinformatics, 11, 587 (2010)
[12] Esteller, M. (2008): “Epigenetics in cancer,” N. Engl. J. Med., 358, 1148-1159.; Esteller, M., Epigenetics in cancer, N. Engl. J. Med., 358, 1148-1159 (2008)
[13] Fisher, R. A. (1932): Statistical methods for research workers, Oliver and Boyd, Edinburgh.; Fisher, R. A., Statistical methods for research workers (1932) · JFM 58.1161.04
[14] Gelman, A. (2011): Arm: Data analysis using regression and multilevel/hierarchical models. http://cran. r-project. org/web/packages/arm.; Gelman, A., Arm: Data analysis using regression and multilevel/hierarchical models (2011)
[15] Grossman, R. L., A. P. Heath, V. Ferretti, H. E. Varmus, D. R. Lowy, W. A. Kibbe and L. M. Staudt (2016): “Toward a shared vision for cancer genomic data,” N. Engl. J. Med., 375, 1109-1112.; Grossman, R. L.; Heath, A. P.; Ferretti, V.; Varmus, H. E.; Lowy, D. R.; Kibbe, W. A.; Staudt, L. M., Toward a shared vision for cancer genomic data, N. Engl. J. Med., 375, 1109-1112 (2016)
[16] Huber, P. and E. Ronchetti (2009): Robust statistics, John Wiley & Sons, Inc., Hoboken, NJ, USA.; Huber, P.; Ronchetti, E., Robust statistics (2009) · Zbl 1276.62022
[17] Jeong, J., L. Li, Y. Liu, K. P. Nephew, T. H.-M. Huang and C. Shen (2010): “An empirical bayes model for gene expression and methylation profiles in antiestrogen resistant breast cancer,” BMC Med. Genomics, 3, 55.; Jeong, J.; Li, L.; Liu, Y.; Nephew, K. P.; Huang, T. H.-M.; Shen, C., An empirical bayes model for gene expression and methylation profiles in antiestrogen resistant breast cancer, BMC Med. Genomics, 3, 55 (2010)
[18] Jjingo, D., A. B. Conley, V. Y. Soojin, V. V. Lunyak and I. K. Jordan (2012): “On the presence and role of human gene-body DNA methylation,” Oncotarget, 3, 462-474.; Jjingo, D.; Conley, A. B.; Soojin, V. Y.; Lunyak, V. V.; Jordan, I. K., On the presence and role of human gene-body DNA methylation, Oncotarget, 3, 462-474 (2012)
[19] Jones, P. A. (2012): “Functions of DNA methylation: islands, start sites, gene bodies and beyond,” Nat. Rev. Genet., 13, 484.; Jones, P. A., Functions of DNA methylation: islands, start sites, gene bodies and beyond, Nat. Rev. Genet., 13, 484 (2012)
[20] Jones, P. A. and S. B. Baylin (2007): “The epigenomics of cancer,” Cell, 128, 683-692.; Jones, P. A.; Baylin, S. B., The epigenomics of cancer, Cell, 128, 683-692 (2007)
[21] Karatzoglou, A., A. Smola and K. Hornik (2013): “Kernlab: Kernel-based machine learning lab,” J. Theor. Biol., 1-10.; Karatzoglou, A.; Smola, A.; Hornik, K., Kernlab: Kernel-based machine learning lab, J. Theor. Biol., 1-10 (2013)
[22] Kass, S. U., N. Landsberger and A. P. Wolffe (1997): “DNA methylation directs a time-dependent repression of transcription initiation,” Curr. Biol., 7, 157-165.; Kass, S. U.; Landsberger, N.; Wolffe, A. P., DNA methylation directs a time-dependent repression of transcription initiation, Curr. Biol., 7, 157-165 (1997)
[23] Kristensen, V. N., O. C. Lingjærde, H. G. Russnes, H. K. M. Vollan, A. Frigessi and A.-L. Børresen-Dale (2014): “Principles and methods of integrative genomic analyses in cancer,” Nat. Rev. Cancer, 14, 299-313.; Kristensen, V. N.; Lingjærde, O. C.; Russnes, H. G.; Vollan, H. K. M.; Frigessi, A.; Børresen-Dale, A.-L., Principles and methods of integrative genomic analyses in cancer, Nat. Rev. Cancer, 14, 299-313 (2014)
[24] Kuhn, M. (2015): “Caret: classification and regression training, Astrophysics Source Code Library”.; Kuhn, M., Caret: classification and regression training, Astrophysics Source Code Library (2015)
[25] Levenson, V. V. (2010): “DNA methylation as a universal biomarker,” Expert. Rev. Mol. Diagn., 10, 481-488.; Levenson, V. V., DNA methylation as a universal biomarker, Expert. Rev. Mol. Diagn., 10, 481-488 (2010)
[26] List, M., A.-C. Hauschild, Q. Tan, T. A. Kruse, J. Baumbach and R. Batra (2014): “Classification of breast cancer subtypes by combining gene expression and DNA methylation data,” J. Integr. Bioinform., 11, 1-14. doi:10.1515/jib-2014-236; List, M.; Hauschild, A.-C.; Tan, Q.; Kruse, T. A.; Baumbach, J.; Batra, R., Classification of breast cancer subtypes by combining gene expression and DNA methylation data, J. Integr. Bioinform., 11, 1-14 (2014)
[27] Love, M. I., W. Huber and S. Anders (2014): “Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2,” Genome Biol., 15, 550.; Love, M. I.; Huber, W.; Anders, S., Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2, Genome Biol., 15, 550 (2014)
[28] Ma, K., B. Cao and M. Guo (2016): “The detective, prognostic, and predictive value of DNA methylation in human esophageal squamous cell carcinoma,” Clin. Epigenetics, 8, 43.; Ma, K.; Cao, B.; Guo, M., The detective, prognostic, and predictive value of DNA methylation in human esophageal squamous cell carcinoma, Clin. Epigenetics, 8, 43 (2016)
[29] McCarthy, D. J., Y. Chen and G. K. Smyth (2012): “Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation,” Nucleic Acids Res., 40, 4288-4297.; McCarthy, D. J.; Chen, Y.; Smyth, G. K., Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation, Nucleic Acids Res., 40, 4288-4297 (2012)
[30] Mendizabal, I., J. Zeng, T. E. Keller and S. V. Yi (2017): “Body-hypomethylated human genes harbor extensive intragenic transcriptional activity and are prone to cancer-associated dysregulation,” Nucleic Acids Res., 45, 4390-4400.; Mendizabal, I.; Zeng, J.; Keller, T. E.; Yi, S. V., Body-hypomethylated human genes harbor extensive intragenic transcriptional activity and are prone to cancer-associated dysregulation, Nucleic Acids Res., 45, 4390-4400 (2017)
[31] Meyer, D., E. Dimitriadou, K. Hornik, A. Weingessel and F. Leisch (2016): e1071: Misc functions of the department of statistics, probability theory group (formerly: E1071), tu wien, 2015, R package version, p. 1-6.; Meyer, D.; Dimitriadou, E.; Hornik, K.; Weingessel, A.; Leisch, F., e1071: Misc functions of the department of statistics, probability theory group (formerly: E1071), tu wien, 2015, R package version, 1-6 (2016)
[32] Morris, T. J., L. M. Butcher, A. Feber, A. E. Teschendorff, A. R. Chakravarthy, T. K. Wojdacz and S. Beck (2013): “ChAMP: 450k chip analysis methylation pipeline,” Bioinformatics, 30, 428-430.; Morris, T. J.; Butcher, L. M.; Feber, A.; Teschendorff, A. E.; Chakravarthy, A. R.; Wojdacz, T. K.; Beck, S., ChAMP: 450k chip analysis methylation pipeline, Bioinformatics, 30, 428-430 (2013)
[33] R Core Team (2017): R: A language and environment for statistical computing, R foundation for statistical computing, Vienna, Austria.; R: A language and environment for statistical computing (2017)
[34] Ritchie, M. E., B. Phipson, D. Wu, Y. Hu, C. W. Law, W. Shi and G. K. Smyth (2015): “limma powers differential expression analyses for RNA-sequencing and microarray studies,” Nucleic Acids Res., 43, e47.; Ritchie, M. E.; Phipson, B.; Wu, D.; Hu, Y.; Law, C. W.; Shi, W.; Smyth, G. K., limma powers differential expression analyses for RNA-sequencing and microarray studies, Nucleic Acids Res., 43, e47 (2015)
[35] Scott, W. D. (2008): Multivariate density estimation: theory, practice, and visualization, John Wiley & Sons, Inc., Hoboken, NJ, USA.; Scott, W. D., Multivariate density estimation: theory, practice, and visualization (2008) · Zbl 0850.62006
[36] Smyth, Gordon K. (2004): “Linear models and empirical bayes methods for assessing differential expression in microarray experiments,” Stat. Appl. Genet. Mol. Biol., 3, 1-25.; Smyth, Gordon K., Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments, Statistical Applications in Genetics and Molecular Biology, 3, 1, 1-25 (2004) · Zbl 1038.62110
[37] Smith, Z. D. and A. Meissner (2013): “DNA methylation: roles in mammalian development,” Nat. Rev. Genet., 14, 204-220.; Smith, Z. D.; Meissner, A., DNA methylation: roles in mammalian development, Nat. Rev. Genet., 14, 204-220 (2013)
[38] Smith, A. D., D. Roda and T. A. Yap (2014): “Strategies for modern biomarker and drug development in oncology,” J. Hematol. Oncol., 7, 70.; Smith, A. D.; Roda, D.; Yap, T. A., Strategies for modern biomarker and drug development in oncology, J. Hematol. Oncol., 7, 70 (2014)
[39] Strand, S. H., T. F. Orntoft and K. D. Sorensen (2014): “Prognostic DNA methylation markers for prostate cancer,” Int. J. Mol. Sci., 15, 16544-16576.; Strand, S. H.; Orntoft, T. F.; Sorensen, K. D., Prognostic DNA methylation markers for prostate cancer, Int. J. Mol. Sci., 15, 16544-16576 (2014)
[40] Świtnicki, M. P., M. Juul, T. Madsen, K. D. Sørensen and J. S. Pedersen (2016): “PINCAGE: probabilistic integration of cancer genomics data for perturbed gene identification and sample classification,” Bioinformatics, 32, 1353-1365.; Świtnicki, M. P.; Juul, M.; Madsen, T.; Sørensen, K. D.; Pedersen, J. S., PINCAGE: probabilistic integration of cancer genomics data for perturbed gene identification and sample classification, Bioinformatics, 32, 1353-1365 (2016)
[41] Weinstein, J. N., E. A. Collisson, G. B. Mills, K. R. M. Shaw, B. A. Ozenberger, K. Ellrott, I. Shmulevich, C. Sander and J. M. Stuart (2013): “The cancer genome atlas pan-cancer analysis project,” Nat. Genet., 45, 1113-1120.; Weinstein, J. N.; Collisson, E. A.; Mills, G. B.; Shaw, K. R. M.; Ozenberger, B. A.; Ellrott, K.; Shmulevich, I.; Sander, C.; Stuart, J. M., The cancer genome atlas pan-cancer analysis project, Nat. Genet., 45, 1113-1120 (2013)
[42] Wu, D., J. Gu and M. Q. Zhang (2013): “FastDMA: an infinium humanmethylation450 beadchip analyzer,” PloS One, 8, e74275.; Wu, D.; Gu, J.; Zhang, M. Q., FastDMA: an infinium humanmethylation450 beadchip analyzer, PloS One, 8, e74275 (2013)
[43] Yang, X., H. Han, D. D. De Carvalho, F. D. Lay, P. A. Jones and G. Liang (2014): “Gene body methylation can alter gene expression and is a therapeutic target in cancer,” Cancer Cell, 26, 577-590.; Yang, X.; Han, H.; De Carvalho, D. D.; Lay, F. D.; Jones, P. A.; Liang, G., Gene body methylation can alter gene expression and is a therapeutic target in cancer, Cancer Cell, 26, 577-590 (2014)
[44] Zhong, D. and H. Cen (2017): “Aberrant promoter methylation profiles and association with survival in patients with hepatocellular carcinoma,” OncoTargets Ther., 10, 2501.; Zhong, D.; Cen, H., Aberrant promoter methylation profiles and association with survival in patients with hepatocellular carcinoma, OncoTargets Ther., 10, 2501 (2017)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.