Zemplenyi, Michele; Meyer, Mark J.; Cardenas, Andres; Hivert, Marie-France; Rifas-Shiman, Sheryl L.; Gibson, Heike; Kloog, Itai; Schwartz, Joel; Oken, Emily; Demeo, Dawn L.; Gold, Diane R.; Coull, Brent A.
Function-on-function regression for the identification of epigenetic regions exhibiting windows of susceptibility to environmental exposures. (English) Zbl 1478.62358
Ann. Appl. Stat. 15, No. 3, 1366-1385 (2021).
Summary: The ability to identify time periods when individuals are most susceptible to exposures as well as the biological mechanisms through which these exposures act is of great public health interest. Growing evidence supports an association between prenatal exposure to air pollution and epigenetic marks, such as DNA methylation, but the timing and gene-specific effects of these epigenetic changes are not well understood. Here, we present the first study that aims to identify prenatal windows of susceptibility to air pollution exposures in cord blood DNA methylation. In particular, we propose a function-on-function regression model that leverages data from nearby DNA methylation probes to identify epigenetic regions that exhibit windows of susceptibility to ambient particulate matter less than 2.5 microns \((\mathrm{PM}_{2.5})\). By incorporating the covariance structure among both the multivariate DNA methylation outcome and the time-varying exposure under study, this framework yields greater power to detect windows of susceptibility and greater control of false discoveries than methods that model probes independently. We compare our method to a distributed lag model approach that models DNA methylation in a probe-by-probe manner, both in simulation and by application to motivating data from the Project Viva birth cohort.
We identify a window of susceptibility to \(\mathrm{PM}_{2.5}\) exposure in the middle of the third trimester of pregnancy in an epigenetic region selected based on prior studies of air pollution effects on epigenome-wide methylation.
MSC:
62P12 Applications of statistics to environmental and related topics
62R10 Functional data analysis
Keywords: functional data analysis; wavelet regression; windows of susceptibility; epigenetics
Software: fda (R); WaveSeq; pffr