Imaging genetics with partial least squares for mixed-data types (MiMoPLS). (English) Zbl 1366.62204

Abdi, Hervé (ed.) et al., The multiple facets of partial least squares methods. PLS, Paris, France, May 26–28, 2014. Cham: Springer (ISBN 978-3-319-40641-1/hbk; 978-3-319-40643-5/ebook). Springer Proceedings in Mathematics & Statistics 173, 73-91 (2016).
Summary: “Imaging genetics” studies the genetic contributions to brain structure and function by finding correspondence between genetic data – such as single nucleotide polymorphisms (SNPs) – and neuroimaging data – such as diffusion tensor imaging (DTI). However, genetic and neuroimaging data are heterogeneous data types, where neuroimaging data are quantitative and genetic data are (usually) categorical. So far, methods used in imaging genetics treat all data as quantitative, and this sometimes requires unrealistic assumptions about the nature of genetic data. In this article we present a new formulation of Partial Least Squares Correlation (PLSC) – called Mixed-modality Partial Least Squares (MiMoPLS) – specifically tailored for heterogeneous (mixed-) data types. MiMoPLS integrates features of PLSC and Correspondence Analysis (CA) by using special properties of quantitative data and Multiple Correspondence Analysis (MCA). We illustrate MiMoPLS with an example data set from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) with DTI and SNPs.
For the entire collection see [Zbl 1356.62003].
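
The summary names PLSC as the starting point of MiMoPLS but does not spell out the computations. As a rough illustration of that building block only, the sketch below runs ordinary PLSC (a singular value decomposition of the cross-product between two centered, normalized tables) on a quantitative table and a one-hot coded categorical table. It is not the MiMoPLS procedure: the CA/MCA-style weighting mentioned in the summary is not reproduced, the data are synthetic, and all function and variable names (`one_hot`, `center_and_scale`, `plsc`, `imaging`, `snps`) are hypothetical.

```python
import numpy as np

def one_hot(genotypes):
    """Disjunctive (one-hot) coding: one 0/1 column per observed level of each SNP."""
    blocks = []
    for j in range(genotypes.shape[1]):
        levels = np.unique(genotypes[:, j])
        blocks.append((genotypes[:, j][:, None] == levels[None, :]).astype(float))
    return np.hstack(blocks)

def center_and_scale(M):
    """Center each column and scale it to unit norm (constant columns stay at zero)."""
    M = M - M.mean(axis=0)
    norms = np.linalg.norm(M, axis=0)
    norms[norms == 0] = 1.0
    return M / norms

def plsc(X, Y):
    """Plain PLSC: SVD of the cross-product matrix R = X'Y of the preprocessed tables."""
    Xc, Yc = center_and_scale(X), center_and_scale(Y)
    R = Xc.T @ Yc
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    # Latent variables: projections of each table onto its own singular vectors.
    return U, s, Vt.T, Xc @ U, Yc @ Vt.T

# Synthetic stand-ins (assumption): 50 participants, 4 imaging measures, 3 SNPs
# with genotypes coded 0/1/2 and treated here as unordered categories.
rng = np.random.default_rng(0)
n = 50
imaging = rng.normal(size=(n, 4))
snps = rng.integers(0, 3, size=(n, 3))

U, s, V, Lx, Ly = plsc(imaging, one_hot(snps))
```

In this sketch the cross-product of paired latent variables equals the corresponding singular value, which is the quantity PLSC maximizes; treating the one-hot columns as ordinary quantitative columns is exactly the simplification that a mixed-data formulation is meant to avoid.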

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62J05 Linear regression; mixed models
62H20 Measures of association (correlation, canonical correlation, etc.)
62H25 Factor analysis and principal components; correspondence analysis
92D10 Genetics and epigenetics
