Regularized generalized canonical correlation analysis. (English) Zbl 1284.62753

Summary: Regularized generalized canonical correlation analysis (RGCCA) is a generalization of regularized canonical correlation analysis to three or more sets of variables. It constitutes a general framework for many multi-block data analysis methods. It combines the power of multi-block data analysis methods (maximization of well-identified criteria) and the flexibility of PLS path modeling (the researcher decides which blocks are connected and which are not). Searching for a fixed point of the stationary equations related to RGCCA yields a new monotonically convergent algorithm, very similar to the PLS algorithm proposed by Herman Wold. Finally, a practical example is discussed.
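
The fixed-point idea in the summary can be illustrated with a minimal sketch. The summary does not state the update rules, so everything below is an assumption based on the commonly described form of RGCCA: the criterion is the sum of covariances between connected block components (the Horst scheme), each weight vector a_j is constrained by a_j' M_j a_j = 1 with M_j = tau_j I + (1 - tau_j) X_j'X_j / n, and the function name `rgcca_sketch` and its parameters are hypothetical. The design matrix `C` is assumed symmetric with a zero diagonal and at least one connection per block, and each M_j is assumed invertible.

```python
import numpy as np

def rgcca_sketch(blocks, C, tau, n_iter=100, tol=1e-12, seed=0):
    """Illustrative RGCCA-style block update (Horst scheme); not the authors' code."""
    rng = np.random.default_rng(seed)
    n = blocks[0].shape[0]
    # Column-center every block.
    X = [np.asarray(B, dtype=float) - np.mean(B, axis=0) for B in blocks]
    # Assumed constraint matrices M_j = tau_j I + (1 - tau_j) X_j'X_j / n.
    M = [t * np.eye(Xj.shape[1]) + (1.0 - t) * (Xj.T @ Xj) / n
         for Xj, t in zip(X, tau)]

    def project(j, v):
        # Maximizer of v'a subject to a' M_j a = 1.
        w = np.linalg.solve(M[j], v)
        return w / np.sqrt(v @ w)

    J = len(X)
    # Random feasible starting weights and the matching block components.
    a = [project(j, X[j].T @ rng.standard_normal(n) / n) for j in range(J)]
    y = [X[j] @ a[j] for j in range(J)]

    def criterion():
        # Sum of covariances between connected block components.
        return sum(C[j, k] * (y[j] @ y[k]) / n
                   for j in range(J) for k in range(J))

    history = [criterion()]
    for _ in range(n_iter):
        for j in range(J):
            # Inner component: weighted sum of the connected components,
            # always using the most recently updated ones.
            z = sum(C[j, k] * y[k] for k in range(J))
            a[j] = project(j, X[j].T @ z / n)
            y[j] = X[j] @ a[j]
        history.append(criterion())
        if history[-1] - history[-2] < tol:
            break
    return a, history
```

Under these assumptions each block update exactly maximizes the criterion over a_j with the other blocks held fixed, so the recorded `history` should never decrease; this is the sense in which such a scheme is monotonically convergent. Setting every tau_j to 1 reduces the constraint to unit-norm weight vectors.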


62P15 Applications of statistics to psychology




[1] Barker, M., & Rayens, W. (2003). Partial least squares for discrimination. Journal of Chemometrics, 17, 166–173. · doi:10.1002/cem.785
[2] Bougeard, S., Hanafi, M., & Qannari, E.M. (2007). ACPVI multibloc. Application en épidémiologie animale. Journal de la Société Française de Statistique, 148, 77–94.
[3] Bougeard, S., Hanafi, M., & Qannari, E.M. (2008). Continuum redundancy-PLS regression: a simple continuum approach. Computational Statistics & Data Analysis, 52, 3686–3696. · Zbl 1452.62489 · doi:10.1016/j.csda.2007.12.007
[4] Burnham, A.J., Viveros, R., & MacGregor, J.F. (1996). Frameworks for latent variable multivariate regression. Journal of Chemometrics, 10, 31–45. · doi:10.1002/(SICI)1099-128X(199601)10:1<31::AID-CEM398>3.0.CO;2-1
[5] Carroll, J.D. (1968a). A generalization of canonical correlation analysis to three or more sets of variables. In Proc. 76th conv. Am. Psych. Assoc. (pp. 227–228).
[6] Carroll, J.D. (1968b). Equations and tables for a generalization of canonical correlation analysis to three or more sets of variables. Unpublished companion paper to Carroll, J.D. (1968a).
[7] Chessel, D., & Hanafi, M. (1996). Analyse de la co-inertie de K nuages de points. Revue de Statistique Appliquée, 44, 35–60.
[8] Chu, M.T., & Watterson, J.L. (1993). On a multivariate eigenvalue problem: I. Algebraic theory and power method. SIAM Journal on Scientific and Statistical Computing, 14, 1089–1106. · Zbl 0789.65023 · doi:10.1137/0914066
[9] Dahl, T., & Næs, T. (2006). A bridge between Tucker-1 and Carroll’s generalized canonical analysis. Computational Statistics & Data Analysis, 50, 3086–3098. · Zbl 1445.62131 · doi:10.1016/j.csda.2005.06.016
[10] Fornell, C., & Bookstein, F.L. (1982). Two structural equation models: LISREL and PLS applied to consumer exit-voice theory. Journal of Marketing Research, 19, 440–452. · doi:10.2307/3151718
[11] Gifi, A. (1990). Nonlinear multivariate analysis. Chichester: Wiley. · Zbl 0697.62048
[12] Hanafi, M. (2007). PLS Path modelling: computation of latent variables with the estimation mode B. Computational Statistics, 22, 275–292. · Zbl 1196.62103 · doi:10.1007/s00180-007-0042-3
[13] Hanafi, M., & Kiers, H.A.L. (2006). Analysis of K sets of data, with differential emphasis on agreement between and within sets. Computational Statistics & Data Analysis, 51, 1491–1508. · Zbl 1157.62422 · doi:10.1016/j.csda.2006.04.020
[14] Hanafi, M., & Lafosse, R. (2001). Généralisations de la régression simple pour analyser la dépendance de K ensembles de variables avec un K+1ème. Revue de Statistique Appliquée, 49, 5–30.
[15] Horst, P. (1961). Relations among m sets of variables. Psychometrika, 26, 126–149. · Zbl 0099.35801 · doi:10.1007/BF02289710
[16] Jöreskog, K.G. (1970). A general method for the analysis of covariance structure. Biometrika, 57, 239–251. · Zbl 0195.48801
[17] Kettenring, J.R. (1971). Canonical analysis of several sets of variables. Biometrika, 58, 433–451. · Zbl 0225.62072 · doi:10.1093/biomet/58.3.433
[18] Krämer, N. (2007). Analysis of high-dimensional data with partial least squares and boosting. Doctoral dissertation, Technische Universität Berlin.
[19] Ledoit, O., & Wolf, M. (2004). A well conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88, 365–411. · Zbl 1032.62050 · doi:10.1016/S0047-259X(03)00096-4
[20] Leurgans, S.E., Moyeed, R.A., & Silverman, B.W. (1993). Canonical correlation analysis when the data are curves. Journal of the Royal Statistical Society, Series B, 55, 725–740. · Zbl 0803.62049
[21] Lohmöller, J.-B. (1989). Latent variables path modeling with partial least squares. Heidelberg: Physica-Verlag. · Zbl 0788.62050
[22] Noonan, R., & Wold, H. (1982). PLS path modeling with indirectly observed variables: a comparison of alternative estimates for the latent variable. In K.G. Jöreskog & H. Wold (Eds.), Systems under indirect observation, Part 2 (pp. 75–94). Amsterdam: North-Holland.
[23] Qannari, E.M., & Hanafi, M. (2005). A simple continuum regression approach. Journal of Chemometrics, 19, 387–392. · doi:10.1002/cem.942
[24] R Development Core Team (2009). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. http://www.R-project.org .
[25] Russett, B.M. (1964). Inequality and instability: the relation of land tenure to politics. World Politics, 16, 442–454. · doi:10.2307/2009581
[26] Schäfer, J., & Strimmer, K. (2005). A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical Applications in Genetics and Molecular Biology, 4(1), Article 32.
[27] Shawe-Taylor, J., & Cristianini, N. (2004). Kernel methods for pattern analysis. New York: Cambridge University Press. · Zbl 0994.68074
[28] Takane, Y., & Hwang, H. (2007). Regularized linear and kernel redundancy analysis. Computational Statistics & Data Analysis, 52, 394–405. · Zbl 1452.62421 · doi:10.1016/j.csda.2007.02.014
[29] Takane, Y., Hwang, H., & Abdi, H. (2008). Regularized multiple-set canonical correlation analysis. Psychometrika, 73, 753–775. · Zbl 1284.62750 · doi:10.1007/s11336-008-9065-0
[30] Ten Berge, J.M.F. (1988). Generalized approaches to the MAXBET problem and the MAXDIFF problem, with applications to canonical correlations. Psychometrika, 53, 487–494. · Zbl 0726.62086 · doi:10.1007/BF02294402
[31] Tenenhaus, A. (2010). Kernel generalized canonical correlation analysis. In 42ièmes journées de statistique (JdS’10), Marseille, France, May 24–28.
[32] Tenenhaus, M. (2008). Component-based structural equation modelling. Total Quality Management & Business Excellence, 19, 871–886. · doi:10.1080/14783360802159543
[33] Tenenhaus, M., Esposito Vinzi, V., Chatelin, Y.-M., & Lauro, C. (2005). PLS path modeling. Computational Statistics & Data Analysis, 48, 159–205. · Zbl 1429.62227 · doi:10.1016/j.csda.2004.03.005
[34] Tenenhaus, M., & Hanafi, M. (2010). A bridge between PLS path modeling and multi-block data analysis. In V. Esposito Vinzi, J. Henseler, W. Chin, & H. Wang (Eds.), Handbook of partial least squares (PLS): concepts, methods and applications (pp. 99–123). Berlin: Springer.
[35] Tucker, L.R. (1958). An inter-battery method of factor analysis. Psychometrika, 23, 111–136. · Zbl 0097.35102 · doi:10.1007/BF02289009
[36] Van de Geer, J.P. (1984). Linear relations among k sets of variables. Psychometrika, 49, 70–94.
[37] Vinod, H.D. (1976). Canonical ridge and econometrics of joint production. Journal of Econometrics, 4, 147–166. · Zbl 0331.62079 · doi:10.1016/0304-4076(76)90010-5
[38] Vivien, M., & Sabatier, R. (2003). Generalized orthogonal multiple co-inertia analysis (-PLS): new multiblock component and regression methods. Journal of Chemometrics, 17, 287–301. · doi:10.1002/cem.802
[39] Westerhuis, J.A., Kourti, T., & MacGregor, J.F. (1998). Analysis of multiblock and hierarchical PCA and PLS models. Journal of Chemometrics, 12, 301–321. · doi:10.1002/(SICI)1099-128X(199809/10)12:5<301::AID-CEM515>3.0.CO;2-S
[40] Wold, H. (1982). Soft modeling: the basic design and some extensions. In K.G. Jöreskog & H. Wold (Eds.), Systems under indirect observation, Part 2 (pp. 1–54). Amsterdam: North-Holland. · Zbl 0517.62065
[41] Wold, H. (1985). Partial least squares. In S. Kotz & N.L. Johnson (Eds.), Encyclopedia of statistical sciences (Vol. 6, pp. 581–591). New York: Wiley.
[42] Wold, S., Martens, H., & Wold, H. (1983). The multivariate calibration problem in chemistry solved by the PLS method. In A. Ruhe & B. Kågström (Eds.), Lecture notes in mathematics. Proc. conf. matrix pencils, March 1982 (pp. 286–293). Heidelberg: Springer. · Zbl 0499.65065