Michailidis, George; De Leeuw, Jan
The Gifi system of descriptive multivariate analysis. (English) Zbl 1059.62551
Stat. Sci. 13, No. 4, 307-336 (1998).

Summary: The Gifi system of analyzing categorical data through nonlinear varieties of classical multivariate analysis techniques is reviewed. The system is characterized by the optimal scaling of categorical variables, which is implemented through alternating least squares algorithms. The main technique of homogeneity analysis is presented, along with its extensions and generalizations leading to nonmetric principal components analysis and canonical correlation analysis. Several examples are used to illustrate the methods. A brief account of stability issues and areas of application of the techniques is also given.

Cited in 22 Documents

MSC:
62H25 Factor analysis and principal components; correspondence analysis
62H20 Measures of association (correlation, canonical correlation, etc.)
62H99 Multivariate analysis

Keywords: Optimal scaling; alternating least squares; multivariate techniques; loss functions; stability

Software: SPSS; bootstrap