The SIMCLAS model: simultaneous analysis of coupled binary data matrices with noise heterogeneity between and within data blocks. (English) Zbl 1284.62766

Summary: In many research domains different pieces of information are collected regarding the same set of objects. Each piece of information constitutes a data block, and all these (coupled) blocks have the object mode in common. When analyzing such data, an important aim is to obtain an overall picture of the structure underlying the whole set of coupled data blocks. A further challenge consists of accounting for the differences in information value that exist between data blocks and within them (i.e., between the objects of a single block). To tackle these issues, analysis techniques may be useful in which all available pieces of information are integrated and in which at the same time noise heterogeneity is taken into account. For the case of binary coupled data, however, the only existing methods analyze all data blocks simultaneously but do not account for noise heterogeneity. Therefore, this paper presents the SIMCLAS model, a hierarchical classes model for the simultaneous analysis of coupled binary two-way matrices. In this model, noise heterogeneity between and within the data blocks is accounted for by downweighting entries from noisy blocks and from noisy objects within a block. In a simulation study it is shown that (1) the SIMCLAS technique recovers the underlying structure of coupled data to a very large extent, and (2) the SIMCLAS technique outperforms a hierarchical classes technique in which all entries contribute equally to the analysis (i.e., one that assumes noise homogeneity within and between blocks). The latter is also demonstrated in an application of both techniques to empirical data on categorization of semantic concepts.

MSC:

62P15 Applications of statistics to psychology

Software:

Tucker3-HICLAS

References:

[1] Aarts, E.H.L., Korst, J.H.M., & van Laarhoven, P.J.M. (1997). Simulated annealing. In E.H.L. Aarts & J.K. Lenstra (Eds.), Local search in combinatorial optimization (pp. 91–120). Chichester: Wiley. · Zbl 0905.90140
[2] Barbut, M., & Monjardet, B. (1970). Ordre et classification : Algèbre et combinatoire. Paris: Hachette. · Zbl 0267.06001
[3] Birkhoff, G. (1940). Lattice theory. Providence: Am. Math. Soc. · Zbl 0063.00402
[4] Ceulemans, E., & Storms, G. (2010). Detecting intra- and inter-categorical structure in semantic concepts using hiclas. Acta Psychologica, 133, 296–304. · doi:10.1016/j.actpsy.2009.11.011
[5] Ceulemans, E., & Van Mechelen, I. (2003). Uniqueness of n-way n-mode hierarchical classes models. Journal of Mathematical Psychology, 47, 259–264. · Zbl 1052.91075 · doi:10.1016/S0022-2496(03)00002-6
[6] Ceulemans, E., & Van Mechelen, I. (2004). Tucker2 hierarchical classes analysis. Psychometrika, 69, 375–399. · Zbl 1306.62391 · doi:10.1007/BF02295642
[7] Ceulemans, E., & Van Mechelen, I. (2005). Hierarchical classes models for three-way three-mode binary data: Interrelations and model selection. Psychometrika, 70, 461–480. · Zbl 1306.62392 · doi:10.1007/s11336-003-1067-3
[8] Ceulemans, E., Van Mechelen, I., & Leenen, I. (2003). Tucker3 hierarchical classes analysis. Psychometrika, 68, 413–433. · Zbl 1306.62393 · doi:10.1007/BF02294735
[9] Ceulemans, E., Van Mechelen, I., & Leenen, I. (2007). The local minima problem in hierarchical classes analysis: An evaluation of a simulated annealing algorithm and various multistart procedures. Psychometrika, 72, 377–391. · Zbl 1286.62102 · doi:10.1007/s11336-007-9000-9
[10] Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37–46. · doi:10.1177/001316446002000104
[11] De Boeck, P., & Rosenberg, S. (1988). Hierarchical classes: Model and data analysis. Psychometrika, 53, 361–381. · Zbl 0718.62001 · doi:10.1007/BF02294218
[12] De Deyne, S., Verheyen, S., Ameel, E., Vanpaemel, W., Dry, M., Voorspoels, W., & Storms, G. (2008). Exemplar by feature applicability matrices and other Dutch normative data for semantic concepts. Behavior Research Methods, 40, 1030–1048. · doi:10.3758/BRM.40.4.1030
[13] Haggard, E.A. (1958). Intraclass correlation and the analysis of variance. New York: Dryden. · Zbl 0097.34501
[14] Kiers, H.A.L. (2000). Towards a standardized notation and terminology in multiway analysis. Journal of Chemometrics, 14, 105–122. · doi:10.1002/1099-128X(200005/06)14:3<105::AID-CEM582>3.0.CO;2-I
[15] Kiers, H.A.L., & ten Berge, J.M.F. (1989). Alternating least squares algorithms for simultaneous components analysis with equal component weight matrices for all populations. Psychometrika, 54, 467–473. · Zbl 04561078 · doi:10.1007/BF02294629
[16] Kiers, H.A.L., & ten Berge, J.M.F. (1994). Hierarchical relations between methods for simultaneous component analysis and a technique for rotation to a simple simultaneous structure. British Journal of Mathematical & Statistical Psychology, 47, 109–126. · Zbl 0825.62512 · doi:10.1111/j.2044-8317.1994.tb01027.x
[17] Kirk, R.E. (1982). Experimental design: Procedures for the behavioral sciences (2nd ed.). Belmont: Brooks/Cole. · Zbl 0414.62054
[18] Kirkpatrick, S., Gelatt, C.D.J., & Vecchi, M.P. (1983). Optimization by simulated annealing. Science, 220, 671–680. · Zbl 1225.90162 · doi:10.1126/science.220.4598.671
[19] Leenen, I., & Van Mechelen, I. (2001). An evaluation of two algorithms for hierarchical classes analysis. Journal of Classification, 18, 57–80. · Zbl 1040.91086 · doi:10.1007/s00357-001-0005-2
[20] Leenen, I., Van Mechelen, I., De Boeck, P., & Rosenberg, S. (1999). indclas: A three-way hierarchical classes model. Psychometrika, 64, 9–24. · Zbl 04555250 · doi:10.1007/BF02294316
[21] Leenen, I., Van Mechelen, I., Gelman, A., & De Knop, S. (2008). Bayesian hierarchical classes analysis. Psychometrika, 73, 39–64. · Zbl 1143.62094 · doi:10.1007/s11336-007-9038-8
[22] Millsap, R.E., & Meredith, W. (1988). Component analysis in cross-sectional and longitudinal data. Psychometrika, 53, 123–134. · Zbl 0718.62133 · doi:10.1007/BF02294198
[23] ten Berge, J.M.F., Kiers, H.A.L., & van der Stel, V. (1992). Simultaneous components analysis. Statistica Applicata, 4, 377–392.
[24] Timmerman, M.E., & Kiers, H.A.L. (2003). Four simultaneous component models for the analysis of multivariate time series from more than one subject to model intraindividual and interindividual differences. Psychometrika, 68, 105–121. · Zbl 1306.62507 · doi:10.1007/BF02296656
[25] Van Deun, K., Smilde, A.K., van der Werf, M.J., Kiers, H.A.L., & Van Mechelen, I. (2009). A structured overview of simultaneous component based data integration. BMC Bioinformatics, 10, 246. · Zbl 05739452 · doi:10.1186/1471-2105-10-246
[26] Van Mechelen, I., De Boeck, P., & Rosenberg, S. (1995). The conjunctive model of hierarchical classes. Psychometrika, 60, 505–521. · Zbl 0864.92021 · doi:10.1007/BF02294326
[27] Van Mechelen, I., & Smilde, A.K. (2009). A generic model for data fusion. Paper presented at the 6th meeting of TRICAP (Three-way methods in chemistry and psychology), June 14–19, Vall de Núria, Spain.
[28] Van Mechelen, I., & Smilde, A.K. (2010). A generic linked-mode decomposition model for data fusion. Chemometrics and Intelligent Laboratory Systems, 104, 83–94. · doi:10.1016/j.chemolab.2010.04.012
[29] Wilderjans, T.F., Ceulemans, E., & Van Mechelen, I. (2008). The chic model: global model for coupled binary data. Psychometrika, 73, 729–751. · Zbl 1284.62767 · doi:10.1007/s11336-008-9069-9
[30] Wilderjans, T.F., Ceulemans, E., Van Mechelen, I., & van den Berg, R.A. (2011). Simultaneous analysis of coupled data matrices subject to different amounts of noise. British Journal of Mathematical & Statistical Psychology, 64, 277–290. · Zbl 05932562 · doi:10.1348/000711010X513263
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.