
High-dimensional disjoint factor analysis with its EM algorithm version. (English) Zbl 1477.62157

Summary: In [Adv. Data Anal. Classif., ADAC 11, No. 3, 563–591 (2017; Zbl 1414.62222)], M. Vichi proposed disjoint factor analysis (DFA), a factor analysis procedure in which the loading matrix is constrained so that each variable loads on only one of the multiple factors. The variables are thereby clustered into mutually exclusive groups. Such variable clustering is considered useful for high-dimensional data, in which the variables greatly outnumber the observations. However, the feasibility of DFA for high-dimensional data was not considered in [Vichi, loc. cit.]. One purpose of this paper is therefore to demonstrate the feasibility and usefulness of DFA for high-dimensional data. Another purpose is to propose a new computational procedure for DFA based on an EM algorithm; this procedure, called EM-DFA, serves the same purpose as the original procedure in [Vichi, loc. cit.] but more efficiently. Numerical studies demonstrate that both DFA and EM-DFA cluster variables fairly well, with EM-DFA being the more computationally efficient.
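To make the disjointness constraint concrete, the following is a minimal illustrative sketch, not the authors' DFA or EM-DFA algorithm: each variable is assigned to exactly one of m factors, each factor is estimated as the first principal component of its variable cluster, and assignments are updated to the factor with which each variable correlates most strongly. All names (disjoint_fa, n_factors, the toy data) are chosen here for illustration and are not taken from the paper.

```python
# Minimal sketch of factor analysis with a disjoint-loadings constraint:
# every variable loads on a single factor, so the variables form exclusive clusters.
import numpy as np

def disjoint_fa(X, n_factors, n_iter=50, seed=0):
    """Alternating sketch of a disjoint variable-clustering factor model.

    X         : (n_obs, n_vars) data matrix; variables are standardized inside.
    n_factors : number of disjoint factors (variable clusters).
    Returns   : (labels, F) where labels[j] is the single factor that
                variable j loads on and F holds the factor scores.
    """
    rng = np.random.default_rng(seed)
    Z = (X - X.mean(0)) / X.std(0, ddof=1)          # standardize variables
    n, p = Z.shape
    labels = rng.integers(n_factors, size=p)        # random initial clustering

    for _ in range(n_iter):
        # 1) Given the clustering, estimate one factor per cluster as the
        #    first principal component of the variables assigned to it.
        F = np.zeros((n, n_factors))
        for k in range(n_factors):
            cols = np.where(labels == k)[0]
            if cols.size == 0:                      # keep empty clusters alive
                cols = np.array([rng.integers(p)])
            U, s, _ = np.linalg.svd(Z[:, cols], full_matrices=False)
            F[:, k] = U[:, 0] * np.sqrt(n - 1)      # roughly unit-variance scores
        # 2) Reassign each variable to the factor it correlates with most.
        corr = np.abs(Z.T @ F) / (n - 1)            # p x m correlation matrix
        new_labels = corr.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, F

if __name__ == "__main__":
    # Toy high-dimensional example (p >> n): two blocks of correlated variables.
    rng = np.random.default_rng(1)
    n, p_block = 30, 100
    f1, f2 = rng.standard_normal(n), rng.standard_normal(n)
    X = np.column_stack([np.outer(f1, rng.uniform(0.6, 0.9, p_block)),
                         np.outer(f2, rng.uniform(0.6, 0.9, p_block))])
    X += 0.5 * rng.standard_normal(X.shape)
    labels, _ = disjoint_fa(X, n_factors=2)
    print("block-1 label counts:", np.bincount(labels[:p_block], minlength=2))
    print("block-2 label counts:", np.bincount(labels[p_block:], minlength=2))
```

On the toy data, the two planted variable blocks should end up assigned to different factors, which is the kind of variable clustering that the disjointness constraint enforces; the paper's DFA and EM-DFA achieve this within a proper factor-analysis model rather than the principal-component shortcut used in this sketch.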

MSC:

62H25 Factor analysis and principal components; correspondence analysis

Citations:

Zbl 1414.62222

References:

[1] Adachi, K., Factor analysis with EM algorithm never gives improper solutions when sample covariance and initial parameter matrices are proper, Psychometrika, 78, 380-394 (2013) · Zbl 1284.62679 · doi:10.1007/s11336-012-9299-8
[2] Adachi, K.; Sakata, T., Three-way principal component analysis with its applications to psychology, Applied matrix and tensor variate data analysis, 1-21 (2016), Springer
[3] Adachi, K. (2019). Factor analysis: Latent variable, matrix decomposition, and constrained uniqueness formulations. WIREs Computational Statistics, doi:10.1002/wics.1458. Accessed 19 Mar 2019
[4] Adachi, K.; Trendafilov, NT, Sparse principal component analysis subject to prespecified cardinality of loadings, Computational Statistics, 31, 1403-1427 (2016) · Zbl 1348.65014 · doi:10.1007/s00180-015-0608-4
[5] Adachi, K.; Trendafilov, NT, Sparsest factor analysis for clustering variables: A matrix decomposition approach, Advances in Data Analysis and Classification, 12, 559-585 (2018) · Zbl 1416.62319 · doi:10.1007/s11634-017-0284-z
[6] Adachi, K.; Trendafilov, NT, Some mathematical properties of the matrix decomposition solution in factor analysis, Psychometrika, 83, 407-424 (2018) · Zbl 1391.62102 · doi:10.1007/s11336-017-9600-y
[7] Akaike, H., Factor analysis and AIC, Psychometrika, 52, 317-332 (1987) · Zbl 0627.62067 · doi:10.1007/BF02294359
[8] Bartholomew, D.; Knott, M.; Moustaki, I., Latent variable models and factor analysis: A unified approach (Third Edition) (2011), Wiley · Zbl 1266.62040 · doi:10.1002/9781119970583
[9] Dempster, AP; Laird, NM; Rubin, DB, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society: Series B, 39, 1-38 (1977) · Zbl 0364.62022
[10] Gan, G.; Ma, C.; Wu, J., Data clustering: Theory, algorithms, and applications (2007), Society of Industrial and Applied Mathematics (SIAM) · Zbl 1185.68274 · doi:10.1137/1.9780898718348
[11] Guttman, L., Some necessary conditions for common-factor analysis, Psychometrika, 19, 149-160 (1954) · Zbl 0058.13004 · doi:10.1007/BF02289162
[12] Hirose, K.; Yamamoto, M., Sparse estimation via nonconcave penalized likelihood in factor analysis model, Statistics and Computing, 25, 863-875 (2015) · Zbl 1332.62194 · doi:10.1007/s11222-014-9458-0
[13] Jöreskog, KG, Some contributions to maximum likelihood factor analysis, Psychometrika, 32, 443-482 (1967) · Zbl 0183.24603 · doi:10.1007/BF02289658
[14] Kaiser, HF, The application of electronic computers to factor analysis, Educational and Psychological Measurements, 20, 141-151 (1960) · doi:10.1177/001316446002000116
[15] Koch, I., Analysis of multivariate and high-dimensional data (2014), Cambridge University Press · Zbl 1307.62003
[16] Konishi, S.; Kitagawa, G., Information criteria and statistical modeling (2007), Springer · Zbl 1172.62003
[17] Osgood, CE; Suci, GJ; Tannenbaum, PH, The measurement of meaning (1957), University of Illinois Press
[18] Rubin, DB; Thayer, DT, EM algorithms for ML factor analysis, Psychometrika, 47, 69-76 (1982) · Zbl 0483.62046 · doi:10.1007/BF02293851
[19] Seber, GAF, A matrix handbook for statisticians (2008), Wiley · Zbl 1143.15001
[20] Stegeman, A., A new method for simultaneous estimation of the factor model parameters, factor scores, and unique parts, Computational Statistics & Data Analysis, 99, 189-203 (2016) · Zbl 1468.62181 · doi:10.1016/j.csda.2016.01.012
[21] Vichi, M., Disjoint factor analysis with cross-loadings, Advances in Data Analysis and Classification, 11, 563-591 (2017) · Zbl 1414.62222 · doi:10.1007/s11634-016-0263-9
[22] Vichi, M.; Saporta, G., Clustering and disjoint principal component analysis with cross-loadings, Computational Statistics & Data Analysis, 53, 3194-3208 (2009) · Zbl 1453.62230 · doi:10.1016/j.csda.2008.05.028
[23] Yanai, H.; Ichikawa, M.; Rao, CR; Sinharay, S., Factor analysis, Handbook of statistics, vol. 26: Psychometrics, 257-296 (2007), Elsevier · Zbl 1460.62196
[24] Yeung, KY; Ruzzo, WL, Principal component analysis for clustering gene expression data, Bioinformatics, 17, 763-774 (2001) · doi:10.1093/bioinformatics/17.9.763