Sparse estimation via nonconcave penalized likelihood in factor analysis model. (English) Zbl 1332.62194

Summary: We consider the problem of sparse estimation in a factor analysis model. A traditional estimation procedure is the following two-step approach: the model is estimated by the maximum likelihood method, and a rotation technique is then used to find sparse factor loadings. However, the maximum likelihood estimates cannot be obtained when the number of variables is much larger than the number of observations. Furthermore, even when the maximum likelihood estimates are available, the rotation technique often does not produce a sufficiently sparse solution. To handle these problems, this paper introduces a penalized likelihood procedure that imposes a nonconvex penalty on the factor loadings. We show that the penalized likelihood procedure can be viewed as a generalization of the traditional two-step approach, and that the proposed methodology can produce sparser solutions than the rotation technique. A new algorithm, based on the EM algorithm combined with coordinate descent, is introduced to compute the entire solution path, which permits application to a wide variety of convex and nonconvex penalties. Monte Carlo simulations are conducted to investigate the performance of our modeling strategy. A real data example is also given to illustrate our procedure.
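As a rough illustration of the modeling idea (not the authors' exact algorithm), the sketch below runs an EM loop for the factor model \(\Sigma = \Lambda\Lambda' + \Psi\) in which the M-step updates each loading by coordinate descent with soft-thresholding, i.e. the lasso case; the paper's framework also covers nonconvex penalties such as the MC+. The function names, random initialization, and the scaling of the penalty threshold by the unique variances are choices of this sketch.

```python
import numpy as np

def soft_threshold(z, t):
    """Lasso soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_factor_em(S, k, rho, n_iter=500, tol=1e-6, seed=0):
    """EM for a factor model Sigma = Lam Lam' + diag(Psi), with an L1
    penalty on the loadings handled by coordinate descent in the M-step.
    S: p x p sample covariance, k: number of factors, rho: penalty level."""
    p = S.shape[0]
    rng = np.random.default_rng(seed)
    Lam = 0.1 * rng.standard_normal((p, k))   # factor loadings
    Psi = np.diag(S).copy()                   # unique variances
    for _ in range(n_iter):
        # E-step: posterior moments of the common factors (Rubin-Thayer form)
        beta = np.linalg.solve(np.eye(k) + (Lam.T / Psi) @ Lam, Lam.T / Psi)
        M = np.eye(k) - beta @ Lam            # posterior covariance of f | x
        A = M + beta @ S @ beta.T             # per-observation E[f f']
        B = S @ beta.T                        # per-observation E[x f']
        # M-step: cycle over loadings, soft-thresholding each coordinate
        Lam_new = Lam.copy()
        for j in range(p):
            for c in range(k):
                r = B[j, c] - Lam_new[j] @ A[:, c] + Lam_new[j, c] * A[c, c]
                Lam_new[j, c] = soft_threshold(r, rho * Psi[j]) / A[c, c]
        # closed-form update of the unique variances
        Psi = np.maximum(np.diag(S) - 2.0 * np.sum(Lam_new * B, axis=1)
                         + np.einsum('jk,kl,jl->j', Lam_new, A, Lam_new),
                         1e-4)
        if np.max(np.abs(Lam_new - Lam)) < tol:
            Lam = Lam_new
            break
        Lam = Lam_new
    return Lam, Psi
```

Because the E-step works directly with the sample covariance `S`, the sketch also runs when the number of variables exceeds the number of observations, which is the regime where plain maximum likelihood fails; sweeping `rho` from large to small traces out a solution path from a fully sparse loading matrix toward the unpenalized EM fit.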


62H25 Factor analysis and principal components; correspondence analysis
62H12 Estimation in multivariate analysis
62J07 Ridge regression; shrinkage estimators (Lasso)


sparsenet; R; glmnet
Full Text: DOI arXiv


[1] Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (eds.) 2nd International Symposium on Information Theory, pp. 267-281. Akademiai Kiado, Budapest (1973) · Zbl 0283.62006
[2] Akaike, H, Factor analysis and AIC, Psychometrika, 52, 317-332, (1987) · Zbl 0627.62067
[3] Anderson, T., Rubin, H.: Statistical inference in factor analysis. In: Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability vol. 5, pp. 111-150 (1956) · Zbl 0070.14703
[4] Bai, J; Li, K, Statistical analysis of factor models of high dimension, Ann. Stat., 40, 436-465, (2012) · Zbl 1246.62144
[5] Bai, J., Liao, Y.: Efficient estimation of approximate factor models via regularized maximum likelihood. arXiv preprint arXiv:1209.5911 (2012) · Zbl 1390.62107
[6] Bozdogan, H, Model selection and Akaike's information criterion (AIC): the general theory and its analytical extensions, Psychometrika, 52, 345-370, (1987) · Zbl 0627.62005
[7] Breheny, P; Huang, J, Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection, Ann. Appl. Stat., 5, 232-253, (2011) · Zbl 1220.62095
[8] Caner, M.: Selecting the correct number of factors in approximate factor models: the large panel case with bridge estimators. Technical report (2011)
[9] Carvalho, CM; Chang, J; Lucas, JE; Nevins, JR; Wang, Q; West, M, High-dimensional sparse factor modeling: applications in gene expression genomics, J. American Stat. Assoc., 103, 1438-1456, (2008) · Zbl 1286.62091
[10] Chen, J; Chen, Z, Extended Bayesian information criteria for model selection with large model spaces, Biometrika, 95, 759-771, (2008) · Zbl 1437.62415
[11] Choi, J; Zou, H; Oehlert, G, A penalized maximum likelihood approach to sparse factor analysis, Stat. Interface, 3, 429-436, (2011) · Zbl 1245.62074
[12] Clarke, M, A rapidly convergent method for maximum-likelihood factor analysis, British J. Math. Stat. Psychol., 23, 43-52, (1970) · Zbl 0205.23802
[13] Efron, B, How biased is the apparent error rate of a prediction rule?, J. American Stat. Assoc., 81, 461-470, (1986) · Zbl 0621.62073
[14] Efron, B, The estimation of prediction error: covariance penalties and cross-validation, J. American Stat. Assoc., 99, 619-642, (2004) · Zbl 1117.62324
[15] Efron, B; Hastie, T; Johnstone, I; Tibshirani, R, Least angle regression (with discussion), Ann. Stat., 32, 407-499, (2004) · Zbl 1091.62054
[16] Fan, J; Li, R, Variable selection via nonconcave penalized likelihood and its oracle properties, J. American Stat. Assoc., 96, 1348-1360, (2001) · Zbl 1073.62547
[17] Frank, I; Friedman, J, A statistical view of some chemometrics regression tools, Technometrics, 35, 109-148, (1993) · Zbl 0775.62288
[18] Friedman, J; Hastie, T; Höfling, H; Tibshirani, R, Pathwise coordinate optimization, Ann. Appl. Stat., 1, 302-332, (2007) · Zbl 1378.90064
[19] Friedman, J; Hastie, T; Tibshirani, R, Regularization paths for generalized linear models via coordinate descent, J. Stat. Softw., 33, 1-22, (2010)
[20] Friedman, J, Fast sparse regression and classification, Int. J. Forecast., 28, 722-738, (2012)
[21] Fu, W, Penalized regression: the bridge versus the lasso, J. Comput. Graph. Stat., 7, 397-416, (1998)
[22] Hastie, T; Rosset, S; Tibshirani, R; Zhu, J, The entire regularization path for the support vector machine, J. Mach. Learn. Res., 5, 1391-1415, (2004) · Zbl 1222.68213
[23] Hendrickson, A; White, P, Promax: a quick method for rotation to oblique simple structure, British J. Stat. Psychol., 17, 65-70, (1964)
[24] Hirose, K; Konishi, S, Variable selection via the weighted group lasso for factor analysis models, Canadian J. Stat., 40, 345-361, (2012) · Zbl 1349.62250
[25] Hirose, K., Tateishi, S., Konishi, S.: Tuning parameter selection in sparse regression modeling. Comput. Stat. Data Anal. (2012) · Zbl 1400.62006
[26] Jennrich, R, Rotation to simple loadings using component loss functions: the orthogonal case, Psychometrika, 69, 257-273, (2004) · Zbl 1306.62440
[27] Jennrich, R, Rotation to simple loadings using component loss functions: the oblique case, Psychometrika, 71, 173-191, (2006) · Zbl 1306.62442
[28] Jennrich, R; Robinson, S, A Newton-Raphson algorithm for maximum likelihood factor analysis, Psychometrika, 34, 111-123, (1969)
[29] Jöreskog, K, Some contributions to maximum likelihood factor analysis, Psychometrika, 32, 443-482, (1967) · Zbl 0183.24603
[30] Kaiser, H, The varimax criterion for analytic rotation in factor analysis, Psychometrika, 23, 187-200, (1958) · Zbl 0095.33603
[31] Kato, K, On the degrees of freedom in shrinkage estimation, J. Multivar. Anal., 100, 1338-1352, (2009) · Zbl 1162.62067
[32] Kiers, HA, Simplimax: oblique rotation to an optimal target with simple structure, Psychometrika, 59, 567-579, (1994) · Zbl 0925.62228
[33] Mazumder, R; Friedman, J; Hastie, T, SparseNet: coordinate descent with nonconvex penalties, J. American Stat. Assoc., 106, 1125-1138, (2011) · Zbl 1229.62091
[34] Mulaik, S.: The Foundations of Factor Analysis, 2nd edn. Chapman and Hall/CRC, Boca Raton (2010) · Zbl 1188.62185
[35] Ning, L., Georgiou, T.T.: Sparse factor analysis via likelihood and \(ℓ _1\) regularization. In: Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference, pp 5188-5192 (2011)
[36] R Development Core Team.: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org (2010), ISBN 3-900051-07-0
[37] Rubin, D; Thayer, D, EM algorithms for ML factor analysis, Psychometrika, 47, 69-76, (1982) · Zbl 0483.62046
[38] Stein, C, Estimation of the mean of a multivariate normal distribution, Ann. Stat., 9, 1135-1151, (1981) · Zbl 0476.62035
[39] Stock, JH; Watson, MW, Forecasting using principal components from a large number of predictors, J. American Stat. Assoc., 97, 1167-1179, (2002) · Zbl 1041.62081
[40] Thurstone, L.L.: Multiple Factor Analysis. University of Chicago Press, Chicago (1947) · Zbl 0029.22203
[41] Tibshirani, R, Regression shrinkage and selection via the lasso, J. Royal Stat. Soc. Ser. B, 58, 267-288, (1996) · Zbl 0850.62538
[42] Tipping, ME; Bishop, CM, Probabilistic principal component analysis, J. Royal Stat. Soc. Ser. B, 61, 611-622, (1999) · Zbl 0924.62068
[43] Ulfarsson, M.O., Solo, V.: Sparse variable principal component analysis with application to fMRI. In: Proceedings of the 4th IEEE International Symposium on Biomedical Imaging from Nano to Macro, ISBI 2007, pp 460-463 (2007)
[44] Xie, S., Krishnan, S., Lawniczak, A.T.: Sparse principal component extraction and classification of long-term biomedical signals. In: Proceedings of the IEEE 25th International Symposium on Computer-Based Medical Systems (CBMS), pp 1-6 (2012)
[45] Ye, J, On measuring and correcting the effects of data mining and model selection, J. American Stat. Assoc., 93, 120-131, (1998) · Zbl 0920.62056
[46] Yoshida, R., West, M.: Bayesian learning in sparse graphical factor models via variational mean-field annealing. J. Mach. Learn. Res. 99, 1771-1798 (2010) · Zbl 1242.68261
[47] Yuan, M; Lin, Y, Model selection and estimation in the Gaussian graphical model, Biometrika, 94, 19-35, (2007) · Zbl 1142.62408
[48] Zhang, C, Nearly unbiased variable selection under minimax concave penalty, Ann. Stat., 38, 894-942, (2010) · Zbl 1183.62120
[49] Zhao, P; Yu, B, On model selection consistency of lasso, J. Mach. Learn. Res., 7, 2541-2563, (2006) · Zbl 1222.62008
[50] Zou, H, The adaptive lasso and its oracle properties, J. American Stat. Assoc., 101, 1418-1429, (2006) · Zbl 1171.62326
[51] Zou, H; Li, R, One-step sparse estimates in nonconcave penalized likelihood models, Ann. Stat., 36, 1509-1533, (2008) · Zbl 1142.62027
[52] Zou, H; Hastie, T; Tibshirani, R, Sparse principal component analysis, J. Comput. Graph. Stat., 15, 265-286, (2006)
[53] Zou, H; Hastie, T; Tibshirani, R, On the degrees of freedom of the lasso, Ann. Stat., 35, 2173-2192, (2007) · Zbl 1126.62061
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.