zbMATH — the first resource for mathematics

Consistency of AIC and BIC in estimating the number of significant components in high-dimensional principal component analysis. (English) Zbl 1395.62119
This paper deals with the problem of estimating the number of significant components in principal component analysis (PCA), known as the dimensionality in PCA. Specifically, let \(y_{1},\dots,y_{n}\) be a random sample of size \(n\) from a \(p\)-dimensional population with mean \(\mu\) and covariance matrix \(\Sigma\), and let \(\lambda_{1}\geqq\dots\geqq\lambda_{p}\) denote the population eigenvalues of \(\Sigma\). The problem of estimating the dimensionality is formulated as one of selecting an appropriate model from the set \(\{M_{0}, M_{1},\dots,M_{p-1}\}\), where \[ M_{k}:\ \lambda_{k}>\lambda_{k+1}=\dots=\lambda_{p}=\lambda. \] In this context, the authors consider two estimation criteria, AIC [H. Akaike, in: 2nd International Symposium on Information Theory, Tsahkadsor 1971, 267–281 (1973; Zbl 0283.62006)] and BIC [G. Schwarz, Ann. Stat. 6, 461–464 (1978; Zbl 0379.62005)], and examine their consistency under a high-dimensional framework where \(p,n\rightarrow \infty\) such that \(p/n\rightarrow c>0\). It is assumed that the true number of significant components, say \(k\), is fixed, that the number of candidate models is greater than \(k\), and that the fourth population moment is finite. Both the case \(p<n\) (\(0<c<1\)) and the case \(p>n\) (\(c>1\)) are treated; in the latter, modified AIC and BIC criteria given on p. 1060 are used. The main results of the paper, obtained by techniques from random matrix theory, are summarized as follows:
For \(0<c<1\): if \(\lambda_1\) is bounded, then under the so-called gap condition (C3) given on p. 1057 of the paper, AIC is strongly consistent but BIC is not. Furthermore, if \(\lambda_k\rightarrow \infty\), then AIC is strongly consistent regardless of whether the gap condition holds, and if \(\lambda_k/\log n\rightarrow \infty\), then BIC is strongly consistent.
For \(c>1\): if \(\lambda_1\) is bounded, then under the so-called modified gap condition (C5) given on p. 1060 of the paper, the modified AIC is strongly consistent but the modified BIC is not. Furthermore, if \(\lambda_k\rightarrow \infty\), then the modified AIC is strongly consistent regardless of whether the modified gap condition holds, and if \(\lambda_k/\log n\rightarrow \infty\), then the modified BIC is strongly consistent.
Finally, simulation studies show that the sufficient conditions given are essential.
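The selection scheme described above can be illustrated for the case \(p<n\) with a small simulation in the spirit of the paper's spiked-eigenvalue setting. The sketch below is a minimal illustration, not the authors' exact criteria: the Gaussian profile likelihood and the parameter count \(m_j\) follow one common convention and may differ from the criteria of the paper by model-independent constants, and all names (`estimate_dim`, `k_true`, etc.) are hypothetical.

```python
import numpy as np

def estimate_dim(X, kmax):
    """Estimate the PCA dimensionality by minimizing AIC/BIC over the
    candidate models M_0, ..., M_kmax (a sketch for the case p < n)."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)                 # sample covariance matrix
    d = np.sort(np.linalg.eigvalsh(S))[::-1]    # eigenvalues, descending
    aic, bic = [], []
    for j in range(kmax + 1):
        # Under M_j the Gaussian likelihood is maximized by estimating the
        # j spiked eigenvalues with d_1,...,d_j and the common tail
        # eigenvalue with the mean of d_{j+1},...,d_p.
        neg2ll = n * (np.sum(np.log(d[:j])) + (p - j) * np.log(d[j:].mean()))
        # Free parameters of M_j (one common convention, possibly not the
        # paper's): j distinct eigenvalues, one common eigenvalue, and
        # j*p - j*(j+1)/2 eigenvector coordinates.
        m = j + 1 + j * p - j * (j + 1) / 2
        aic.append(neg2ll + 2 * m)
        bic.append(neg2ll + np.log(n) * m)
    return int(np.argmin(aic)), int(np.argmin(bic))

# Spiked model: two large eigenvalues, the remaining p - 2 equal to 1.
rng = np.random.default_rng(0)
n, p, k_true = 200, 50, 2
lam = np.ones(p)
lam[:k_true] = [50.0, 25.0]
X = rng.standard_normal((n, p)) * np.sqrt(lam)   # Sigma = diag(lam)
k_aic, k_bic = estimate_dim(X, kmax=10)
```

With spikes this far above the noise level, both criteria should recover a dimensionality of at least \(k=2\); in line with the review's results, BIC's heavier \(\log n\) penalty makes it the more conservative of the two against overestimation in this \(0<c<1\) regime.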

62H25 Factor analysis and principal components; correspondence analysis
62E20 Asymptotic distribution theory in statistics
Full Text: DOI Euclid
[1] Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In 2nd International Symposium on Information Theory (B. N. Petrov and F. Csáki, eds.) 267-281. Akadémiai Kiadó, Budapest. · Zbl 0283.62006
[2] Bai, Z. D., Miao, B. Q. and Rao, C. R. (1990). Estimation of direction of arrival of signals: Asymptotic results. In Advances in Spectrum Analysis and Array Processing (S. Haykin, ed.) 2 327-347. Prentice Hall, Englewood Cliffs, NJ.
[3] Bai, Z. D. and Silverstein, J. W. (1998). No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices. Ann. Probab. 26 316-345. · Zbl 0937.60017
[4] Bai, Z. and Yao, J. (2012). On sample eigenvalues in a generalized spiked population model. J. Multivariate Anal. 106 167-177. · Zbl 1301.62049
[5] Bai, Z. D. and Yin, Y. Q. (1993). Limit of the smallest eigenvalue of a large-dimensional sample covariance matrix. Ann. Probab. 21 1275-1294. · Zbl 0779.60026
[6] Ferré, L. (1995). Selection of components in principal component analysis: A comparison of methods. Comput. Statist. Data Anal. 19 669-682.
[7] Fujikoshi, Y. and Sakurai, T. (2016a). Some properties of estimation criteria for dimensionality in principal component analysis. Amer. J. Math. Management Sci. 35 133-142.
[8] Fujikoshi, Y. and Sakurai, T. (2016b). High-dimensional consistency of rank estimation criteria in multivariate linear model. J. Multivariate Anal. 149 199-212. · Zbl 1341.62119
[9] Fujikoshi, Y., Sakurai, T. and Yanagihara, H. (2014). Consistency of high-dimensional AIC-type and \(C_{p}\)-type criteria in multivariate linear regression. J. Multivariate Anal. 123 184-200. · Zbl 1360.62265
[10] Fujikoshi, Y., Ulyanov, V. V. and Shimizu, R. (2010). Multivariate Statistics: High-Dimensional and Large-Sample Approximations. Wiley, Hoboken, NJ. · Zbl 1304.62016
[11] Fujikoshi, Y., Yamada, T., Watanabe, D. and Sugiyama, T. (2007). Asymptotic distribution of the LR statistic for equality of the smallest eigenvalues in high-dimensional principal component analysis. J. Multivariate Anal. 98 2002-2008. · Zbl 1133.62037
[12] Gunderson, B. K. and Muirhead, R. J. (1997). On estimating the dimensionality in canonical correlation analysis. J. Multivariate Anal. 62 121-136. · Zbl 0874.62066
[13] Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29 295-327. · Zbl 1016.62078
[14] Johnstone, I. M. and Lu, A. Y. (2009). On consistency and sparsity for principal components analysis in high dimensions. J. Amer. Statist. Assoc. 104 682-693. · Zbl 1388.62174
[15] Jolliffe, I. T. (2002). Principal Component Analysis, 2nd ed. Springer, New York. · Zbl 1011.62064
[16] Jolliffe, I. T., Trendafilov, N. T. and Uddin, M. (2003). A modified principal component technique based on the LASSO. J. Comput. Graph. Statist. 12 531-547.
[17] Kim, Y., Kwon, S. and Choi, H. (2012). Consistent model selection criteria on high dimensions. J. Mach. Learn. Res. 13 1037-1057. · Zbl 1283.62143
[18] Nishii, R. (1984). Asymptotic properties of criteria for selection of variables in multiple regression. Ann. Statist. 12 758-765. · Zbl 0544.62063
[19] Nishii, R., Bai, Z. D. and Krishnaiah, P. R. (1988). Strong consistency of the information criterion for model selection in multivariate analysis. Hiroshima Math. J. 18 451-462. · Zbl 0678.62064
[20] Paul, D. (2007). Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statist. Sinica 17 1617-1642. · Zbl 1134.62029
[21] Rao, C. R. and Rao, M. B. (1998). Matrix Algebra and Its Applications to Statistics and Econometrics. World Scientific, River Edge, NJ. · Zbl 0915.15001
[22] Schott, J. R. (2006). A high-dimensional test for the equality of the smallest eigenvalues of a covariance matrix. J. Multivariate Anal. 97 827-843. · Zbl 1086.62072
[23] Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6 461-464. · Zbl 0379.62005
[24] Shao, J. (1997). An asymptotic theory for linear model selection. Statist. Sinica 7 221-264. · Zbl 1003.62527
[25] Shibata, R. (1976). Selection of the order of an autoregressive model by Akaike’s information criterion. Biometrika 63 117-126. · Zbl 0358.62048
[26] Silverstein, J. W. (1995). Strong convergence of the empirical distribution of eigenvalues of large-dimensional random matrices. J. Multivariate Anal. 55 331-339. · Zbl 0851.62015
[27] Yanagihara, H., Wakaki, H. and Fujikoshi, Y. (2015). A consistency property of the AIC for multivariate linear models when the dimension and the sample size are large. Electron. J. Stat. 9 869-897. · Zbl 1328.62455
[28] Yang, Y. (2005). Can the strengths of AIC and BIC be shared? A conflict between model identification and regression estimation. Biometrika 92 937-950. · Zbl 1151.62301
[29] Zhao, L. C., Krishnaiah, P. R. and Bai, Z. D. (1986). On detection of the number of signals in presence of white noise. J. Multivariate Anal. 20 1-25. · Zbl 0617.62055
[30] Zou, H., Hastie, T. and Tibshirani, R. (2006). Sparse principal component analysis. J. Comput. Graph. Statist. 15 265-286.