×

Adaptive shrinkage of singular values. (English) Zbl 1505.62207

Summary: To recover a low-rank structure from a noisy matrix, truncated singular value decomposition has been extensively used and studied. Recent studies suggested that the signal can be better estimated by shrinking the singular values as well. We pursue this line of research and propose a new estimator offering a continuum of thresholding and shrinking functions. To avoid an unstable and costly cross-validation search, we propose new rules to select two thresholding and shrinking parameters from the data. In particular we propose a generalized Stein unbiased risk estimation criterion that does not require knowledge of the variance of the noise and that is computationally fast. A Monte Carlo simulation reveals that our estimator outperforms the tested methods in terms of mean squared error on both low-rank and general signal matrices across different signal-to-noise ratio regimes. In addition, it accurately estimates the rank of the signal when it is detectable.

MSC:

62-08 Computational methods for problems pertaining to statistics
62H25 Factor analysis and principal components; correspondence analysis
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Baik, J., Silverstein, J.: Eigenvalues of large sample covariance matrices of spiked population models. J. Multivar. Anal. 97(6), 1382-1408 (2006) · Zbl 1220.15011 · doi:10.1016/j.jmva.2005.08.003
[2] Cai, J., Candes, E.J., Shen, Z.: A singular value thresholding algorithm for matrix completion. SIAM J. Optimiz. 20(4), 1956-1982 (2010) · Zbl 1201.90155 · doi:10.1137/080738970
[3] Candes, E.J., Li, X., Ma, Y., Wright, J.: Robust principle component analysis? J. ACM. 58, 1-37 (2009) · Zbl 1327.62369 · doi:10.1145/1970392.1970395
[4] Candes, E.J., Sing-Long, C.A., Trzasko, J.D.: Unbiased risk estimates for singular value thresholding and spectral estimators. IEEE Trans. Signal Process. 61(19), 4643-4657 (2013) · Zbl 1393.94187 · doi:10.1109/TSP.2013.2270464
[5] Caussinus, H.; Leeuw, J. (ed.), Models and uses of principal component analysis (with discussion) (1986), Leiden, The Netherlands
[6] Chatterjee, S.: Matrix estimation by universal singular value thresholding. arXiv:1212.1247 (2013) · Zbl 1308.62038
[7] Chen, K., Dong, H., Chan, K.-S.: Reduced rank regression via adaptive nuclear norm penalization. Biometrika 100(4), 901-920 (2013) · Zbl 1279.62115 · doi:10.1093/biomet/ast036
[8] Craven, P., Wahba, G.: Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik 31, 377-403 (1979) · Zbl 0377.65007 · doi:10.1007/BF01404567
[9] de Leeuw, J.D., Mooijaart, A., Leeden, M.: Fixed Factor Score Models with Linear Restrictions. University of Leiden, Leiden, The Netherlands (1985)
[10] Donoho, D.L., Johnstone, I.M.: Ideal spatial adaptation via wavelet shrinkage. Biometrika 81, 425-455 (1994) · Zbl 0815.62019 · doi:10.1093/biomet/81.3.425
[11] Donoho, D.L., Gavish, M.: The optimal hard threshold for singular values is \[4/\sqrt{3}\sqrt{3} \]. IEEE Trans. Inf. Theory 60(8), 5040-5053 (2014a) · Zbl 1360.94071 · doi:10.1109/TIT.2014.2323359
[12] Donoho, D.L., Gavish, M.: Minimax risk of matrix denoising by singular value thresholding. Ann. Statist. 42(6), 2413-2440 (2014b) · Zbl 1310.62014
[13] Eckart, C., Young, G.: The approximation of one matrix by another of lower rank. Psychometrika 1(3), 211-218 (1936) · JFM 62.1075.02 · doi:10.1007/BF02288367
[14] Gaiffas, S., Lecue, G.: Weighted algorithms for compressed sensing and matrix completion. arXiv:1107.1638 (2011) · Zbl 1365.62137
[15] Gavish, M., Donoho, D.L.: Optimal shrinkage of singular values. arXiv:1405.7511v2 (2014) · Zbl 1366.94100
[16] Hoff, P.D.: Equivariant and scale-free tucker decomposition models. arXiv:1312.6397 (2013) · Zbl 1359.62221
[17] Hoff, P.D.: Model averaging and dimension selection for the singular value decomposition. J. Am. Statist. Assoc. 102(478), 674-685 (2007) · Zbl 1172.62318 · doi:10.1198/016214506000001310
[18] Huber, P.J.: Robust Statistics. Wiley, New York (1981) · Zbl 0536.62025 · doi:10.1002/0471725250
[19] Ilin, A., Raiko, T.: Practical approaches to principal component analysis in the presence of missing values. J. Mach. Learn. Res. 11, 1957-2000 (2010) · Zbl 1242.62047
[20] Johnstone, I.: On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29(2), 295-327 (2001) · Zbl 1016.62078 · doi:10.1214/aos/1009210544
[21] Josse, J., Husson, F.: Selecting the number of components in PCA using cross-validation approximations. Computat. Statistist. Data Anal. 56(6), 1869-1879 (2012) · Zbl 1243.62082 · doi:10.1016/j.csda.2011.11.012
[22] Josse, J., Husson, F.: Handling missing values in exploratory multivariate data analysis methods. J. Société Française Statistique 153(2), 1-21 (2012) · Zbl 1316.62006
[23] Lê, S., Josse, J., Husson, F.: Factominer: an R package for multivariate analysis. J. Statist. Softw. 25(1), 1-18 (2008). 3 · doi:10.18637/jss.v025.i01
[24] Ledoit, O., Wolf, M.: Nonlinear shrinkage estimation of large-dimensional covariance matrices. Ann. Statist. 40, 1024-1060 (2012) · Zbl 1274.62371 · doi:10.1214/12-AOS989
[25] Mandel, J.: The partitioning of interaction in analysis of variance. J. Res. Natl. Bur. Stand. B 73, 309-328 (1969) · Zbl 0195.17404 · doi:10.6028/jres.073B.031
[26] Mazumder, R., Hastie, T., Tibshirani, R.: Spectral regularization algorithms for learning large incomplete matrices. J. Mach. Learn. Res. 99, 2287-2322 (2010) · Zbl 1242.68237
[27] Owen, A.B., Perry, P.O.: Bi-cross-validation of the svd and the nonnegative matrix factorization. Ann. Appl. Statist. 3(2), 564-594 (2009) · Zbl 1166.62047 · doi:10.1214/08-AOAS227
[28] Paul, D.: Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statistica Sinica 17(4), 1617 (2007) · Zbl 1134.62029
[29] Rao, N.R.: Optshrink-low-rank signal matrix denoising via optimal, data-driven singular value shrinkage. arXiv:1306.6042 (2013) · Zbl 1242.62047
[30] Sardy, S.: Blockwise and coordinatewise thresholding to combine tests of different natures in modern anova. arXiv:1302.6073 (2013)
[31] Sardy, S., Bruce, A.G., Tseng, P.: Block coordinate relaxation methods for nonparametric wavelet denoising. J. Computat. Graphic. Statist. 9, 361-379 (2000)
[32] Sardy, S., Tseng, P., Bruce, A.G.: Robust wavelet denoising. IEEE Trans. Signal Process. 49, 1146-1152 (2001) · doi:10.1109/78.923297
[33] Sardy, S.: Smooth blockwise iterative thresholding: a smooth fixed point estimator based on the likelihood’s block gradient. J. Am. Statist. Assoc. 107, 800-813 (2012) · Zbl 1328.62218 · doi:10.1080/01621459.2012.664527
[34] Shabalin, A.A., Nobel, B.: Reconstruction of a low-rank matrix in the presence of Gaussian noise. J. Multivar. Anal. 118, 67-76 (2013) · Zbl 1280.15022 · doi:10.1016/j.jmva.2013.03.005
[35] Stein, C.M.: Estimation of the mean of a multivariate normal distribution. Ann. Statist. 9, 1135-1151 (1981) · Zbl 0476.62035 · doi:10.1214/aos/1176345632
[36] Talebi, H., Milanfar, P.: Global image denoising. IEEE Trans. Image. Process. 23(2), 755-768 (2014) · Zbl 1374.94365 · doi:10.1109/TIP.2013.2293425
[37] Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Statist. Soc. B Methodol. 58, 267-288 (1996) · Zbl 0850.62538
[38] Verbanck, M., Josse, J., Husson, F.: Regularized PCA to denoise and visualize data. Statist. Comput. 25(2), 471-486 (2015) · Zbl 1331.62298
[39] Zanella, A., Chiani, M., Win, M.Z.: On the marginal distribution of the eigenvalues of Wishart matrices. IEEE Trans. Commun. 57(4), 1050-1060 (2009)
[40] Zhang, C.H., Huang, J.: The sparsity and biais of the lasso selection in high-dimensional linear regression. Ann. Statist. 36(4), 1567-1594 (2008) · Zbl 1142.62044 · doi:10.1214/07-AOS520
[41] Zou, H.: The adaptive LASSO and its oracle properties. J. Am. Statist. Assoc. 101, 1418-1429 (2006) · Zbl 1171.62326 · doi:10.1198/016214506000000735
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.