×

Reconstruction of a low-rank matrix in the presence of Gaussian noise. (English) Zbl 1280.15022

Summary: This paper addresses the problem of reconstructing a low-rank signal matrix observed with additive Gaussian noise. We first establish that, under mild assumptions, one can restrict attention to orthogonally equivariant reconstruction methods, which act only on the singular values of the observed matrix and do not affect its singular vectors. Using recent results in random matrix theory, we then propose a new reconstruction method that aims to reverse the effect of the noise on the singular value decomposition of the signal matrix. In conjunction with the proposed reconstruction method we also introduce a Kolmogorov-Smirnov based estimator of the noise variance.{
} We show with an extensive simulation study that the proposed method outperforms oracle versions of both soft and hard thresholding methods, and closely matches the performance of the oracle orthogonally equivariant method.

MSC:

15B52 Random matrices (algebraic aspects)
60B20 Random matrices (probabilistic aspects)
15A83 Matrix completion problems
15A18 Eigenvalues, singular values, and eigenvectors
62H25 Factor analysis and principal components; correspondence analysis

Software:

SVDMAN; impute
PDF BibTeX XML Cite
Full Text: DOI arXiv Link

References:

[1] Alter, O.; Brown, P.; Botstein, D., Singular value decomposition for genome-wide expression data processing and modeling, Proceedings of the National Academy of Sciences, 97, 18, 10101, (2000)
[2] Baik, J.; Silverstein, J., Eigenvalues of large sample covariance matrices of spiked population models, Journal of Multivariate Analysis, 97, 6, 1382-1408, (2006) · Zbl 1220.15011
[3] F. Benaych-Georges, R. Nadakuditi, The singular values and vectors of low rank perturbations of large rectangular random matrices. arXiv:1103.2221 (preprint). · Zbl 1252.15039
[4] F. Bunea, Y. She, M. Wegkamp, Adaptive rank penalized estimators in multivariate regression. arXiv:1004.2995 (preprint). · Zbl 1216.62086
[5] Candès, E.; Recht, B., Exact matrix completion via convex optimization, Foundations of Computational Mathematics, 9, 6, 717-772, (2009) · Zbl 1219.90124
[6] Candès, E.; Romberg, J.; Tao, T., Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Transactions on Information Theory, 52, 2, 489-509, (2006) · Zbl 1231.94017
[7] Capitaine, M.; Donati-Martin, C.; Féral, D., The largest eigenvalue of finite rank deformation of large Wigner matrices: convergence and non-universality of the fluctuations, The Annals of Probability, 37, 1, 1-47, (2009) · Zbl 1163.15026
[8] Donoho, D., Compressed sensing, IEEE Transactions on Information Theory, 52, 4, 1289-1306, (2006) · Zbl 1288.94016
[9] Dozier, R.; Silverstein, J., On the empirical distribution of eigenvalues of large dimensional information-plus-noise-type matrices, Journal of Multivariate Analysis, 98, 4, 678-694, (2007) · Zbl 1115.60035
[10] El Karoui, N., Spectrum estimation for large dimensional covariance matrices using random matrix theory, The Annals of Statistics, 36, 6, 2757-2790, (2008) · Zbl 1168.62052
[11] Féral, D.; Péché, S., The largest eigenvalue of rank one deformation of large Wigner matrices, Communications in Mathematical Physics, 272, 1, 185-228, (2007) · Zbl 1136.82016
[12] Geman, S., A limit theorem for the norm of random matrices, The Annals of Probability, 8, 2, 252-261, (1980) · Zbl 0428.60039
[13] Györfi, L.; Vajda, I.; Van Der Meulen, E., Minimum Kolmogorov distance estimates of parameters and parametrized distributions, Metrika, 43, 1, 237-255, (1996) · Zbl 0855.62016
[14] Holter, N.; Mitra, M.; Maritan, A.; Cieplak, M.; Banavar, J.; Fedoroff, N., Fundamental patterns underlying gene expression profiles: simplicity from complexity, Proceedings of the National Academy of Sciences, 97, 15, 8409, (2000)
[15] Johnstone, I., On the distribution of the largest eigenvalue in principal components analysis, The Annals of Statistics, 29, 2, 295-327, (2001) · Zbl 1016.62078
[16] Konstantinides, K.; Natarajan, B.; Yovanof, G., Noise estimation and filtering using block-based singular value decomposition, IEEE Transactions on Image Processing, 6, 3, 479-483, (1997)
[17] Lee, S.; Zou, F.; Wright, F., Convergence and prediction of principal component scores in high-dimensional settings, Annals of Statistics, 38, 6, 3605, (2010) · Zbl 1204.62097
[18] Marčenko, V.; Pastur, L., Distribution of eigenvalues for some sets of random matrices, USSR Sbornik: Mathematics, 1, 4, 457-483, (1967) · Zbl 0162.22501
[19] Mirsky, L., Symmetric gauge functions and unitarily invariant norms, The Quarterly Journal of Mathematics, 11, 1, 50, (1960) · Zbl 0105.01101
[20] Nadakuditi, R.; Silverstein, J., Fundamental limit of sample eigenvalue based detection of signals in colored noise using relatively few samples, Signals, Systems and Computers, 686-690, (2007)
[21] Nadler, B., Finite sample approximation results for principal component analysis: a matrix perturbation approach, The Annals of Statistics, 36, 6, 2791-2817, (2008) · Zbl 1168.62058
[22] S. Negahban, M. Wainwright, Estimation of (near) low-rank matrices with noise and high-dimensional scaling. arXiv:0912.5100 (preprint). · Zbl 1216.62090
[23] Paul, D., Asymptotics of sample eigenstructure for a large dimensional spiked covariance model, Statistica Sinica, 17, 4, 1617, (2007) · Zbl 1134.62029
[24] S. Raychaudhuri, J. Stuart, R. Altman, Principal components analysis to summarize microarray experiments: application to sporulation time series, in: Pacific Symposium on Biocomputing, 2000, pp. 452-463.
[25] Rohde, A.; Tsybakov, A., Estimation of high-dimensional low-rank matrices, The Annals of Statistics, 39, 2, 887-930, (2011) · Zbl 1215.62056
[26] Stewart, G., Perturbation theory for the singular value decomposition, (SVD and Signal Processing, II: Algorithms, Analysis and Applications, (1991)), 99-109
[27] Troyanskaya, O.; Cantor, M.; Sherlock, G.; Brown, P.; Hastie, T.; Tibshirani, R.; Botstein, D.; Altman, R., Missing value estimation methods for DNA microarrays, Bioinformatics, 17, 6, 520, (2001)
[28] P. Vallet, P. Loubaton, X. Mestre, Improved subspace estimation for multivariate observations of high dimension: the deterministic signals case. arXiv:1002.3234 (preprint). · Zbl 1365.62123
[29] Wachter, K., The strong limits of random matrix spectra for sample matrices of independent elements, The Annals of Probability, 6, 1, 1-18, (1978) · Zbl 0374.60039
[30] Wall, M.; Dyck, P.; Brettin, T., SVDMAN—singular value decomposition analysis of microarray data, Bioinformatics, 17, 6, 566, (2001)
[31] Wedin, P., Perturbation bounds in connection with singular value decomposition, BIT Numerical Mathematics, 12, 1, 99-111, (1972) · Zbl 0239.15015
[32] Y. Wongsawat, K. Rao, S. Oraintara, Multichannel SVD-based image de-noising (2005) 5990-5993.
[33] Yata, K.; Aoshima, M., Effective PCA for high-dimension, low-sample-size data with noise reduction via geometric representations, Journal of Multivariate Analysis, 105, 1, 193-215, (2012) · Zbl 1236.62065
[34] Ziskind, I.; Wax, M., Maximum likelihood localization of multiple sources by alternating projection, Acoustics, Speech and Signal Processing, IEEE Transactions, 36, 10, 1553-1560, (1988) · Zbl 0850.93749
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.