×

Matrix completion with covariate information. (English) Zbl 1478.62122

Summary: This article investigates the problem of matrix completion from the corrupted data, when the additional covariates are available. Despite being seldomly considered in the matrix completion literature, these covariates often provide valuable information for completing the unobserved entries of the high-dimensional target matrix \(\mathbf{A}_{0}\). Given a covariate matrix \(\mathbf{X}\) with its rows representing the row covariates of \(\mathbf{A}_{0}\), we consider a column-space-decomposition model \(\mathbf{A}_{0} = \mathbf{X}\boldsymbol{\beta}_{0} + \mathbf{B}_{0}\), where \(\boldsymbol{\beta}_{0}\) is a coefficient matrix and \(\mathbf{B}_{0}\) is a low-rank matrix orthogonal to \(\mathbf{X}\) in terms of column space. This model facilitates a clear separation between the interpretable covariate effects \((\mathbf{X}\boldsymbol{\beta}_{0})\) and the flexible hidden factor effects \((\mathbf{B}_{0})\). Besides, our work allows the probabilities of observation to depend on the covariate matrix, and hence a missing-at-random mechanism is permitted. We propose a novel penalized estimator for \(\mathbf{A}_{0}\) by utilizing both Frobenius-norm and nuclear-norm regularizations with an efficient and scalable algorithm. Asymptotic convergence rates of the proposed estimators are studied. The empirical performance of the proposed methodology is illustrated via both numerical experiments and a real data application.

MSC:

62H12 Estimation in multivariate analysis
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Abernethy, J.; Bach, F.; Evgeniou, T.; Vert, J.-P., A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization, Journal of Machine Learning Research, 10, 803-826 (2009) · Zbl 1235.68122
[2] Bi, X.; Qu, A.; Wang, J.; Shen, X., A Group-Specific Recommender System,, Journal of the American Statistical Association, 112, 1344-1353 (2016)
[3] Cai, T.; Cai, T. T.; Zhang, A., Structured Matrix Completion with Applications to Genomic Data Integration,, Journal of the American Statistical Association, 111, 621-633 (2016)
[4] Cai, T. T.; Zhou, W.-X., Matrix Completion via Max-Norm Constrained Optimization, Electronic Journal of Statistics, 10, 1493-1525 (2016) · Zbl 1342.62091
[5] Candès, E. J.; Plan, Y., Matrix Completion with Noise, Proceedings of the IEEE, 98, 925-936 (2010)
[6] Candès, E. J.; Recht, B., Exact Matrix Completion via Convex Optimization, Foundations of Computational Mathematics, 9, 717-772 (2009) · Zbl 1219.90124
[7] Chiang, K.-Y., Hsieh, C.-J., and Dhillon, I. S. (2015), “Matrix Completion with Noisy Side Information,” Advances in Neural Information Processing Systems, 28, 3447-3455.
[8] Feuerverger, A.; He, Y.; Khatri, S., Statistical Significance of the Netflix Challenge, Statistical Science, 27, 202-231 (2012) · Zbl 1330.62090
[9] Freedman, D. A., Statistical Models: Theory and Practice (2009), New York: Cambridge University Press, New York
[10] Friedman, J.; Hastie, T.; Tibshirani, R., The Elements of Statistical Learning (2013), New York: Springer, New York
[11] Gross, D., Recovering Low-Rank Matrices from Few Coefficients in any Basis, IEEE Transactions on Information Theory, 57, 1548-1566 (2011) · Zbl 1366.94103
[12] Halko, N.; Martinsson, P.-G.; Tropp, J. A., Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions, SIAM Review, 53, 217-288 (2011) · Zbl 1269.65043
[13] Harper, F. M., and Konstan, J. A. (2016), “The MovieLens Datasets: History and Context,” ACM Transactions on Interactive Intelligent Systems (TiiS), 5, 19:1-19:19.
[14] Keshavan, R. H., Montanari, A., and Oh, S. (2009), “Matrix Completion from Noisy Entries,” Advances in Neural Information Processing Systems, 22, 952-960.
[15] Klopp, O., Noisy Low-Rank Matrix Completion with General Sampling Distribution, Bernoulli, 20, 282-303 (2014) · Zbl 1400.62115
[16] Koltchinskii, V.; Lounici, K.; Tsybakov, A. B., Nuclear-Norm Penalization and Optimal Rates for Noisy Low-Rank Matrix Completion, The Annals of Statistics, 39, 2302-2329 (2011) · Zbl 1231.62097
[17] Little, R. J.; Rubin, D. B., Statistical Analysis With Missing Data (2014), Hoboken NJ: Wiley, Hoboken NJ
[18] Ma, S.; Goldfarb, D.; Chen, L., Fixed Point and Bregman Iterative Methods for Matrix Rank Minimization, Mathematical Programming, 128, 321-353 (2011) · Zbl 1221.65146
[19] Mardia, K. V.; Kent, J. T.; Bibby, J. M., Multivariate Analysis (1980), London: Academic Press, London · Zbl 0432.62029
[20] Mazumder, R.; Hastie, T.; Tibshirani, R., Spectral Regularization Algorithms for Learning Large Incomplete Matrices, Journal of Machine Learning Research, 11, 2287-2322 (2010) · Zbl 1242.68237
[21] Natarajan, N.; Dhillon, I. S., Inductive Matrix Completion for Predicting Gene-Disease Associations, Bioinformatics, 30, i60-i68 (2014)
[22] Negahban, S.; Wainwright, M. J., Restricted Strong Convexity and Weighted Matrix Completion: Optimal Bounds with Noise, Journal of Machine Learning Research, 13, 1665-1697 (2012) · Zbl 1436.62204
[23] Oh, T.-H., Matsushita, Y., Tai, Y.-W., and Kweon, I. S. (2015), “Fast Randomized Singular Value Thresholding for Nuclear Norm Minimization,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4484-4493.
[24] Recht, B., A Simpler Approach to Matrix Completion, Journal of Machine Learning Research, 12, 3413-3430 (2011) · Zbl 1280.68141
[25] Rohde, A.; Tsybakov, A. B., Estimation of High-Dimensional Low-Rank Matrices, The Annals of Statistics, 39, 887-930 (2011) · Zbl 1215.62056
[26] Srebro, N., and Salakhutdinov, R. R. (2010), “Collaborative Filtering in a Non-Uniform World: Learning with the Weighted Trace Norm,” Advances in Neural Information Processing Systems, 23, 2056-2064.
[27] Sun, T., and Zhang, C.-H. (2012), “Calibrated Elastic Regularization in Matrix Completion,” Advances in Neural Information Processing Systems, 25, 863-871.
[28] Sweeting, T., Uniform Asymptotic Normality of the Maximum Likelihood Estimator, The Annals of Statistics, 8, 1375-1381 (1980) · Zbl 0447.62041
[29] Troyanskaya, O.; Cantor, M.; Sherlock, G.; Brown, P.; Hastie, T.; Tibshirani, R.; Botstein, D.; Altman, R. B., Missing Value Estimation Methods for DNA Microarrays, Bioinformatics, 17, 520-525 (2001)
[30] Xu, M., Jin, R., and Zhou, Z.-H. (2013), “Speedup Matrix Completion with Side Information: Application to Multi-Label Learning,” Advances in Neural Information Processing Systems, 26, 2301-2309.
[31] Zhu, Y.; Shen, X.; Ye, C., Personalized Prediction and Sparsity Pursuit in Latent Factor Models, Journal of the American Statistical Association, 111, 241-252 (2016)
[32] Zou, H., and Hastie, T. (2005), “Regularization and Variable Selection via the Elastic Net,” Journal of the Royal Statistical Society, Series B, 67, 301-320. · Zbl 1069.62054
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.