×

Fixed-rank matrix factorizations and Riemannian low-rank optimization. (English) Zbl 1306.65107

Summary: Motivated by the problem of learning a linear regression model whose parameter is a large fixed-rank non-symmetric matrix, we consider the optimization of a smooth cost function defined on the set of fixed-rank matrices. We adopt the geometric framework of optimization on Riemannian quotient manifolds. We study the underlying geometries of several well-known fixed-rank matrix factorizations and then exploit the Riemannian quotient geometry of the search space in the design of a class of gradient descent and trust-region algorithms. The proposed algorithms generalize our previous results on fixed-rank symmetric positive semidefinite matrices, apply to a broad range of applications, scale to high-dimensional problems, and confer a geometric basis to recent contributions on the learning of fixed-rank non-symmetric matrices. We make connections with existing algorithms in the context of low-rank matrix completion and discuss the usefulness of the proposed framework. Numerical experiments suggest that the proposed algorithms compete with state-of-the-art algorithms and that manifold optimization offers an effective and versatile framework for the design of machine learning algorithms that learn a fixed-rank matrix.

MSC:

62-08 Computational methods for problems pertaining to statistics
15A83 Matrix completion problems
65K05 Numerical mathematical programming methods
65F20 Numerical solutions to overdetermined systems, pseudoinverses
68T05 Learning and adaptive systems in artificial intelligence
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Abernethy J, Bach F, Evgeniou T, Vert JP (2009) A new approach to collaborative filtering: operator estimation with spectral regularization. J Mach Learn Res 10:803-826 · Zbl 1235.68122
[2] Absil PA, Amodei L, Meyer G (2012) Two Newton methods on the manifold of fixed-rank matrices endowed with Riemannian quotient geometries. Tech. Rep. UCL-INMA-2012.05, U.C.Louvain · Zbl 1306.65015
[3] Absil PA, Mahony R, Sepulchre R (2008) Optimization algorithms on matrix manifolds. Princeton University Press, Princeton · Zbl 1147.65043
[4] Amit Y, Fink M, Srebro N, Ullman S (2007) Uncovering shared structures in multiclass classification. In: Ghahramani Z (ed) Proceedings of the 24th international conference on machine learning, pp 17-24
[5] Baker CG, Absil PA, Gallivan KA (2007) GenRTR: the Generic Riemannian Trust-region package. http://www.math.fsu.edu/cbaker/genrtr/
[6] Bartels RH, Stewart GW (1972) Solution of the matrix equation \[\text{ ax }+\text{ xb }=\text{ c }\] ax+xb=c [f4] (algorithm 432). Commun ACM 15:820-826 · Zbl 1372.65121
[7] Bhatia R (2007) Positive definite matrices. Princeton University Press, Princeton · Zbl 1125.15300
[8] Bleakley K, Yamanishi Y (2009) Supervised prediction of drug-target interactions using bipartite local models. Bioinformatics 25:2397-2403
[9] Bonnabel S, Sepulchre R (2009) Riemannian metric and geometric mean for positive semidefinite matrices of fixed rank. SIAM J Matrix Anal Appl 31:1055-1070 · Zbl 1220.47025
[10] Boumal N, Absil PA (2011) RTRMC: A Riemannian trust-region method for low-rank matrix completion. In: Shawe-Taylor J, Zemel R, Bartlett P, Pereira F, Weinberger K (eds) Neural information processing systems conference, NIPS, pp 406-414 · Zbl 1271.65083
[11] Boumal N, Absil PA (2012), Low-rank matrix completion via trust-regions on the Grassmann manifold. Tech. rep., UCL-INMA-2012.07 · Zbl 1312.90092
[12] Boumal N, Mishra B, Absil PA, Sepulchre R (2013), Manopt: a Matlab toolbox for optimization on manifolds. arXiv, preprint arXiv:13085200 [csMS] · Zbl 1319.90003
[13] Brand M (2006) Fast low-rank modifications of the thin singular value decomposition. Linear Algebra Appl 415:20-30 · Zbl 1088.65037
[14] Cai JF, Candès EJ, Shen Z (2010) A singular value thresholding algorithm for matrix completion. SIAM J Optim 20:1956-1982 · Zbl 1201.90155
[15] Cai D, He X, Han J (2007) Efficient kernel discriminant analysis via spectral regression. In: Proceedings of the IEEE international conference on data mining, ICDM, pp 427-432 · Zbl 1201.90155
[16] Candès EJ, Recht B (2008) Exact matrix completion via convex optimization. Found Comput Math 9:717-772 · Zbl 1219.90124
[17] Dai W, Milenkovic O, Kerman E (2011) Subspace evolution and transfer (SET) for low-rank matrix completion. IEEE Trans Signal Process 59:3120-3132 · Zbl 1392.94167
[18] Dai W, Kerman E, Milenkovic O (2012) A geometric approach to low-rank matrix completion. IEEE Trans Inf Theory 58:237-247 · Zbl 1365.15037
[19] Edelman A, Arias T, Smith S (1998) The geometry of algorithms with orthogonality constraints. SIAM J Matrix Anal Appl 20:303-353 · Zbl 0928.65050
[20] Evgeniou T, Micchelli C, Pontil M (2005) Learning multiple tasks with kernel methods. J Mach Learn Res 6:615-637 · Zbl 1222.68197
[21] Golub GH, Van Loan CF (1996) Matrix computations, 3rd edn. The Johns Hopkins University Press, 2715 North Charles Street, Baltimore, Maryland 21218-4319 · Zbl 1242.62069
[22] Gross D (2011) Recovering low-rank matrices from few coefficients in any basis. IEEE Trans Inf Theory 57:1548-1566 · Zbl 1366.94103
[23] Jain P, Meka R, Dhillon I (2010) Guaranteed rank minimization via singular value projection. In: Lafferty J, Williams CKI, Shawe-Taylor J, Zemel R, Culotta A (eds) Advances in neural information processing systems. NIPS 23, pp 937-945 · Zbl 1366.94103
[24] Jeffrey DJ (2010) LU factoring of non-invertible matrices. ACM Commun Comput Algebra 44:1-8 · Zbl 1322.68284
[25] Journée M (2009) Geometric algorithms for component analysis with a view to gene expression data analysis. PhD thesis, University of Liège, Liège, Belgium · JFM 28.0677.02
[26] Keshavan RH, Montanari A, Oh S (2010) Matrix completion from noisy entries. J Mach Learn Res 11:2057-2078 · Zbl 1242.62069
[27] Kulis B, Sustik M, Dhillon IS (2009) Low-rank kernel learning with Bregman matrix divergences. J Mach Learn Res 10:341-376 · Zbl 1235.68166
[28] Kulis B, Saenko K, Darrell T (2011) What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In: Proceedings of the IEEE conference on computer vision and pattern recognition, CVPR, pp 1785-1792 · Zbl 1392.94167
[29] Larsen R (1998) Lanczos bidiagonalization with partial reorthogonalization. Technical Report DAIMI PB-357, Department of Computer Science, Aarhus University · Zbl 1219.90124
[30] Lee JM (2003) Introduction to smooth manifolds, graduate texts in mathematics, vol 218, 2nd edn. Springer, New York
[31] Lee K, Bresler Y (2010) Admira: atomic decomposition for minimum rank approximation. IEEE Trans Inf Theory 56:4402-4416 · Zbl 1366.94112
[32] Mazumder R, Hastie T, Tibshirani R (2010) Spectral regularization algorithms for learning large incomplete matrices. J Mach Learn Res 11:2287-2322 · Zbl 1242.68237
[33] Meka R, Jain P, Dhillon IS (2009) Matrix completion from power-law distributed samples. In: Bengio Y, Schuurmans D, Lafferty J, Williams CKI, Culotta A (eds) Advances in neural information processing systems 22, NIPS, pp 1258-1266 · Zbl 1222.68197
[34] Meyer G (2011) Geometric optimization algorithms for linear regression on fixed-rank matrices. PhD thesis, University of Liège, Liège, Belgium
[35] Meyer G, Bonnabel S, Sepulchre R (2011b) Regression on fixed-rank positive semidefinite matrices: a Riemannian approach. J Mach Learn Res 11:593-625 · Zbl 1280.68185
[36] Meyer G, Bonnabel S, Sepulchre R (2011a) Linear regression under fixed-rank constraints: a Riemannian approach. In: Proceedings of the 28th international conference on machine learning, ICML, pp 545-552 · Zbl 1280.68185
[37] Mishra B, Adithya Apuroop K, Sepulchre R (2012) A Riemannian geometry for low-rank matrix completion. Tech. rep., arXiv:1211.1550 · Zbl 1147.65043
[38] Mishra B, Meyer G, Bach F, Sepulchre R (2011a) Low-rank optimization with trace norm penalty. Tech. rep., arXiv:1112.2318 · Zbl 1286.65078
[39] Mishra B, Meyer G, Sepulchre R (2011b) Low-rank optimization for distance matrix completion. In: Proceedings of the 50th IEEE conference on decision and control, Orlando (USA), pp 4455-4460
[40] Netflix (2006) The Netflix prize. http://www.netflixprize.com/
[41] Ngo TT, Saad Y (2012) Scaled gradients on Grassmann manifolds for matrix completion. In: Advances in neural information processing systems, NIPS, pp 1421-1429
[42] Nocedal J, Wright SJ (2006) Numerical optimization, 2nd edn. Springer, New York · Zbl 1104.65059
[43] Piziak R, Odell PL (1999) Full rank factorization of matrices. Math Mag 72:193-201 · Zbl 1006.15009
[44] Rennie J, Srebro N (2005), Fast maximum margin matrix factorization for collaborative prediction. In: Proceedings of the 22nd international conference on machine learning, pp 713-719
[45] Shalit U, Weinshall D, Chechik G (2010) Online learning in the manifold of low-rank matrices. In: Lafferty J, Williams CKI, Shawe-Taylor J, Zemel R, Culotta A (eds) Advances in neural information processing systems 23, pp 2128-2136 · Zbl 1283.68302
[46] Simonsson L, Eldén L (2010) Grassmann algorithms for low rank approximation of matrices with missing values. BIT Numer Math 50:173-191 · Zbl 1186.65050
[47] Vandereycken B (2013) Low-rank matrix completion by Riemannian optimization. SIAM J Optim 23:1214-1236 · Zbl 1277.15021
[48] Wen Z, Yin W, Zhang Y (2012) Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm. Math Program Comput 4:333-361 · Zbl 1271.65083
[49] Yamanishi Y, Araki M, Gutteridge A, Honda W, Kanehisa M (2008) Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24:i232
[50] Yuan M, Ekici A, Lu Z, Monteiro R (2007) Dimension reduction and coefficient estimation in multivariate linear regression. J R Stat Soc 69:329-346 · Zbl 07555355
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.