
The non-convex geometry of low-rank matrix optimization. (English) Zbl 07127822
Summary: This work considers two popular minimization problems: (i) the minimization of a general convex function \(f(X)\) over the positive semi-definite matrices, and (ii) the minimization of a general convex function \(f(X)\) regularized by the matrix nuclear norm \(\|X\|_*\) over general matrices. Despite the optimal statistical performance reported for them in the literature, these two optimization problems have a high computational complexity even when solved using tailored fast convex solvers. To develop faster and more scalable algorithms, we follow the proposal of Burer and Monteiro to factor the low-rank variable as \(X = UU^{\top } \) (for semi-definite matrices) or \(X=UV^{\top } \) (for general matrices) and to replace the nuclear norm \(\|X\|_*\) with \(\big(\|U\|_F^2+\|V\|_F^2\big)/2\). In spite of the non-convexity of the resulting factored formulations, we prove that each critical point either corresponds to the global optimum of the original convex problem or is a strict saddle, where the Hessian matrix has a strictly negative eigenvalue. This geometric structure of the factored formulations allows many local-search algorithms to find a global optimizer even with random initializations.
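
The replacement of \(\|X\|_*\) by \(\big(\|U\|_F^2+\|V\|_F^2\big)/2\) rests on a standard identity: the nuclear norm equals the minimum of that penalty over all factorizations \(X=UV^{\top}\), and the minimum is attained by the balanced factors built from the SVD. The following is a minimal numerical sketch of that identity, not code from the paper; the helper names (`nuclear_norm`, `factored_penalty`, `balanced_factors`), the matrix sizes, and the random test matrix are all assumptions chosen for illustration.

```python
import numpy as np

def nuclear_norm(X):
    # Nuclear norm = sum of singular values.
    return np.linalg.svd(X, compute_uv=False).sum()

def factored_penalty(U, V):
    # Surrogate (||U||_F^2 + ||V||_F^2) / 2 used in the factored formulation.
    return 0.5 * (np.linalg.norm(U, "fro") ** 2 + np.linalg.norm(V, "fro") ** 2)

def balanced_factors(X, r):
    # Balanced rank-r factors from the thin SVD X = A diag(s) B^T:
    # U = A sqrt(diag(s)), V = B sqrt(diag(s)).
    A, s, Bt = np.linalg.svd(X, full_matrices=False)
    root = np.sqrt(s[:r])
    return A[:, :r] * root, Bt[:r, :].T * root

# Hypothetical test data: a random 8 x 6 matrix of rank 3.
rng = np.random.default_rng(0)
r = 3
X = rng.standard_normal((8, r)) @ rng.standard_normal((r, 6))

# Balanced factorization: the penalty matches the nuclear norm.
U, V = balanced_factors(X, r)

# Any invertible G yields another factorization X = (U G)(V G^{-T})^T,
# whose penalty can only be larger or equal.
G = rng.standard_normal((r, r)) + 3.0 * np.eye(r)
U2, V2 = U @ G, V @ np.linalg.inv(G).T

print("nuclear norm        :", nuclear_norm(X))
print("balanced penalty    :", factored_penalty(U, V))
print("unbalanced penalty  :", factored_penalty(U2, V2))
```

The balanced penalty should agree with the nuclear norm up to rounding, while the unbalanced factorization of the same \(X\) should give a value that is at least as large. This is why minimizing over \((U,V)\) with the Frobenius penalty can stand in for nuclear-norm regularization of \(X\).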

94-XX Information and communication theory, circuits
62-XX Statistics