Nonconvex robust low-rank matrix recovery.

*(English)*Zbl 07175265Summary: In this paper, we study the problem of recovering a low-rank matrix from a number of random linear measurements that are corrupted by outliers taking arbitrary values. We consider a nonsmooth nonconvex formulation of the problem, in which we explicitly enforce the low-rank property of the solution by using a factored representation of the matrix variable and employ an \(\ell_1\)-loss function to robustify the solution against outliers. We show that even when a constant fraction (which can be up to almost half) of the measurements are arbitrarily corrupted, as long as certain measurement operators arising from the measurement model satisfy the so-called \(\ell_1/\ell_2\)-restricted isometry property, the ground-truth matrix can be exactly recovered from any global minimum of the resulting optimization problem. Furthermore, we show that the objective function of the optimization problem is sharp and weakly convex. Consequently, a subgradient method (SubGM) with geometrically diminishing step sizes will converge linearly to the ground-truth matrix when suitably initialized. We demonstrate the efficacy of the SubGM for the nonconvex robust low-rank matrix recovery problem with various numerical experiments.

Reviewer: Reviewer (Berlin)

##### MSC:

65K10 | Numerical optimization and variational techniques |

90C26 | Nonconvex programming, global optimization |

68Q25 | Analysis of algorithms and problem complexity |

68W40 | Analysis of algorithms |

62B10 | Statistical aspects of information-theoretic topics |

##### Software:

SDPLR
Full Text:
DOI

##### References:

[1] | S. Aaronson, The learnability of quantum states, Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. , 463 (2007), pp. 3089-3114. · Zbl 1140.81317 |

[2] | Y. Bai, Q. Jiang, and J. Sun, Subgradient descent learns orthogonal dictionaries, in Proceedings of the International Conference on Learning Representations (ICLR), 2019. |

[3] | D. P. Bertsekas, Incremental gradient, subgradient, and proximal methods for convex optimization, in Optimization for Machine Learning, S. Sra, S. Nowozin, and S. J. Wright, eds., MIT Press, Cambridge, MA, 2012, pp. 85-119. |

[4] | S. Bhojanapalli, B. Neyshabur, and N. Srebro, Global optimality of local search for low rank matrix recovery, in Advances in Neural Information Processing Systems, Vol. 29, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, eds., MIT Press, Cambridge, MA, 2016, pp. 3873-3881. |

[5] | S. Burer and R. D. Monteiro, A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization, Math. Program., 95 (2003), pp. 329-357. · Zbl 1030.90077 |

[6] | S. Burer and R. D. Monteiro, Local minima and convergence in low-rank semidefinite programming, Math. Program., 103 (2005), pp. 427-444. · Zbl 1099.90040 |

[7] | J. V. Burke and M. C. Ferris, Weak sharp minima in mathematical programming, SIAM J. Control Optim., 31 (1993), pp. 1340-1359, https://doi.org/10.1137/0331063. · Zbl 0791.90040 |

[8] | E. J. Candès, X. Li, Y. Ma, and J. Wright, Robust principal component analysis?, J. ACM, 58 (2011), 11. · Zbl 1327.62369 |

[9] | E. J. Candès and Y. Plan, Tight oracle inequalities for low-rank matrix recovery from a minimal number of noisy random measurements, IEEE Trans. Inform. Theory, 57 (2011), pp. 2342-2359. · Zbl 1366.90160 |

[10] | E. J. Candès and B. Recht, Exact matrix completion via convex optimization, Found. Comput. Math., 9 (2009), pp. 717-772. · Zbl 1219.90124 |

[11] | Y. Chen, Y. Chi, and A. J. Goldsmith, Exact and stable covariance estimation from quadratic sampling via convex programming, IEEE Trans. Inform. Theory, 61 (2015), pp. 4034-4059. · Zbl 1359.62181 |

[12] | Y. Chi, Y. M. Lu, and Y. Chen, Nonconvex optimization meets low-rank matrix factorization: An overview, IEEE Trans. Signal Process., 67 (2019), pp. 5239-5269. · Zbl 07123429 |

[13] | M. A. Davenport and J. Romberg, An overview of low-rank matrix recovery from incomplete observations, IEEE J. Sel. Topics Signal Process., 10 (2016), pp. 608-622. |

[14] | D. Davis, D. Drusvyatskiy, K. J. MacPhee, and C. Paquette, Subgradient methods for sharp weakly convex functions, J. Optim. Theory Appl., 179 (2018), pp. 962-982. · Zbl 06985502 |

[15] | D. Davis, D. Drusvyatskiy, and C. Paquette, The nonsmooth landscape of phase retrieval, IMA J. Numer. Anal., (2020), https://doi.org/10.1093/imanum/drz031. |

[16] | F. De La Torre and M. J. Black, A framework for robust subspace learning, Int. J. Comput. Vis., 54 (2003), pp. 117-142. · Zbl 1076.68058 |

[17] | J. C. Duchi and F. Ruan, Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval, Inf. Inference, 8 (2019), pp. 471-529, https://doi.org/10.1093/imaiai/iay015. |

[18] | M. Fazel, H. Hindi, and S. Boyd, Rank minimization and applications in system theory, in Proceedings of the 2004 IEEE American Control Conference, Vol. 4, 2004, pp. 3273-3278. |

[19] | R. Ge, J. D. Lee, and T. Ma, Matrix completion has no spurious local minima, in Advances in Neural Information Processing Systems, Vol. 29, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, eds., MIT Press, Cambridge, MA, 2016, pp. 2973-2981. |

[20] | J.-L. Goffin, On convergence rates of subgradient optimization methods, Math. Program., 13 (1977), pp. 329-347. · Zbl 0368.90119 |

[21] | D. Gross, Recovering low-rank matrices from few coefficients in any basis, IEEE Trans. Inform. Theory, 57 (2011), pp. 1548-1566. · Zbl 1366.94103 |

[22] | Q. Gu, Z. W. Wang, and H. Liu, Low-rank and sparse structure pursuit via alternating minimization, in Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS 2016), 2016, pp. 600-609. |

[23] | B. Haeffele, E. Young, and R. Vidal, Structured low-rank matrix factorization: Optimality, algorithm, and applications to image processing, in Proceedings of the 31st International Conference on Machine Learning (ICML 2014), 2014, pp. 2007-2015. |

[24] | C. Josz, Y. Ouyang, R. Zhang, J. Lavaei, and S. Sojoudi, A theory on the absence of spurious solutions for nonconvex and nonsmooth optimization, in Advances in Neural Information Processing Systems, Vol. 31, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, eds., MIT Press, Cambridge, MA, 2018, pp. 2441-2449. |

[25] | Q. Ke and T. Kanade, Robust \(L_1\) norm factorization in the presence of outliers and missing data by alternative convex programming, in Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), Vol. 1, 2005, pp. 739-746. |

[26] | L. Li, W. Huang, I. Y.-H. Gu, and Q. Tian, Statistical modeling of complex backgrounds for foreground object detection, IEEE Trans. Image Process., 13 (2004), pp. 1459-1472. |

[27] | Q. Li, Z. Zhu, and G. Tang, The non-convex geometry of low-rank matrix optimization, Inf. Inference, 8 (2019), pp. 51-96, https://doi.org/10.1093/imaiai/iay003. · Zbl 07127822 |

[28] | X. Li, J. Lu, R. Arora, J. Haupt, H. Liu, Z. Wang, and T. Zhao, Symmetry, saddle points, and global optimization landscape of nonconvex matrix factorization, IEEE Trans. Inform. Theory, 65 (2019), pp. 3489-3514. · Zbl 1432.90123 |

[29] | X. Li, Z. Zhu, A. M.-C. So, and R. Vidal, Nonconvex Robust Low-Rank Matrix Recovery, companion technical report of this paper; available online from https://arxiv.org/abs/1809.09237, 2018. |

[30] | Y. Li, Y. Chi, H. Zhang, and Y. Liang, Nonconvex low-rank matrix recovery with arbitrary outliers via median-truncated gradient descent, Inf. Inference, (2019), https://doi.org/10.1093/imaiai/iaz009. |

[31] | Y. Li, Y. Sun, and Y. Chi, Low-rank positive semidefinite matrix recovery from corrupted rank-one measurements, IEEE Trans. Signal Process., 65 (2017), pp. 397-408. · Zbl 1414.94385 |

[32] | A. Nedić and D. Bertsekas, Convergence rate of incremental subgradient algorithms, in Stochastic Optimization: Algorithms and Applications, Appl. Optim. 54, S. Uryasev and P. M. Pardalos, eds., Kluwer Academic Publishers, Dordrecht, The Netherlands, 2001. · Zbl 0984.90033 |

[33] | P. Netrapalli, U. N. Niranjan, S. Sanghavi, A. Anandkumar, and P. Jain, Non-convex robust PCA, in Advances in Neural Information Processing Systems, Vol. 27, Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, eds., MIT Press, Cambridge, MA, 2014, pp. 1107-1115. |

[34] | J. Nocedal and S. Wright, Numerical Optimization, Springer, New York, 2006. · Zbl 1104.65059 |

[35] | D. Park, A. Kyrillidis, C. Caramanis, and S. Sanghavi, Non-square matrix sensing without spurious local minima via the Burer-Monteiro approach, in Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017), 2017, pp. 65-74. |

[36] | B. T. Polyak, Minimization of unsmooth functions, USSR Comput. Math. Math. Phys, 9 (1969), pp. 14-29. · Zbl 0229.65056 |

[37] | R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, 2nd ed., Grundlehren Math. Wiss. 317, Springer-Verlag, Berlin, Heidelberg, 2004. · Zbl 0888.49001 |

[38] | N. Z. Shor, Minimization Methods for Non-differentiable Functions, Springer Ser. Comput. Math. 3, Springer-Verlag, Berlin, Heidelberg, 1985. · Zbl 0561.90058 |

[39] | N. Z. Shor and M. B. Shchepakin, Algorithms for the solution of the two-stage problem in stochastic programming, Kibernetika, 4 (1968), pp. 56-58. · Zbl 0193.19203 |

[40] | N. Srebro, J. Rennie, and T. S. Jaakkola, Maximum-margin matrix factorization, in Advances in Neural Information Processing Systems, Vol. 17, L. K. Saul, Y. Weiss, and L. Bottou, eds., MIT Press, Cambridge, MA, 2004, pp. 1329-1336. |

[41] | R. Sun and Z.-Q. Luo, Guaranteed matrix completion via non-convex factorization, IEEE Trans. Inform. Theory, 62 (2016), pp. 6535-6579. · Zbl 1359.94179 |

[42] | S. Tu, R. Boczar, M. Simchowitz, M. Soltanolkotabi, and B. Recht, Low-rank solutions of linear matrix equations via Procrustes flow, in Proceedings of the 33rd International Conference on Machine Learning (ICML 2016), 2016, pp. 964-973. |

[43] | R. Vershynin, Introduction to the non-asymptotic analysis of random matrices, in Compressed Sensing: Theory and Applications, Y. C. Eldar and G. Kutyniok, eds., Cambridge University Press, New York, 2012, pp. 210-268. |

[44] | J.-P. Vial, Strong and weak convexity of sets and functions, Math. Oper. Res., 8 (1983), pp. 231-259. · Zbl 0526.90077 |

[45] | X. Yi, D. Park, Y. Chen, and C. Caramanis, Fast algorithms for robust PCA via gradient descent, in Advances in Neural Information Processing Systems, Vol. 29, D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, eds., MIT Press, Cambridge, MA, 2016, pp. 4152-4160. |

[46] | M.-C. Yue and A. M.-C. So, A perturbation inequality for concave functions of singular values and its applications in low-rank matrix recovery, Appl. Comput. Harmon. Anal., 40 (2016), pp. 396-416. · Zbl 1335.15031 |

[47] | M.-C. Yue, Z. Zhou, and A. M.-C. So, On the quadratic convergence of the cubic regularization method under a local error bound condition, SIAM J. Optim., 29 (2019), pp. 904-932, https://doi.org/10.1137/18M1167498. · Zbl 1411.90284 |

[48] | M. Zhang, Z.-H. Huang, and Y. Zhang, Restricted \(p\)-isometry properties of nonconvex matrix recovery, IEEE Trans. Inform. Theory, 59 (2013), pp. 4316-4323. · Zbl 1364.94179 |

[49] | X. Zhang, L. Wang, and Q. Gu, A unified framework for nonconvex low-rank plus sparse matrix recovery, in Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018), 2018, pp. 1097-1107. |

[50] | Q. Zheng and J. Lafferty, A convergent gradient descent algorithm for rank minimization and semidefinite programming from random linear measurements, in Advances in Neural Information Processing Systems, Vol. 28, C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, eds., MIT Press, Cambridge, MA, 2015, pp. 109-117. |

[51] | Z. Zhu, Q. Li, G. Tang, and M. B. Wakin, The Global Optimization Geometry of Low-Rank Matrix Optimization, preprint, https://arxiv.org/abs/1703.01256, 2017. · Zbl 1414.90297 |

[52] | Z. Zhu, Q. Li, G. Tang, and M. B. Wakin, Global optimality in low-rank matrix optimization, IEEE Trans. Signal Process., 66 (2018), pp. 3614-3628. · Zbl 1414.90297 |

[53] | Z. Zhu, A. M.-C. So, and Y. Ye, Fast and near-optimal matrix completion via randomized basis pursuit, in Fifth International Congress of Chinese Mathematicians, L. Ji, Y. S. Poon, L. Yang, and S.-T. Yau, eds., AMS/IP Stud. Adv. Math. 51, Part 2, American Mathematical Society/International Press, Providence, RI, 2012, pp. 859-882. · Zbl 1269.15031 |

[54] | Z. Zhu, Y. Wang, D. Robinson, D. Naiman, R. Vidal, and M. Tsakiris, Dual principal component pursuit: Improved analysis and efficient algorithms, in Advances in Neural Information Processing Systems, Vol. 31, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, eds., MIT Press, Cambridge, MA, 2018, pp. 2171-2181. |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.