
Critical behavior and universality classes for an algorithmic phase transition in sparse reconstruction. (English) Zbl 1416.62395

Summary: Recovery of an \(N\)-dimensional, \(K\)-sparse solution \({\mathbf{x}}\) from an \(M\)-dimensional vector of measurements \({\mathbf{y}}\) in multivariate linear regression can be accomplished by minimizing a suitably penalized least-squares cost \(\Vert{\mathbf{y}}-{\mathbf{H}} {\mathbf{x}}\Vert_2^2+\lambda V({\mathbf{x}})\), where \({\mathbf{H}}\) is a known matrix and \(V({\mathbf{x}})\) is an algorithm-dependent sparsity-inducing penalty. For ‘random’ \({\mathbf{H}}\), in the limit \(\lambda \rightarrow 0\) and \(M,N,K\rightarrow \infty\) with \(\rho =K/N\) and \(\alpha =M/N\) held fixed, exact recovery is possible once \(\alpha\) exceeds a critical value \(\alpha_c = \alpha_c(\rho)\). Assuming \({\mathbf{x}}\) has i.i.d. entries, the critical curve exhibits some universality, in that its shape does not depend on the distribution of \({\mathbf{x}}\). However, the algorithmic phase transition occurring at \(\alpha =\alpha_c\) and the associated universality classes remain ill-understood from a statistical physics perspective, i.e. in terms of scaling exponents near the critical curve. In this article, we analyze the mean-field equations for two algorithms, Basis Pursuit (\(V({\mathbf{x}})=\Vert{\mathbf{x}}\Vert_{1}\)) and Elastic Net (\(V({\mathbf{x}})= \Vert{\mathbf{x}}\Vert_{1} + \tfrac{g}{2} \Vert{\mathbf{x}}\Vert_{2}^2\)), and show that they belong to different universality classes in the sense of scaling exponents: for small \(\lambda\) on the critical line, the mean squared error (MSE) of the recovered vector scales as \(\lambda^{4/3}\) and \(\lambda\), respectively. In the presence of additive noise, we find that for \(\alpha >\alpha_c\) the MSE is minimized at a non-zero value of \(\lambda\), whereas at \(\alpha =\alpha_c\) the MSE always increases with \(\lambda\).
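To make the setup concrete, the following is a minimal numerical sketch of the noiseless recovery problem described above. It is not the authors' code: the software section below lists CVXOPT and PDCO, whereas this sketch uses scikit-learn's Lasso and ElasticNet solvers, and the problem sizes, random seed, mixing parameter l1_ratio, and the small fixed penalty lam are all illustrative assumptions rather than values from the paper.

# Minimal sketch (not the authors' code): noiseless sparse recovery with an
# l1 penalty (Basis Pursuit in the small-lambda limit) and with an Elastic
# Net penalty. Problem sizes, seed, and l1_ratio are illustrative choices.
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

rng = np.random.default_rng(0)
N, M, K = 500, 250, 50                          # alpha = M/N = 0.5, rho = K/N = 0.1
H = rng.standard_normal((M, N)) / np.sqrt(M)    # 'random' measurement matrix
x = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
x[support] = rng.standard_normal(K)             # i.i.d. entries on a random support
y = H @ x                                       # noiseless measurements

lam = 1e-4   # a small fixed penalty standing in for the lambda -> 0 limit
# scikit-learn minimizes (1/(2M))||y - H x||_2^2 plus the penalty, so its
# `alpha` plays the role of lambda only up to an overall rescaling of the cost.
bp = Lasso(alpha=lam, fit_intercept=False, max_iter=100_000).fit(H, y)
en = ElasticNet(alpha=lam, l1_ratio=0.9, fit_intercept=False,
                max_iter=100_000).fit(H, y)

for name, est in [("basis pursuit (l1)", bp), ("elastic net (l1 + l2)", en)]:
    mse = np.mean((est.coef_ - x) ** 2)
    print(f"{name:>22s}: MSE of recovered vector = {mse:.3e}")

With these parameters the point \((\rho, \alpha) = (0.1, 0.5)\) lies on the successful side of the \(\ell_1\) phase transition, so both penalties should recover \({\mathbf{x}}\) up to a small \(\lambda\)-dependent bias; fit_intercept=False matches the intercept-free model \({\mathbf{y}} = {\mathbf{H}}{\mathbf{x}}\).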

MSC:

62J05 Linear regression; mixed models
60B20 Random matrices (probabilistic aspects)
62J07 Ridge regression; shrinkage estimators (Lasso)
62H12 Estimation in multivariate analysis

Software:

PDCO; CVXOPT

References:

[1] Amelunxen, D., Lotz, M., McCoy, M.B., Tropp, J.A.: Living on the edge: phase transitions in convex programs with random data. Inf. Inference 3(3), 224-294 (2014) · Zbl 1339.90251 · doi:10.1093/imaiai/iau005
[2] Andersen, M., Dahl, J., Vandenberghe, L.: CVXOPT: a Python package for convex optimization (2010)
[3] Bayati, M., Montanari, A.: The dynamics of message passing on dense graphs, with applications to compressed sensing. IEEE Trans. Inf. Theory 57(2), 764-785 (2011) · Zbl 1366.94079 · doi:10.1109/TIT.2010.2094817
[4] Bayati, M., Montanari, A.: The LASSO risk for Gaussian matrices. IEEE Trans. Inf. Theory 58(4), 1997-2017 (2012) · Zbl 1365.62196 · doi:10.1109/TIT.2011.2174612
[5] Candès, E., Romberg, J.: Sparsity and incoherence in compressive sampling. Inverse Probl. 23(3), 969 (2007) · Zbl 1120.94005 · doi:10.1088/0266-5611/23/3/008
[6] Candès, E.J., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52(2), 489-509 (2006) · Zbl 1231.94017 · doi:10.1109/TIT.2005.862083
[7] Chen, S.S., Donoho, D.L., Saunders, M.A.: Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20(1), 33-61 (1998) · Zbl 0919.94002 · doi:10.1137/S1064827596304010
[8] Donoho, D.L.: For most large underdetermined systems of linear equations the minimal \(\ell_1\)-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 59(6), 797-829 (2006) · Zbl 1113.15004 · doi:10.1002/cpa.20132
[9] Donoho, D.L., Tanner, J.: Sparse nonnegative solution of underdetermined linear equations by linear programming. Proc. Natl. Acad. Sci. USA 102(27), 9446-9451 (2005) · Zbl 1135.90368 · doi:10.1073/pnas.0502269102
[10] Donoho, D., Tanner, J.: Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing. Philos. Trans. R. Soc. A 367(1906), 4273-4293 (2009). https://doi.org/10.1098/rsta.2009.0152 · Zbl 1185.94029 · doi:10.1098/rsta.2009.0152
[11] Donoho, D.L., Maleki, A., Montanari, A.: Message-passing algorithms for compressed sensing. Proc. Natl. Acad. Sci. USA 106(45), 18914-18919 (2009) · doi:10.1073/pnas.0909892106
[12] El Karoui, N.: On the impact of predictor geometry on the performance on high-dimensional ridge-regularized generalized robust regression estimators. Probab. Theory Relat. Fields 170(1-2), 95-175 (2018) · Zbl 1407.62060 · doi:10.1007/s00440-016-0754-9
[13] Ganguli, S., Sompolinsky, H.: Statistical mechanics of compressed sensing. Phys. Rev. Lett. 104(18), 188701 (2010) · doi:10.1103/PhysRevLett.104.188701
[14] Kabashima, Y., Wadayama, T., Tanaka, T.: A typical reconstruction limit for compressed sensing based on \(\ell_p\)-norm minimization. J. Stat. Mech. 2009(09), L09003 (2009) · doi:10.1088/1742-5468/2009/09/L09003
[15] El Karoui, N.: Asymptotic behavior of unregularized and ridge-regularized high-dimensional robust regression estimators: rigorous results. arXiv preprint arXiv:1311.2445 (2013)
[16] Kubo, R.: The fluctuation-dissipation theorem. Rep. Prog. Phys. 29(1), 255 (1966) · Zbl 0163.23102 · doi:10.1088/0034-4885/29/1/306
[17] Ma, S.K.: Modern Theory of Critical Phenomena. Routledge, London (2018) · doi:10.4324/9780429498886
[18] Mézard, M., Parisi, G., Virasoro, M.: SK model: the replica solution without replicas. Europhys. Lett. 1(2), 77-82 (1986) · doi:10.1209/0295-5075/1/2/006
[19] Mézard, M., Parisi, G., Virasoro, M.A.: Spin Glass Theory and Beyond, vol. 9. World Scientific, Singapore (1987) · Zbl 0992.82500
[20] Ramezanali, M., Mitra, P.P., Sengupta, A.M.: The cavity method for analysis of large-scale penalized regression. arXiv preprint arXiv:1501.03194 (2015)
[21] Rudelson, M., Vershynin, R.: On sparse reconstruction from Fourier and Gaussian measurements. Commun. Pure Appl. Math. 61(8), 1025-1045 (2008) · Zbl 1149.94010 · doi:10.1002/cpa.20227
[22] Stojnic, M.: Various thresholds for \(\ell_1\)-optimization in compressed sensing. arXiv preprint arXiv:0907.3666 (2009)
[23] Stojnic, M.: \(\ell_2/\ell_1\)-optimization in block-sparse compressed sensing and its strong thresholds. IEEE J. Sel. Top. Signal Process. 4(2), 350-357 (2010) · doi:10.1109/JSTSP.2009.2039172
[24] Stojnic, M.: A rigorous geometry-probability equivalence in characterization of \(\ell_1\)-optimization. arXiv preprint arXiv:1303.7287 (2013)
[25] Tibshirani, R., Wainwright, M., Hastie, T.: Statistical Learning with Sparsity: The Lasso and Generalizations. Chapman and Hall/CRC, New York (2015) · Zbl 1319.68003
[26] Tikhonov, A.N.: On the stability of inverse problems. Dokl. Akad. Nauk SSSR 39, 195-198 (1943)
[27] Vershik, A.M., Sporyshev, P.: Asymptotic behavior of the number of faces of random polyhedra and the neighborliness problem. Selecta Math. Soviet. 11(2), 181-201 (1992) · Zbl 0791.52011
[28] Xu, Y., Kabashima, Y.: Statistical mechanics approach to 1-bit compressed sensing. J. Stat. Mech. 2013(2), P02041 (2013) · Zbl 1456.94021 · doi:10.1088/1742-5468/2013/02/P02041
[29] Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B 67(2), 301-320 (2005) · Zbl 1069.62054 · doi:10.1111/j.1467-9868.2005.00503.x