
Performance of first- and second-order methods for \(\ell_1\)-regularized least squares problems. (English) Zbl 1357.90107

Summary: We study the performance of first- and second-order optimization methods for \(\ell_1\)-regularized sparse least-squares problems as the conditioning of the problem varies and its dimensions grow up to one trillion. A rigorously defined problem generator is presented that allows control of the dimensions, the conditioning and the sparsity of the problem. The generator has very low memory requirements and scales well with the problem dimensions.
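The problem class under study is \(\min_x \tfrac{1}{2}\|Ax-b\|_2^2 + \tau\|x\|_1\). As a minimal sketch of the kind of first-order method benchmarked in such studies (this is generic ISTA, iterative shrinkage-thresholding, not the paper's generator or its specific solvers; the instance below and all parameter names are illustrative assumptions):

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (componentwise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, b, tau, iters=2000):
    # ISTA for 0.5*||Ax - b||^2 + tau*||x||_1.
    # Step size 1/L, with L the Lipschitz constant of the smooth part's
    # gradient, i.e. the largest eigenvalue of A^T A.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)          # gradient of the smooth term
        x = soft_threshold(x - grad / L, tau / L)
    return x

# Hypothetical small, well-conditioned instance with a known sparse signal.
np.random.seed(0)
A = np.random.randn(20, 10)
x_true = np.zeros(10)
x_true[2], x_true[7] = 1.0, -0.5
b = A @ x_true
x = ista(A, b, tau=0.1)
```

For small regularization \(\tau\) and a well-conditioned \(A\), the ISTA iterate recovers the sparse signal up to a small shrinkage bias; the paper's point is that the iteration count of such first-order methods degrades as the conditioning worsens, which is what a controllable generator makes measurable.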

MSC:

90C20 Quadratic programming
