
An active-set proximal-Newton algorithm for \(\ell_1\) regularized optimization problems with box constraints. (English) Zbl 1458.90590

Summary: In this paper, we propose an active-set proximal-Newton algorithm for solving \(\ell_1\) regularized convex/nonconvex optimization problems subject to box constraints. The algorithm first uses the KKT error to estimate the active and free variables, and then smoothly combines a proximal gradient iteration on the active variables with a Newton iteration on the free variables, so that each set of variables converges efficiently. We establish global convergence without assuming convexity of the objective function. For some structured convex problems, we further design a safe screening procedure that identifies and removes active variables and can be integrated into the basic active-set proximal-Newton algorithm to accelerate convergence. The algorithm is evaluated on a variety of synthetic and real data sets, and its efficiency is demonstrated in particular on \(\ell_1\) regularized convex/nonconvex quadratic programs and logistic regression problems.
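
The summary names two computational building blocks: a proximal gradient step for the \(\ell_1\) term under box constraints, with a KKT-error-based split into active and free variables, and a safe screening test for structured convex problems. The Python sketch below illustrates the first block in a minimal form. It is not the authors' implementation: the tolerance `eps` and the particular active/free estimate are illustrative assumptions; only the prox formula (soft-thresholding followed by projection onto the box) and the natural-residual KKT-error proxy are standard.

```python
import numpy as np

def prox_l1_box(v, t, lam, lo, hi):
    # Componentwise prox of t*lam*|x| restricted to [lo, hi]:
    # soft-threshold first, then project onto the box. This is valid
    # because each one-dimensional subproblem is convex, so its minimizer
    # over an interval is the projection of the unconstrained minimizer.
    soft = np.sign(v) * np.maximum(np.abs(v) - t * lam, 0.0)
    return np.clip(soft, lo, hi)

def prox_gradient_step(x, grad_f, t, lam, lo, hi, eps=1e-6):
    # One proximal gradient step, followed by an illustrative split:
    # a variable is flagged "active" when it sits (numerically) at zero
    # or at a bound, and "free" otherwise. The tolerance eps is an
    # assumption; the paper ties its estimate to the KKT error instead.
    g = grad_f(x)
    x_new = prox_l1_box(x - t * g, t, lam, lo, hi)
    at_zero = np.abs(x_new) <= eps
    at_bound = (np.abs(x_new - lo) <= eps) | (np.abs(x_new - hi) <= eps)
    active = at_zero | at_bound
    return x_new, active, ~active

def kkt_error(x, grad_f, t, lam, lo, hi):
    # Natural residual x - prox(x - t*grad f(x)): a standard KKT-error
    # proxy that vanishes exactly at stationary points of the problem.
    return np.linalg.norm(x - prox_l1_box(x - t * grad_f(x), t, lam, lo, hi))
```

On the free variables one would then take a (safeguarded) Newton step on the smooth part and project back into the box; that step is omitted here because its globalization is the technical core of the paper. The summary describes the safe screening procedure only at a high level, and the exact rule for the box-constrained case is not given there. As a hedged stand-in, the sketch below implements the classical gap-safe sphere test of Fercoq, Gramfort, and Salmon for the plain (unboxed) lasso, which is the kind of rule such procedures build on: any feature whose worst-case dual correlation over a ball containing the dual optimum stays below one can be fixed at zero and dropped.

```python
def gap_safe_screen(A, b, x, lam):
    # Gap-safe sphere test for the plain lasso
    #   min_x 0.5*||Ax - b||^2 + lam*||x||_1
    # (box constraints omitted for illustration). Returns a mask of
    # features certified to be zero at every optimum.
    r = b - A @ x                                  # primal residual
    theta = r / max(lam, np.max(np.abs(A.T @ r)))  # dual-feasible point
    primal = 0.5 * (r @ r) + lam * np.abs(x).sum()
    dual = 0.5 * (b @ b) - 0.5 * lam**2 * ((theta - b / lam) ** 2).sum()
    radius = np.sqrt(2.0 * max(primal - dual, 0.0)) / lam
    # Feature j is inactive if |a_j^T theta'| < 1 for every theta' in the
    # safe sphere; the worst case over the sphere gives the test below.
    return np.abs(A.T @ theta) + radius * np.linalg.norm(A, axis=0) < 1.0
```

In an active-set loop, such a test would be run every few iterations to shrink the working problem before the Newton step on the free variables.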

MSC:

90C30 Nonlinear programming
90C53 Methods of quasi-Newton type