
Nonsmooth optimization using Taylor-like models: error bounds, convergence, and termination criteria. (English) Zbl 1459.65083

Summary: We consider optimization algorithms that successively minimize simple Taylor-like models of the objective function. Methods of Gauss-Newton type for minimizing the composition of a convex function and a smooth map are common examples. Our main result is an explicit relationship between the step-size of any such algorithm and the slope of the function at a nearby point. Consequently, we (1) show that the step-sizes can be reliably used to terminate the algorithm, (2) prove that as long as the step-sizes tend to zero, every limit point of the iterates is stationary, and (3) show that conditions, akin to classical quadratic growth, imply that the step-sizes linearly bound the distance of the iterates to the solution set. The latter so-called error bound property is typically used to establish linear (or faster) convergence guarantees. Analogous results hold when the step-size is replaced by the square root of the decrease in the model’s value. We close the paper with extensions to the setting in which the models are minimized only inexactly.
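To illustrate the termination criterion in result (1), here is a minimal sketch (not the paper's algorithm verbatim) in the prototypical setting of proximal gradient descent on a composite objective f(x) = ½‖Ax − b‖² + λ‖x‖₁. Each iteration minimizes a simple Taylor-like model (linearization of the smooth part plus a proximal quadratic plus the nonsmooth term), and the step-size ‖x_{k+1} − x_k‖ is used as the stopping test; the problem data below is synthetic for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_gradient(A, b, lam, x0, tol=1e-8, max_iter=10_000):
    """Proximal gradient method; terminates when the step-size is small."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = x0.astype(float)
    step = np.inf
    for k in range(max_iter):
        grad = A.T @ (A @ x - b)             # gradient of the smooth part
        # Minimize the Taylor-like model: linearization + (L/2)||.-x||^2 + lam||.||_1
        x_new = soft_threshold(x - grad / L, lam / L)
        step = np.linalg.norm(x_new - x)     # step-size drives the stopping test
        x = x_new
        if step <= tol:                      # small step => near-stationary point
            break
    return x, step, k

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = np.array([1.0, 0.0, -2.0, 0.0, 0.5])
b = A @ x_true                               # noiseless synthetic data
x, step, k = prox_gradient(A, b, lam=0.1, x0=np.zeros(5))
```

Stopping on the step-size, rather than on an (often uncomputable) subgradient norm, is exactly what the paper's slope estimate justifies: a small step certifies that the slope of the objective is small at a nearby point.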

MSC:

65K05 Numerical mathematical programming methods
90C30 Nonlinear programming
65K10 Numerical optimization and variational techniques

Software:

GradSamp

References:

[1] Aragón Artacho, FJ; Geoffroy, MH, Characterization of metric regularity of subdifferentials, J. Convex Anal., 15, 2, 365-380 (2008) · Zbl 1146.49012
[2] Attouch, H.; Bolte, J.; Svaiter, BF, Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods, Math. Program., 137, 1-2, 91-129 (2013) · Zbl 1260.49048
[3] Bai, Y., Duchi, J., Mei, S.: Proximal algorithms for constrained composite optimization, with applications to solving low-rank SDPs (2019). Preprint arXiv:1903.00184
[4] Beck, A.; Teboulle, M., A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sci., 2, 1, 183-202 (2009) · Zbl 1175.94009
[5] Bolte, J.; Daniilidis, A.; Lewis, AS; Shiota, M., Clarke subgradients of stratifiable functions, SIAM J. Optim., 18, 2, 556-572 (2007) · Zbl 1142.49006
[6] Bolte, J.; Daniilidis, A.; Ley, O.; Mazet, L., Characterizations of Łojasiewicz inequalities: subgradient flows, talweg, convexity, Trans. Am. Math. Soc., 362, 6, 3319-3363 (2010) · Zbl 1202.26026
[7] Bolte, J.; Nguyen, TP; Peypouquet, J.; Suter, B., From error bounds to the complexity of first-order descent methods for convex functions, Math. Program., 165, 2, 471-507 (2017) · Zbl 1373.90076
[8] Bolte, J.; Sabach, S.; Teboulle, M., Proximal alternating linearized minimization for nonconvex and nonsmooth problems, Math. Program., 146, 1-2, 459-494 (2014) · Zbl 1297.90125
[9] Burke, JV, Descent methods for composite nondifferentiable optimization problems, Math. Program., 33, 3, 260-279 (1985) · Zbl 0581.90084
[10] Burke, JV; Ferris, MC, A Gauss-Newton method for convex composite optimization, Math. Program., 71, 2, 179-194 (1995) · Zbl 0846.90083
[11] Burke, JV; Lewis, AS; Overton, ML, A robust gradient sampling algorithm for nonsmooth, nonconvex optimization, SIAM J. Optim., 15, 3, 751-779 (2005) · Zbl 1078.65048
[12] Byrd, RH; Nocedal, J.; Oztoprak, F., An inexact successive quadratic approximation method for l-1 regularized optimization, Math. Program., 157, 2, 375-396 (2016) · Zbl 1342.49037
[13] Cartis, C.; Gould, NIM; Toint, PL, On the evaluation complexity of composite function minimization with applications to nonconvex nonlinear programming, SIAM J. Optim., 21, 4, 1721-1739 (2011) · Zbl 1236.90118
[14] Charisopoulos, V., Davis, D., Díaz, M., Drusvyatskiy, D.: Composite optimization for robust blind deconvolution (2019). arXiv preprint arXiv:1901.01624
[15] Clarke, FH; Ledyaev, Y.; Stern, RI; Wolenski, PR, Nonsmooth Analysis and Control Theory. Texts in Mathematics (1998), New York: Springer, New York · Zbl 1047.49500
[16] Davis, D.; Drusvyatskiy, D., Stochastic model-based minimization of weakly convex functions, SIAM J. Optim., 29, 1, 207-239 (2019) · Zbl 1415.65136
[17] De Giorgi, E.; Marino, A.; Tosques, M., Problemi di evoluzione in spazi metrici e curve di massima pendenza, Atti Acad. Nat. Lincei Rend. Cl. Sci. Fiz. Mat. Natur., 68, 180-187 (1980) · Zbl 0465.47041
[18] Drusvyatskiy, D.: Slope and geometry in variational mathematics. PhD thesis, Cornell University (2013)
[19] Drusvyatskiy, D.; Ioffe, AD, Quadratic growth and critical point stability of semi-algebraic functions, Math. Program., 153, 2, 635-653 (2015) · Zbl 1325.49021
[20] Drusvyatskiy, D.; Ioffe, AD; Lewis, AS, Curves of descent, SIAM J. Control Optim., 53, 1, 114-138 (2015) · Zbl 1345.26027
[21] Drusvyatskiy, D.; Ioffe, AD; Lewis, AS, Transversality and alternating projections for nonconvex sets, Found. Comput. Math., 15, 6, 1637-1651 (2015) · Zbl 1338.49057
[22] Drusvyatskiy, D., Ioffe, A.D., Lewis, A.S.: Nonsmooth optimization using Taylor-like models: error bounds, convergence, and termination criteria (2016). arXiv:1610.03446 (Ver. 1)
[23] Drusvyatskiy, D.; Lewis, AS, Error bounds, quadratic growth, and linear convergence of proximal methods, Math. Oper. Res., 43, 3, 919-948 (2018) · Zbl 1440.90046
[24] Drusvyatskiy, D.; Mordukhovich, BS; Nghia, TTA, Second-order growth, tilt stability, and metric regularity of the subdifferential, J. Convex Anal., 21, 4, 1165-1192 (2014) · Zbl 1311.49035
[25] Drusvyatskiy, D., Paquette, C.: Efficiency of minimizing compositions of convex functions and smooth maps. Math. Prog. (2016). doi:10.1007/s1010, arXiv:1605.00125 · Zbl 1431.90111
[26] Duchi, JC; Ruan, F., Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval, Inf. Infer. J. IMA, 8, 3, 471-529 (2018) · Zbl 1478.90084
[27] Ekeland, I., On the variational principle, J. Math. Anal. Appl., 47, 324-353 (1974) · Zbl 0286.49015
[28] Fletcher, R.; Sorensen, DC; Wets, RJB, A model algorithm for composite nondifferentiable optimization problems, Nondifferential and Variational Techniques in Optimization (Lexington, Ky., 1980). Mathematical Programming Studies, 67-76 (1982), Berlin: Springer, Berlin · Zbl 0478.90063
[29] Fletcher, R.; Sorensen, DC; Wets, RJB, A model algorithm for composite nondifferentiable optimization problems, Nondifferential and Variational Techniques in Optimization, 67-76 (1982), Berlin: Springer, Berlin · Zbl 0478.90063
[30] Geiping, J.; Moeller, M., Composite optimization by nonconvex majorization-minimization, SIAM J. Imaging Sci., 11, 4, 2494-2528 (2018) · Zbl 1419.90089
[31] Ghadimi, S.; Lan, G., Accelerated gradient methods for nonconvex nonlinear and stochastic programming, Math. Program., 156, 1-2, 59-99 (2016) · Zbl 1335.62121
[32] Goldstein, AA, Optimization of Lipschitz continuous functions, Math. Program., 13, 1, 14-22 (1977) · Zbl 0394.90088
[33] Ioffe, AD, Metric regularity and subdifferential calculus, Uspekhi Mat. Nauk, 55, 3(333), 103-162 (2000) · Zbl 0979.49017
[34] Ioffe, AD, Variational Analysis of Regular Mappings. Springer Monographs in Mathematics (2017), Berlin: Springer, Berlin · Zbl 1381.49001
[35] Jin, C., Netrapalli, P., Jordan, M.I.: Minmax optimization: stable limit points of gradient descent ascent are locally optimal (2019). arXiv preprint arXiv:1902.00618
[36] Klatte, D.; Kummer, B., Nonsmooth Equations in Optimization: Regularity, Calculus, Methods and Applications (2002), Dordrecht: Kluwer Academic Publishers, Dordrecht · Zbl 1173.49300
[37] Kurdyka, K., On gradients of functions definable in o-minimal structures, Ann. Inst. Fourier (Grenoble), 48, 3, 769-783 (1998) · Zbl 0934.32009
[38] Lewis, AS; Wright, SJ, A proximal method for composite minimization, Math. Program., 158, 1-46 (2015)
[39] Luo, Z-Q; Tseng, P., Error bounds and convergence analysis of feasible descent methods: a general approach, Ann. Oper. Res., 46/47, 1-4, 157-178 (1993) · Zbl 0793.90076
[40] Martinet, B., Régularisation d’inéquations variationnelles par approximations successives, Rev. Française Informat. Recherche Opérationnelle, 4, Ser. R-3, 154-158 (1970) · Zbl 0215.21103
[41] Martinet, B., Détermination approchée d’un point fixe d’une application pseudo-contractante. Cas de l’application prox, C. R. Acad. Sci. Paris Sér. A-B, 274, A163-A165 (1972) · Zbl 0226.47032
[42] Nesterov, Y.; Polyak, BT, Cubic regularization of Newton method and its global performance, Math. Program., 108, 1, 177-205 (2006) · Zbl 1142.90500
[43] Nesterov, Y., A method for solving the convex programming problem with convergence rate \(O(1/k^2)\), Dokl. Akad. Nauk SSSR, 269, 3, 543-547 (1983)
[44] Nesterov, Y., Modified Gauss-Newton scheme with worst case guarantees for global performance, Optim. Methods Softw., 22, 3, 469-483 (2007) · Zbl 1136.65051
[45] Nesterov, Y., Accelerating the cubic regularization of Newton’s method on convex problems, Math. Program., 112, 1, 159-181 (2008) · Zbl 1167.90013
[46] Nesterov, Y., Gradient methods for minimizing composite functions, Math. Program., 140, 1, 125-161 (2013) · Zbl 1287.90067
[47] Nocedal, J.; Wright, SJ, Numerical Optimization. Springer Series in Operations Research and Financial Engineering (2006), New York: Springer, New York · Zbl 1104.65059
[48] Noll, D.; Prot, O.; Rondepierre, A., A proximity control algorithm to minimize nonsmooth and nonconvex functions, Pac. J. Optim., 4, 3, 571-604 (2008) · Zbl 1162.49020
[49] Ochs, P.; Fadili, J.; Brox, T., Non-smooth non-convex Bregman minimization: unification and new algorithms, J. Optim. Theory Appl., 181, 1-35 (2017)
[50] Poliquin, RA; Rockafellar, RT, Prox-regular functions in variational analysis, Trans. Am. Math. Soc., 348, 1805-1838 (1996) · Zbl 0861.49015
[51] Polyak, BT, Gradient methods for the minimisation of functionals, USSR Comput. Math. Math. Phys., 3, 4, 864-878 (1963) · Zbl 0196.47701
[52] Powell, MJD; Chui, CK; Schumaker, LL; Ward, JD, General algorithms for discrete nonlinear approximation calculations, Approximation Theory, IV (College Station, Tex., 1983), 187-218 (1983), New York: Academic Press, New York · Zbl 0545.65047
[53] Powell, MJD, On the global convergence of trust region algorithms for unconstrained minimization, Math. Program., 29, 3, 297-303 (1984) · Zbl 0569.90069
[54] Rafique, H., Liu, M., Lin, Q., Yang, T.: Non-convex min-max optimization: provable algorithms and applications in machine learning (2018). arXiv preprint arXiv:1810.02060
[55] Rockafellar, RT, Monotone operators and the proximal point algorithm, SIAM J. Control Optim., 14, 5, 877-898 (1976) · Zbl 0358.90053
[56] Rockafellar, RT, Proximal subgradients, marginal values, and augmented Lagrangians in nonconvex optimization, Math. Oper. Res., 6, 3, 424-436 (1981) · Zbl 0492.90073
[57] Rockafellar, RT; Dontchev, AL, Implicit Functions and Solution Mappings. Monographs in Mathematics (2009), Berlin: Springer, Berlin · Zbl 1178.26001
[58] Scheinberg, K.; Tang, X., Practical Inexact Proximal Quasi-Newton Method with Global Complexity Analysis. Mathematical Programming, 1-35 (2016), Berlin: Springer, Berlin
[59] Wild, SM, Solving Derivative-Free Nonlinear Least Squares Problems with POUNDERS (2014), Lemont: Argonne National Lab, Lemont
[60] Wright, SJ, Convergence of an inexact algorithm for composite nonsmooth optimization, IMA J. Numer. Anal., 10, 3, 299-321 (1990) · Zbl 0705.65041
[61] Yuan, Y., On the superlinear convergence of a trust region algorithm for nonsmooth optimization, Math. Program., 31, 3, 269-285 (1985) · Zbl 0577.90066
[62] Zhang, R.; Treiman, J., Upper-Lipschitz multifunctions and inverse subdifferentials, Nonlinear Anal., 24, 2, 273-286 (1995) · Zbl 0826.49010