zbMATH — the first resource for mathematics

Stochastic methods for composite and weakly convex optimization problems. (English) Zbl 06989166

65K10 Numerical optimization and variational techniques
Full Text: DOI
[1] J.-P. Aubin and A. Cellina, Differential Inclusions: Set-Valued Maps and Viability Theory, Springer-Verlag, Berlin, 1984. · Zbl 0538.34007
[2] M. Benaïm, J. Hofbauer, and S. Sorin, Stochastic approximations and differential inclusions, SIAM J. Control Optim., 44 (2005), pp. 328–348. · Zbl 1087.62091
[3] D. P. Bertsekas, Stochastic optimization problems with nondifferentiable cost functionals, J. Optim. Theory Appl., 12 (1973), pp. 218–231. · Zbl 0248.90043
[4] J. Bolte, A. Daniilidis, A. S. Lewis, and M. Shiota, Clarke subgradients of stratifiable functions, SIAM J. Optim., 18 (2007), pp. 556–572. · Zbl 1142.49006
[5] J. Bolte, A. Daniilidis, O. Ley, and L. Mazet, Characterizations of Łojasiewicz inequalities: Subgradient flows, talweg, convexity, Trans. Amer. Math. Soc., 362 (2010), pp. 3319–3363. · Zbl 1202.26026
[6] V. Borkar, Stochastic Approximation: A Dynamical Systems Viewpoint, Cambridge University Press, Cambridge, UK, 2008. · Zbl 1181.62119
[7] J. Burke, Descent methods for composite nondifferentiable optimization problems, Math. Program., 33 (1985), pp. 260–279. · Zbl 0581.90084
[8] E. J. Candès, T. Strohmer, and V. Voroninski, PhaseLift: Exact and stable signal recovery from magnitude measurements via convex programming, Comm. Pure Appl. Math., 66 (2013), pp. 1241–1274.
[9] E. J. Candès, X. Li, and M. Soltanolkotabi, Phase retrieval via Wirtinger flow: Theory and algorithms, IEEE Trans. Inform. Theory, 61 (2015), pp. 1985–2007. · Zbl 1359.94069
[10] Y. Chen and E. Candès, Solving random quadratic systems of equations is nearly as easy as solving linear systems, Comm. Pure Appl. Math., 2015, to appear.
[11] A. R. Conn, N. I. M. Gould, and P. L. Toint, Trust Region Methods, MOS-SIAM Ser. Optim., SIAM, Philadelphia, 2000.
[12] D. Davis and D. Drusvyatskiy, Stochastic subgradient method converges at the rate \({O}(k^{-1/4})\) on weakly convex functions, preprint, [cs.LG], 2018.
[13] D. Davis and D. Drusvyatskiy, Stochastic model-based minimization of weakly convex functions, preprint, [cs.LG], 2018.
[14] D. Davis and B. Grimmer, Proximally guided stochastic subgradient method for nonsmooth, nonconvex problems, preprint, [cs.LG], 2017.
[15] D. Davis, D. Drusvyatskiy, S. Kakade, and J. D. Lee, Stochastic subgradient method converges on tame functions, preprint, [cs.LG], 2018.
[16] A. Dembo, Probability Theory: STAT310/MATH230, lecture notes, Stanford University, Stanford, CA, 2016.
[17] P. A. Dorofeyev, Some properties of the generalized gradient method, Comput. Math. Math. Phys., 25 (1985), pp. 117–122.
[18] D. Drusvyatskiy, The proximal point method revisited, preprint, [math.OC], 2018.
[19] D. Drusvyatskiy and A. Lewis, Error bounds, quadratic growth, and linear convergence of proximal methods, Math. Oper. Res., 43 (2018), pp. 919–948. · Zbl 1440.90046
[20] D. Drusvyatskiy and C. Paquette, Efficiency of minimizing compositions of convex functions and smooth maps, preprint, [math.OC], 2016.
[21] D. Drusvyatskiy, A. Ioffe, and A. Lewis, Nonsmooth optimization using Taylor-like models: Error bounds, convergence, and termination criteria, preprint, [math.OC], 2016.
[22] J. C. Duchi and F. Ruan, Stochastic methods for composite and weakly convex optimization problems, preprint, [math.OC], 2017.
[23] J. C. Duchi and F. Ruan, Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval, Inf. Inference, 2018, to appear.
[24] J. C. Duchi, E. Hazan, and Y. Singer, Adaptive subgradient methods for online learning and stochastic optimization, J. Mach. Learning Res., 12 (2011), pp. 2121–2159. · Zbl 1280.68164
[25] Y. M. Ermoliev, On the stochastic quasi-gradient method and stochastic quasi-Fejér sequences, Kibernetika, 2 (1969), pp. 72–83.
[26] Y. M. Ermoliev and V. Norkin, Stochastic generalized gradient method for nonconvex nonsmooth stochastic optimization, Cybernet. Systems Anal., 34 (1998), pp. 196–215. · Zbl 0930.90074
[27] Y. M. Ermoliev and V. Norkin, Solution of nonconvex nonsmooth stochastic optimization problems, Cybernet. Systems Anal., 39 (2003), pp. 701–715. · Zbl 1066.90071
[28] R. Fletcher, A model algorithm for composite nondifferentiable optimization problems, in Nondifferential and Variational Techniques in Optimization, D. C. Sorensen and R. J.-B. Wets, eds., Math. Program. Studies 17, Springer, Berlin, 1982, pp. 67–76. · Zbl 0478.90063
[29] R. Fletcher and G. A. Watson, First and second order conditions for a class of nondifferentiable optimization problems, Math. Program., 18 (1980), pp. 291–307. · Zbl 0433.90066
[30] A. M. Gupal, Stochastic Methods for Solving Nonsmooth Extremal Problems, Naukova Dumka, Kiev, 1979 (in Russian).
[31] M. Hardt and T. Ma, Identity matters in deep learning, in Proceedings of the Fifth International Conference on Learning Representations, 2017.
[32] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2nd ed., Springer, New York, 2009. · Zbl 1273.62005
[33] J.-B. Hiriart-Urruty and C. Lemaréchal, Convex Analysis and Minimization Algorithms I & II, Springer, New York, 1993.
[34] A. D. Ioffe, Critical values of set-valued maps with stratifiable graphs: Extensions of Sard and Smale-Sard theorems, Proc. Amer. Math. Soc., 136 (2008), pp. 3111–3119. · Zbl 1191.49015
[35] A. D. Ioffe, Variational Analysis of Regular Mappings, Springer, New York, 2017. · Zbl 1381.49001
[36] M. Kunze, Non-Smooth Dynamical Systems, Lect. Notes Math. 1744, Springer, New York, 2000. · Zbl 0965.34026
[37] H. J. Kushner and G. Yin, Stochastic Approximation and Recursive Algorithms and Applications, 2nd ed., Springer, New York, 2003. · Zbl 1026.62084
[38] A. S. Lewis and S. J. Wright, A proximal method for composite minimization, preprint, [math.OC], 2018.
[39] L. Ljung, Analysis of recursive stochastic algorithms, IEEE Trans. Automat. Control, 22 (1977), pp. 551–575. · Zbl 0362.93031
[40] A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro, Robust stochastic approximation approach to stochastic programming, SIAM J. Optim., 19 (2009), pp. 1574–1609. · Zbl 1189.90109
[41] Y. Nesterov, Introductory Lectures on Convex Optimization, Kluwer Academic Publishers, Dordrecht, the Netherlands, 2004. · Zbl 1086.90045
[42] J. Nocedal and S. J. Wright, Numerical Optimization, Springer, New York, 2006. · Zbl 1104.65059
[43] R. Poliquin and R. T. Rockafellar, Prox-regular functions in variational analysis, Trans. Amer. Math. Soc., 348 (1996), pp. 1805–1838. · Zbl 0861.49015
[44] H. Robbins and D. Siegmund, A convergence theorem for non-negative almost supermartingales and some applications, in Optimizing Methods in Statistics, Academic Press, New York, 1971, pp. 233–257.
[45] R. T. Rockafellar, Measurable dependence of convex sets and functions on parameters, J. Math. Anal. Appl., 28 (1969), pp. 4–25. · Zbl 0202.33804
[46] R. T. Rockafellar, Monotone operators and the proximal point algorithm, SIAM J. Control Optim., 14 (1976), pp. 877–898. · Zbl 0358.90053
[47] R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, Springer, New York, 1998. · Zbl 0888.49001
[48] A. Ruszczyński, A linearization method for nonsmooth stochastic programming problems, Math. Oper. Res., 12 (1987), pp. 32–49. · Zbl 0624.90078
[49] Y. Shechtman, Y. C. Eldar, O. Cohen, H. N. Chapman, J. Miao, and M. Segev, Phase retrieval with application to optical imaging, IEEE Signal Process. Mag., 32 (2015), pp. 87–109.
[50] W. Su, S. Boyd, and E. J. Candès, A differential equation for modeling Nesterov's accelerated gradient method: Theory and insights, in Advances in Neural Information Processing Systems 27, 2014, pp. 2510–2518.
[51] R. Tibshirani, Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B. Stat. Methodol., 58 (1996), pp. 267–288. · Zbl 0850.62538
[52] M. Udell, K. Mohan, D. Zeng, J. Hong, S. Diamond, and S. Boyd, Convex optimization in Julia, in First Workshop on High Performance Technical Computing in Dynamic Languages, IEEE, Piscataway, NJ, 2014, pp. 18–28.
[53] M. Wang, J. Liu, and E. Fang, Accelerating stochastic composition optimization, in Advances in Neural Information Processing Systems 29, 2016, pp. 1714–1722.
[54] M. Wang, E. Fang, and H. Liu, Stochastic compositional gradient descent: Algorithms for minimizing compositions of expected-value functions, Math. Program. Ser. A, 161 (2017), pp. 419–449. · Zbl 1356.90099
[55] H. Whitney, A function not constant on a connected set of critical points, Duke Math. J., 1 (1935), pp. 514–517. · Zbl 0013.05801
[56] A. Wibisono, A. Wilson, and M. I. Jordan, A variational perspective on accelerated methods in optimization, Proc. Natl. Acad. Sci. USA, 113 (2016), pp. 7351–7358.