
Davis, Damek; Drusvyatskiy, Dmitriy
Stochastic model-based minimization of weakly convex functions. (English) Zbl 1415.65136
SIAM J. Optim. 29, No. 1, 207–239 (2019).

MSC:
65K05 Numerical mathematical programming methods
90C15 Stochastic programming
90C25 Convex programming
90C30 Nonlinear programming
References:
[1] M. Abadi, A. Agarwal, P. Barham, et al., TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems, White paper, TensorFlow, 2015; available online.
[2] E. Abbe, A. Bandeira, A. Bracher, and A. Singer, Decoding binary node labels from censored edge measurements: Phase transition and efficient recovery, IEEE Trans. Network Sci. Eng., 1 (2014), pp. 10–22.
[3] Z. Allen-Zhu, Katyusha: The first direct acceleration of stochastic gradient methods, in STOC 2017: Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, ACM, New York, 2017, pp. 1200–1205. · Zbl 1369.68273
[4] Z. Allen-Zhu, Natasha 2: Faster Non-Convex Optimization than SGD, preprint, 2017.
[5] Z. Allen-Zhu, How to Make the Gradients Small Stochastically, preprint, 2018.
[6] A. Bandeira, N. Boumal, and V. Voroninski, On the low-rank approach for semidefinite programs arising in synchronization and community detection, in Proceedings of the 29th Conference on Learning Theory, Proc. Mach. Learn. Res. 49, 2016, pp. 361–382.
[7] A. Ben-Tal and M. Teboulle, Expected utility, penalty functions, and duality in stochastic nonlinear programming, Management Sci., 32 (1986), pp. 1445–1466. · Zbl 0625.90064
[8] A. Ben-Tal and M. Teboulle, An old-new concept of convex risk measures: The optimized certainty equivalent, Math. Finance, 17 (2007), pp. 449–476. · Zbl 1186.91116
[9] P. Bianchi, Ergodic convergence of a stochastic proximal point algorithm, SIAM J. Optim., 26 (2016), pp. 2235–2260. · Zbl 1355.90062
[10] J. Burke, Descent methods for composite nondifferentiable optimization problems, Math. Program., 33 (1985), pp. 260–279. · Zbl 0581.90084
[11] E. Candès, X. Li, Y. Ma, and J. Wright, Robust principal component analysis?, J. ACM, 58 (2011), Article no. 11.
[12] E. Candès, X. Li, and M. Soltanolkotabi, Phase retrieval via Wirtinger flow: Theory and algorithms, IEEE Trans. Inform. Theory, 61 (2015), pp. 1985–2007. · Zbl 1359.94069
[13] T. F. Chan and C.-K. Wong, Total variation blind deconvolution, IEEE Trans. Image Process., 7 (1998), pp. 370–375.
[14] V. Chandrasekaran, S. Sanghavi, P. A. Parrilo, and A. Willsky, Rank-sparsity incoherence for matrix decomposition, SIAM J. Optim., 21 (2011), pp. 572–596. · Zbl 1226.90067
[15] Y. Chen and E. Candès, Solving random quadratic systems of equations is nearly as easy as solving linear systems, Comm. Pure Appl. Math., 70 (2017), pp. 822–883. · Zbl 1379.90024
[16] Y. Chen, Y. Chi, and A. Goldsmith, Exact and stable covariance estimation from quadratic sampling via convex programming, IEEE Trans. Inform. Theory, 61 (2015), pp. 4034–4059. · Zbl 1359.62181
[17] F. H. Clarke, Y. S. Ledyaev, R. J. Stern, and P. R. Wolenski, Nonsmooth Analysis and Control Theory, Grad. Texts in Math. 178, Springer, New York, 1998. · Zbl 1047.49500
[18] J. Cruz, On proximal subgradient splitting method for minimizing the sum of two nonsmooth convex functions, Set-Valued Var. Anal., 25 (2017), pp. 245–263. · Zbl 1365.65160
[19] D. Davis and D. Drusvyatskiy, Complexity of Finding Near-Stationary Points of Convex Functions Stochastically, preprint, 2018.
[20] D. Davis and D. Drusvyatskiy, Stochastic Model-Based Minimization of Weakly Convex Functions, preprint, 2018. · Zbl 1415.65136
[21] D. Davis and D. Drusvyatskiy, Stochastic Subgradient Method Converges at the Rate \(O(k^{-1/4})\) on Weakly Convex Functions, preprint, 2018.
[22] D. Davis, D. Drusvyatskiy, and C. Paquette, The Nonsmooth Landscape of Phase Retrieval, preprint, 2017.
[23] D. Davis and B. Grimmer, Proximally Guided Stochastic Method for Nonsmooth, Nonconvex Problems, preprint, 2017.
[24] O. Devolder, F. Glineur, and Y. Nesterov, First-order methods of smooth convex optimization with inexact oracle, Math. Program., 146 (2014), pp. 37–75. · Zbl 1317.90196
[25] D. Drusvyatskiy, The proximal point method revisited, SIAG/OPT Views and News, 26 (2018), pp. 1–8.
[26] D. Drusvyatskiy, A. Ioffe, and A. Lewis, Nonsmooth Optimization Using Taylor-Like Models: Error Bounds, Convergence, and Termination Criteria, preprint, 2016.
[27] D. Drusvyatskiy and A. S. Lewis, Error bounds, quadratic growth, and linear convergence of proximal methods, Math. Oper. Res., 43 (2018), pp. 919–948.
[28] D. Drusvyatskiy and C. Paquette, Efficiency of minimizing compositions of convex functions and smooth maps, Math. Program. (2018). · Zbl 1398.49012
[29] J. Duchi and F. Ruan, Solving (Most) of a Set of Quadratic Equalities: Composite Optimization for Robust Phase Retrieval, preprint, 2017.
[30] J. Duchi and F. Ruan, Stochastic Methods for Composite Optimization Problems, preprint, 2017.
[31] J. Duchi and Y. Singer, Efficient online and batch learning using forward backward splitting, J. Mach. Learn. Res., 10 (2009), pp. 2899–2934. · Zbl 1235.62151
[32] P. Dvurechensky, Gradient Method with Inexact Oracle for Composite Non-Convex Optimization, preprint, 2017.
[33] Y. Eldar and S. Mendelson, Phase retrieval: Stability and recovery guarantees, Appl. Comput. Harmon. Anal., 36 (2014), pp. 473–494. · Zbl 06298184
[34] S. Ghadimi and G. Lan, Stochastic first- and zeroth-order methods for nonconvex stochastic programming, SIAM J. Optim., 23 (2013), pp. 2341–2368. · Zbl 1295.90026
[35] S. Ghadimi, G. Lan, and H. Zhang, Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization, Math. Program., 155 (2016), pp. 267–305. · Zbl 1332.90196
[36] A. Juditsky and A. Nemirovski, First order methods for nonsmooth convex large-scale optimization, I: General purpose methods, in Optimization for Machine Learning, S. Sra, S. Nowozin, and S. Wright, eds., MIT Press, Cambridge, MA, 2011, pp. 266–290.
[37] L. Lei, C. Ju, J. Chen, and M. I. Jordan, Non-convex finite-sum optimization via SCSG methods, in Adv. Neural Inf. Process. Syst. 30, Curran Associates, Red Hook, NY, 2017, pp. 2345–2355.
[38] A. Levin, Y. Weiss, F. Durand, and W. T. Freeman, Understanding blind deconvolution algorithms, IEEE Trans. Pattern Anal. Mach. Intell., 33 (2011), pp. 2354–2367.
[39] A. Lewis and S. Wright, A proximal method for composite minimization, Math. Program., 158 (2016), pp. 501–546. · Zbl 1345.49041
[40] S. Ling and T. Strohmer, Self-calibration and biconvex compressive sensing, Inverse Probl., 31 (2015), no. 11. · Zbl 1327.93183
[41] J. Mairal, J. Ponce, G. Sapiro, A. Zisserman, and F. Bach, Supervised dictionary learning, in Adv. Neural Inf. Process. Syst. 21, D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, eds., Curran Associates, Red Hook, NY, 2009, pp. 1033–1040.
[42] J.-J. Moreau, Proximité et dualité dans un espace hilbertien, Bull. Soc. Math. France, 93 (1965), pp. 273–299. · Zbl 0136.12101
[43] A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro, Robust stochastic approximation approach to stochastic programming, SIAM J. Optim., 19 (2009), pp. 1574–1609. · Zbl 1189.90109
[44] A. Nemirovsky and D. Yudin, Problem Complexity and Method Efficiency in Optimization, Wiley-Intersci. Ser. Discrete Math. Optim., John Wiley, New York, 1983.
[45] Y. Nesterov, How to make the gradients small, OPTIMA: Math. Optim. Soc. Newsletter, 88 (2012), pp. 10–11.
[46] Y. Nesterov, Gradient methods for minimizing composite functions, Math. Program., 140 (2013), pp. 125–161. · Zbl 1287.90067
[47] Y. Nesterov, Universal gradient methods for convex optimization problems, Math. Program., 152 (2015), pp. 381–404. · Zbl 1327.90216
[48] E. A. Nurminskii, Minimization of nondifferentiable functions in the presence of noise, Cybernetics, 10 (1974), pp. 619–621.
[49] E. A. Nurminskii, The quasigradient method for the solving of the nonlinear programming problems, Cybernetics, 9 (1973), pp. 145–150.
[50] R. A. Poliquin and R. T. Rockafellar, Amenable functions in optimization, in Nonsmooth Optimization: Methods and Applications, Gordon and Breach, New York, 1992, pp. 338–353. · Zbl 1050.49513
[51] R. Poliquin and R. T. Rockafellar, Prox-regular functions in variational analysis, Trans. Amer. Math. Soc., 348 (1996), pp. 1805–1838. · Zbl 0861.49015
[52] S. J. Reddi, S. Sra, B. Poczos, and A. J. Smola, Proximal stochastic methods for nonsmooth nonconvex finite-sum optimization, in Adv. Neural Inf. Process. Syst. 29, Curran Associates, Red Hook, NY, 2016, pp. 1145–1153.
[53] H. Robbins and S. Monro, A stochastic approximation method, Ann. Math. Statistics, 22 (1951), pp. 400–407. · Zbl 0054.05901
[54] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, NJ, 1970. · Zbl 0193.18401
[55] R. T. Rockafellar, Favorable classes of Lipschitz-continuous functions in subgradient optimization, in Progress in Nondifferentiable Optimization, IIASA Collaborative Proc. Ser. 8, IIASA, Laxenburg, 1981, pp. 125–143.
[56] R. T. Rockafellar and S. Uryasev, The fundamental risk quadrangle in risk management, optimization and statistical estimation, Surv. Oper. Res. Management Sci., 18 (2013), pp. 33–53.
[57] R. T. Rockafellar and R.-B. Wets, Variational Analysis, Grundlehren Math. Wiss. 317, Springer, Berlin, 1998.
[58] R. T. Rockafellar, Convex Analysis, Princet. Math. Ser. 28, Princeton University Press, Princeton, NJ, 1970. · Zbl 0193.18401
[59] R. T. Rockafellar and S. Uryasev, Optimization of conditional value-at-risk, J. Risk, 2 (2000), pp. 21–41.
[60] E. K. Ryu and S. Boyd, Stochastic Proximal Iteration: A Non-Asymptotic Improvement Upon Stochastic Gradient Descent, preprint, 2014.
[61] M. Schmidt, N. L. Roux, and F. Bach, A Simpler Approach to Obtaining an \(O(1/t)\) Convergence Rate for the Projected Stochastic Subgradient Method, preprint, 2013.
[62] A. Singer, Angular synchronization by eigenvectors and semidefinite programming, Appl. Comput. Harmon. Anal., 30 (2011), pp. 20–36. · Zbl 1206.90116
[63] J. Sun, Q. Qu, and J. Wright, A geometric analysis of phase retrieval, Found. Comput. Math., 18 (2018), pp. 1131–1198. · Zbl 1401.94049
[64] I. Tosic and P. Frossard, Dictionary learning, IEEE Signal Process. Mag., 28 (2011), pp. 27–38. · Zbl 1372.94246
[65] P. Toulis and E. Airoldi, Asymptotic and finite-sample properties of estimators based on stochastic gradients, Ann. Statist., 45 (2017), pp. 1694–1727. · Zbl 1378.62046
[66] P. Toulis, T. Horel, and E. Airoldi, Stable Robbins–Monro Approximations Through Stochastic Proximal Updates, preprint, 2015.
[67] M. Wang, E. Fang, and H. Liu, Stochastic compositional gradient descent: Algorithms for minimizing compositions of expected-value functions, Math. Program., 161 (2017), pp. 419–449. · Zbl 1356.90099
[68] Y. Xu and W. Yin, Block stochastic gradient iteration for convex and nonconvex optimization, SIAM J. Optim., 25 (2015), pp. 1686–1716. · Zbl 1342.93125
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.