## DC formulations and algorithms for sparse optimization problems.(English)Zbl 06869181

Summary: We propose a DC (Difference of two Convex functions) formulation approach for sparse optimization problems having a cardinality or rank constraint. With the largest-$$k$$ norm, an exact DC representation of the cardinality constraint is provided. We then transform the cardinality-constrained problem into a penalty function form and derive exact penalty parameter values for some optimization problems, especially for quadratic minimization problems which often appear in practice. A DC Algorithm (DCA) is presented, where the dual step at each iteration can be efficiently carried out due to the accessible subgradient of the largest-$$k$$ norm. Furthermore, we can solve each DCA subproblem in linear time via a soft thresholding operation if there are no additional constraints. The framework is extended to the rank-constrained problem as well as the cardinality- and the rank-minimization problems. Numerical experiments demonstrate the efficiency of the proposed DCA in comparison with existing methods which have other penalty terms.

### MSC:

 47A30 Norms (inequalities, more than one norm, etc.) of linear operators 90C20 Quadratic programming 90C26 Nonconvex programming, global optimization 90C90 Applications of mathematical programming

### Software:

SparseLOGREG; UCI-ml
Full Text:

### References:

 [1] Alizadeh, F, Interior point methods in semidefinite programming with applications to combinatorial optimization, SIAM J. Optim., 5, 13-51, (1995) · Zbl 0833.90087 [2] Arthanari, T.S., Dodge, Y.: Mathematical Programming in Statistics, vol. 341. Wiley, New York (1981) · Zbl 0549.62002 [3] Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, Berlin (2011) · Zbl 1218.47001 [4] Beck, A; Teboulle, M, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sci., 2, 183-202, (2009) · Zbl 1175.94009 [5] Bertsimas, D; Pachamanova, D; Sim, M, Robust linear optimization under general norms, Oper. Res. Lett., 32, 510-516, (2004) · Zbl 1054.90046 [6] Bertsimas, D; King, A; Mazumder, R, Best subset selection via a modern optimization Lens, Ann. Stat., 44, 813-852, (2016) · Zbl 1335.62115 [7] Blake, C.L., Merz, C.J.: UCI repository of machine learning databases, (1998). http://www.ics.uci.edu/ mlearn/MLRepository.html · Zbl 1178.68619 [8] Blum, M; Floyd, RW; Pratt, V; Rivest, RL; Tarjan, RE, Time bounds for selection, J. Comput. Syst. Sci., 7, 448-461, (1973) · Zbl 0278.68033 [9] Bradley, P.S., Mangasarian, O.L.: Feature selection via concave minimization and support vector machines. In: International Conference on Machine Learning, Volume 98, pp. 82-90. (1998) · Zbl 0895.90152 [10] Brodie, J; Daubechies, I; Mol, C; Giannone, D; Loris, I, Sparse and stable Markowitz portfolios, Proc. Natl. Acad. Sci., 106, 12267-12272, (2009) · Zbl 1203.91271 [11] Bruckstein, AM; Donoho, DL; Elad, M, From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Rev., 51, 34-81, (2009) · Zbl 1178.68619 [12] Cai, J; Candes, EJ; Shen, Z, A singular value thresholding algorithm for matrix completion, SIAM J. Optim., 20, 1956-1982, (2010) · Zbl 1201.90155 [13] Cai, X., Nie, F., Huang, H.: Exact top-$$k$$ feature selection via $$ℓ _{2,0}$$-norm constraint. In: Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (2013) · Zbl 1006.90062 [14] Candès, EJ; Recht, B, Exact matrix completion via convex optimization, Found. Comput. Math., 9, 717-772, (2009) · Zbl 1219.90124 [15] Candès, EJ; Tao, T, Decoding by linear programming, IEEE Trans. Inf. Theory, 51, 4203-4215, (2005) · Zbl 1264.94121 [16] Donoho, DL, De-noising by soft thresholding, IEEE Trans. Inf. Theory, 41, 613-627, (1995) · Zbl 0820.62002 [17] Donoho, DL, Compressed sensing, IEEE Trans. Inf. Theory, 52, 1289-1306, (2006) · Zbl 1288.94016 [18] Fan, J; Li, R, Variable selection via nonconcave penalized likelihood and its oracle properties, J. Am. Stat. Assoc., 96, 1348-1360, (2001) · Zbl 1073.62547 [19] Gong, P., Zhang, C., Lu, Z., Huang, J., Ye, J.: A general iterative shrinkage and thresholding algorithm for non-convex regularized optimization problems. In: Proceedings of International Conference on Machine Learning Volume 28, pp. 37-45. (2013) · Zbl 1332.90280 [20] Gotoh, J; Uryasev, S, Two pairs of families of polyhedral norms versus $$ℓ _p$$-norms: proximity and applications in optimization, Math. Program., 156, 391-431, (2016) · Zbl 1352.47047 [21] Gulpinar, N; Thi, HA; Moeini, M, Robust investment strategies with discrete asset choice constraints using DC programming, Optimization, 59, 45-62, (2010) · Zbl 1188.90184 [22] Hempel, A.B., Goulart, P.J.: A novel method for modelling cardinality and rank constraints. In: IEEE Conference on Decision and Control, pp. 4322-4327. Los Angeles, USA, December (2014). http://control.ee.ethz.ch/index.cgi?page=publications;action=details;id=4712 · Zbl 0806.90114 [23] Horst, R., Tuy, H.: Global Optimization: Deterministic Approaches, 3rd edn. Springer-Verlag, Berlin (1996) · Zbl 0867.90105 [24] Hu, Y; Zhang, D; Ye, J; Li, X; He, X, Fast and accurate matrix completion via truncated nuclear norm regularization, IEEE Trans. Pattern Anal. Mach. Intell., 35, 2117-2130, (2013) [25] Thi, HA; Pham Dinh, T, The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems, Ann. Oper. Res., 133, 23-46, (2005) · Zbl 1116.90122 [26] Thi, HA; Pham Dinh, T, DC approximation approaches for sparse optimization, Eur. J. Oper. Res., 244, 26-46, (2015) · Zbl 1346.90819 [27] Le Thi, HA; Pham Dinh, T; Muu, LD, Exact penalty in D.C. programming, Vietnam J. Math., 27, 169-178, (1999) · Zbl 1006.90062 [28] Thi, HA; Le, HM; Nguyen, VV; Pham Dinh, T, A DC programming approach for feature selection in support vector machines learning, Adv. Data Anal. Classif., 2, 259-278, (2008) · Zbl 1284.90057 [29] Thi, HA; Pham Dinh, T; Yen, NDJ, Properties of two DC algorithms in quadratic programming, J. Glob. Optim., 49, 481-495, (2011) · Zbl 1213.90189 [30] Thi, HA; Pham Dinh, T; Ngai, HVJ, Exact penalty and error bounds in DC programming, J. Glob. Optim., 52, 509-535, (2012) · Zbl 1242.49037 [31] Lu, C., Tang, J., Yan, S., Lin, Z.: Generalized nonconvex nonsmooth low-rank minimization. In: Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on, pp. 4130-4137. IEEE, (2014) [32] Ma, S; Goldfarb, D; Chen, L, Fixed point and Bregman iterative methods for matrix rank minimization, Math. Program., 128, 321-353, (2011) · Zbl 1221.65146 [33] Miyashiro, R., Takano, Y.: Mixed integer second-order cone programming formulations for variable selection in linear regression. Eur. J. Oper. Res. 247(3), 721-731 (2015a) · Zbl 1346.90616 [34] Miyashiro, R; Takano, Y, Subset selection by mallow’s $$C_p$$: a mixed integer programming approach, Expert Syst. Appl., 42, 325-331, (2015) [35] Moghaddam, B., Weiss, Y., Avidan, S.: Generalized spectral bounds for sparse lda. In: Proceedings of the 23rd International Conference on Machine learning, pp. 641-648. ACM, (2006) [36] Natarajan, BK, Sparse approximate solutions to linear systems, SIAM J. Comput., 24, 227-234, (1995) · Zbl 0827.68054 [37] Nesterov, Y, Gradient methods for minimizing composite objective function, Math. Program., 140, 125-161, (2013) · Zbl 1287.90067 [38] Nguyen, T.B.T., Le Thi, H.A., Le, H.M., Vo, X.T.: DC approximation approach for $$ℓ _0$$-minimization in compressed sensing. In: Le Thi, H.A., Nguyen, N.T., Do, T.V. (eds.) Advanced Computational Methods for Knowledge Engineering, pp. 37-48. Springer, Berlin (2015) · Zbl 1406.94010 [39] Nhat, P.D., Nguyen, M.C., Le Thi, H.A.: A DC programming approach for sparse linear discriminant analysis. In: Do, T.V., Le Thi, H.A., Nguyen, N.T. (eds.) Advanced Computational Methods for Knowledge Engineering, pp. 65-74. Springer, (2014) · Zbl 1327.90241 [40] Nocedal, Jorge, Wright, Stephen J.: Numerical Optimization 2nd. Springer, Berlin (2006) · Zbl 1104.65059 [41] Overton, ML; Womersley, RS, Optimality conditions and duality theory for minimizing sums of the largest eigenvalues of symmetric matrices, Math. Program., 62, 321-357, (1993) · Zbl 0806.90114 [42] Pavlikov, K; Uryasev, S, CVaR norm and applications in optimization, Optim. Lett., 8, 1999-2020, (2014) · Zbl 1332.90280 [43] Pham Dinh, T; Thi, HA, Convex analysis approach to D.C. programming: theory, algorithms and applications, Acta Math. Vietnam., 22, 289-355, (1997) · Zbl 0895.90152 [44] Pham Dinh, T; Thi, HA, A D.C. optimization algorithm for solving the trust-region subproblem, SIAM J. Optim., 8, 476-505, (1998) · Zbl 0913.65054 [45] Pham Dinh, T; Thi, HA, Recent advances in DC programming and DCA, Trans. Comput. Collect. Intell., 8342, 1-37, (2014) [46] Recht, B; Fazel, M; Parrilo, PA, Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization, SIAM Rev., 52, 471-501, (2010) · Zbl 1198.90321 [47] Shevade, SK; Keerthi, SS, A simple and efficient algorithm for gene selection using sparse logistic regression, Bioinformatics, 19, 2246-2253, (2003) [48] Smola, A.J., Vishwanathan, S.V.N., Hofmann, T.: Kernel methods for missing variables. In: Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics, pp. 325-332. (2005) · Zbl 1219.90124 [49] Sriperumbudur, BK; Lanckriet, GRG, A proof of convergence of the concave-convex procedure using zangwill’s theory, Neural Comput., 24, 1391-1407, (2012) · Zbl 1254.90180 [50] Takeda, A; Niranjan, M; Gotoh, J; Kawahara, Y, Simultaneous pursuit of out-of-sample performance and sparsity in index tracking portfolios, Comput. Manag. Sci., 10, 21-49, (2013) · Zbl 1296.91257 [51] Thiao, M., Pham Dinh, T., Le Thi, H.A.: A DC programming approach for sparse eigenvalue problem. In: Proceedings of the 27th International Conference on Machine Learning, pp. 1063-1070. (2010) · Zbl 1203.91271 [52] Tibshirani, R, Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B, 58, 267-288, (1996) · Zbl 0850.62538 [53] Toh, K-C; Yun, S, An accelerated proximal gradient algorithm for nuclear norm regularized linear least squares problems, Pac. J. Optim., 6, 615-640, (2010) · Zbl 1205.90218 [54] Watson, GA, Linear best approximation using a class of polyhedral norms, Numer. Algorithms, 2, 321-335, (1992) · Zbl 0755.65011 [55] Watson, GA, On matrix approximation problems with Ky Fan $$k$$ norms, Numer. Algorithms, 5, 263-272, (1993) · Zbl 0791.65028 [56] Wu, B; Ding, C; Sun, DF; Toh, K-C, On the Moreau-Yosida regularization of the vector $$k$$-norm related functions, SIAM J. Optim., 24, 766-794, (2014) · Zbl 1297.90122 [57] Zhang, CH, Nearly unbiased variable selection under minimax concave penalty, Ann. Stat., 38, 894-942, (2010) · Zbl 1183.62120 [58] Zheng, X; Sun, X; Li, D; Sun, J, Successive convex approximations to cardinality-constrained convex programs: a piecewise-linear DC approach, Comput. Optim. Appl., 59, 379-397, (2014) · Zbl 1302.90157 [59] Zou, H; Hastie, T; Tibshirani, R, Sparse principal component analysis, J. Comput. Graph. stat., 15, 265-286, (2006)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.