×

Parallel subgradient algorithm with block dual decomposition for large-scale optimization. (English) Zbl 1495.65086

Summary: This paper studies computational approaches for solving large-scale optimization problems using a Lagrangian dual reformulation, solved by parallel sub-gradient methods. Since there are many possible reformulations for a given problem, an important question is: Which reformulation leads to the fastest solution time? One approach is to detect a block diagonal structure in the constraint matrix, and reformulate the problem by dualizing the constraints outside of the blocks; the approach is defined herein as block dual decomposition. Main advantage of such a reformulation is that the Lagrangian relaxation has a block diagonal constraint matrix, thus decomposable into smaller sub-problems that can solved in parallel. We show that the block decomposition can critically affect convergence rate of the sub-gradient method. We propose various decomposition methods that use domain knowledge or apply algorithms using knowledge about the structure in the constraint matrix or the dependence in the decision variables, towards reducing the computational effort to solve large-scale optimization problems. In particular, we introduce a block decomposition approach that reduces the number of dualized constraints by utilizing a community detection algorithm. We present empirical experiments on an extensive set of problem instances including a real application. We illustrate that if the number of the dualized constraints in the decomposition increases, the computational effort within each iteration of the sub-gradient method decreases while the number of iterations required for convergence increases. The key message is that it is crucial to employ prior knowledge about the structure of the problem when solving large scale optimization problems using dual decomposition.

MSC:

65K05 Numerical mathematical programming methods
90C06 Large-scale problems in mathematical programming
90C25 Convex programming
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Androulakis, I. P.; Visweswaran, V.; Floudas, C. A., Distributed decomposition-based approaches in global optimization, State of the art in global optimization: Computational methods and applications, 285-301) (1996), Springer US: Springer US Boston, MA · Zbl 0871.90068
[2] Aykanat, C.; Pinar, A.; Çatalyürek, Ü. V., Permuting sparse rectangular matrices into block-diagonal form, SIAM Journal on Scientific Computing, 25, 6, 1860-1879 (2004) · Zbl 1070.65027
[3] Bergner, M.; Caprara, A.; Ceselli, A.; Furini, F.; Lübbecke, M. E.; Malaguti, E.; Traversi, E., Automatic Dantzig-Wolfe reformulation of mixed integer programs, Mathematical Programming, 149, 1-2, 391-424 (2015) · Zbl 1307.90114
[4] Bertsekas, D. P., Nonlinear programming (1995), Athena Scientific: Athena Scientific Belmont, Mass. · Zbl 0935.90037
[5] Bertsekas, D. P., Incremental gradient, subgradient, and proximal methods for convex optimization: A survey, Optimization for Machine Learning, 2010, 1-38, 3 (2011)
[6] Bezanson, J., Karpinski, S., Shah, V. B., & Edelman, A. (2012). Julia: A fast dynamic language for technical computing. arXiv preprint arXiv:1209.5145,. · Zbl 1356.68030
[7] Boyd, S. P., Subgradient methods (2014), LectureNotes
[8] Boyd, S. P.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J., Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends in Machine Learning, 3, 1, 1-122 (2011) · Zbl 1229.90122
[9] Brandes, U.; Delling, D.; Gaertler, M.; Gorke, R.; Hoefer, M.; Nikoloski, Z.; Wagner, D., On modularity clustering, IEEE Transactions on Knowledge and Data Engineering, 20, 2, 172-188 (2007)
[10] Bui, T. N.; Jones, C., Finding good approximate vertex and edge partitions is NP-hard, Information Processing Letters, 42, 3, 153-159 (1992) · Zbl 0764.68061
[11] Campegiani, P.; Presti, F. L., A general model for virtual machines resources allocation in multi-tier distributed systems, Proceedings of the fifth international conference on autonomic and autonomous systems, 162-167 (2009), IEEE
[12] Camponogara, E.; De Oliveira, L. B., Distributed optimization for model predictive control of linear-dynamic networks, IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 39, 6, 1331-1338 (2009)
[13] Carøe, C. C.; Schultz, R., Dual decomposition in stochastic integer programming, Operations Research Letters, 24, 1-2, 37-45 (1999) · Zbl 1063.90037
[14] Chu, P. C.; Beasley, J. E., A genetic algorithm for the multidimensional knapsack problem, Journal of Heuristics, 4, 1, 63-86 (1998) · Zbl 0913.90218
[15] Clauset, A.; Newman, M. E.J.; Moore, C., Finding community structure in very large networks, Physical Review E, 70, 6, 066111 (2004)
[16] Duchi, J.; Hazan, E.; Singer, Y., Adaptive subgradient methods for online learning and stochastic optimization, Journal of Machine Learning Research, 12, Jul, 2121-2159 (2011) · Zbl 1280.68164
[17] Duchi, J. C.; Agarwal, A.; Wainwright, M. J., Dual averaging for distributed optimization: Convergence analysis and network scaling, IEEE Transactions on Automatic Control, 57, 3, 592-606 (2012) · Zbl 1369.90156
[18] Ferris, M. C.; Horn, J. D., Partitioning mathematical programs for parallel solution, Mathematical Programming, 80, 1, 35-61 (1998) · Zbl 0894.90108
[19] Fortunato, S., Community detection in graphs, Physics Reports, 486, 3, 75-174 (2010)
[20] Frey, B. J.; Dueck, D., Clustering by passing messages between data points, Science, 315, 5814, 972-976 (2007) · Zbl 1226.94027
[21] Gentili, M.; Serban, N.; Harati, P.; O’Connor, J.; Swann, J., Quantifying disparities in accessibility and availability of pediatric primary care with implications for policy, Health Services Research, 53, 3, 1458-1477 (2017)
[22] Girvan, M.; Newman, M. E., Community structure in social and biological networks, Proceedings of the National Academy of Sciences, 99, 12, 7821-7826 (2002) · Zbl 1032.91716
[23] Goffin, J., On convergence rates of subgradient optimization methods, Mathematical Programming, 13, 1, 329-347 (1977) · Zbl 0368.90119
[24] Holmberg, K.; Yuan, D., A Lagrangian heuristic based branch-and-bound approach for the capacitated network design problem, Operations Research, 48, 3, 461-481 (2000) · Zbl 1106.90381
[25] Hromkovič, J., Communication complexity and parallel computing (2013), Springer Science & Business Media
[26] Inalhan, G.; Stipanovic, D. M.; Tomlin, C. J., Decentralized optimization, with application to multiple aircraft coordination, Proceedings of the 41st IEEE conference on decision and control, 2002., vol. 1, 1147-1155 (2002)
[27] Khaniyev, T.; Elhedhli, S.; Erenay, F. S., Structure detection in mixed-integer programs, INFORMS Journal on Computing, 30, 3, 570-587 (2018) · Zbl 1528.90161
[28] Knobe, K.; Lukas, J. D.; Steele, G. L., Data optimization: Allocation of arrays to reduce communication on SIMD machines, Journal of parallel and Distributed Computing, 8, 2, 102-118 (1990)
[29] Lyaudet, L., NP-hard and linear variants of hypergraph partitioning, Theoretical Computer Science, 411, 1, 10-21 (2010) · Zbl 1189.68086
[30] Maher, S. J., Implementing the branch-and-cut approach for a general purpose Benders decomposition framework, European Journal of Operational Research, 290, 2, 479-498 (2021) · Zbl 1487.90487
[31] Martin, R. K., Large scale linear and integer optimization: A unified approach (1999), Springer Science & Business Media · Zbl 1053.90001
[32] Medhi, D., Parallel bundle-based decomposition for large-scale structured mathematical programming problems, Annals of Operations Research, 22, 1, 101-127 (1990) · Zbl 0714.90079
[33] Nedic, A.; Ozdaglar, A., Approximate primal solutions and rate analysis for dual subgradient methods, SIAM Journal on Optimization, 19, 4, 1757-1780 (2009) · Zbl 1191.90037
[34] Nedic, A.; Ozdaglar, A., Distributed subgradient methods for multi-agent optimization, IEEE Transactions on Automatic Control, 54, 1, 48-61 (2009) · Zbl 1367.90086
[35] Nesterov, Y., Smooth minimization of non-smooth functions, Mathematical Programming, 103, 1, 127-152 (2005) · Zbl 1079.90102
[36] Newman, M. E.J., Fast algorithm for detecting community structure in networks, Physical Review E, 69, 6, 066133 (2004)
[37] Newman, M. E.J., Modularity and community structure in networks, Proceedings of the National Academy of Sciences, 103, 23, 8577-8582 (2006)
[38] Newman, M. E.J.; Girvan, M., Finding and evaluating community structure in networks, Physical Review E, 69, 2, 026113 (2004)
[39] Nowak, R. D., Distributed em algorithms for density estimation and clustering in sensor networks, IEEE Transactions on Signal Processing, 51, 8, 2245-2253 (2003)
[40] Palomar, D. P.; Mung, C., A tutorial on decomposition methods for network utility maximization, IEEE Journal on Selected Areas in Communications, 24, 8, 1439-1451 (2006)
[41] Parikh, N.; Boyd, S. P., Block splitting for distributed optimization, Mathematical Programming Computation, 6, 1, 77-102 (2014) · Zbl 1305.90291
[42] Raffard, R. L.; Tomlin, C. J.; Boyd, S. P., Distributed optimization for cooperative agents: Application to formation flight, Proceedings of the 43rd IEEE conference on decision and control, 2004 CDC, vol. 3, 2453-2459 (2004), IEEE
[43] Rehfeldt, D.; Hobbie, H.; Schönheit, D.; Koch, T.; Möst, D.; Gleixner, A., A massively parallel interior-point solver for LPs with generalized arrowhead structure, and applications to energy system models, European Journal of Operational Research, in press (2021)
[44] Richtárik, P.; Takáč, M., Parallel coordinate descent methods for big data optimization, Mathematical Programming, 156, 1-2, 433-484 (2016) · Zbl 1342.90102
[45] Rodriguez, A.; Laio, A., Clustering by fast search and find of density peaks, Science, 344, 6191, 1492-1496 (2014)
[46] Shastri, Y.; Hansen, A.; Rodríguez, L.; Ting, K. C., A novel decomposition and distributed computing approach for the solution of large scale optimization models, Computers and Electronics in Agriculture, 76, 1, 69-79 (2011)
[47] Simonetto, A.; Jamali-Rad, H., Primal recovery from consensus-based dual decomposition for distributed convex optimization, Journal of Optimization Theory and Applications, 168, 1, 172-197 (2016) · Zbl 1332.90203
[48] Terelius, H.; Topcu, U.; Murray, R. M., Decentralized multi-agent optimization via dual decomposition, IFAC Proceedings Volumes, 44, 1, 11245-11251 (2011)
[49] Wolfe, J.; Haghighi, A.; Klein, D., Fully distributed em for very large datasets, Proceedings of the 25th international conference on machine learning, 1184-1191 (2008), ACM
[50] Wright, S. J., Coordinate descent algorithms, Mathematical Programming, 151, 1, 3-34 (2015) · Zbl 1317.49038
[51] Xiao, L.; Johansson, M.; Boyd, S. P., Simultaneous routing and resource allocation via dual decomposition, IEEE Transactions on Communications, 52, 7, 1136-1144 (2004)
[52] Xu, D.; Tian, Y., A comprehensive survey of clustering algorithms, Annals of Data Science, 2, 2, 165-193 (2015)
[53] Xu, R.; Wunsch, D., Survey of clustering algorithms, IEEE Transactions on Neural Networks, 16, 3, 645-678 (2005)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.