A matheuristic for large-scale capacitated clustering. (English) Zbl 07485335

Summary: Clustering addresses the problem of assigning similar objects to groups. Since the size of the clusters is often constrained in practical clustering applications, various capacitated clustering problems have received increasing attention. We consider here the capacitated \(p\)-median problem (CPMP) in which \(p\) objects are selected as cluster centers (medians) such that the total distance from these medians to their assigned objects is minimized. Each object is associated with a weight, and the total weight in each cluster must not exceed a given capacity. Numerous exact and heuristic solution approaches have been proposed for the CPMP. The state-of-the-art approach performs well for instances with up to 5,000 objects but becomes computationally expensive for instances with a much larger number of objects. We propose a matheuristic with new problem decomposition strategies that can deal with instances comprising up to 500,000 objects. In a computational experiment, the proposed matheuristic consistently outperformed the state-of-the-art approach on medium- and large-scale instances while having similar performance for small-scale instances. As an extension, we show that our matheuristic can be applied to related capacitated clustering problems, such as the capacitated centered clustering problem (CCCP). For several test instances of the CCCP, our matheuristic found new best-known solutions.


90Bxx Operations research and management science


MNIST; AP-library
Full Text: DOI


[1] Ahmadi, S.; Osman, I. H., Density based problem space search for the capacitated clustering \(p\)-median problem, Ann. Oper. Res., 131, 1-4, 21-43 (2004) · Zbl 1067.90125
[2] Ahmadi, S.; Osman, I. H., Greedy random adaptive memory programming search for the capacitated clustering problem, European J. Oper. Res., 162, 1, 30-44 (2005) · Zbl 1132.90363
[3] Arthur, D., Vassilvitskii, S., \(2007. k\)-means \(+ +\): The advantages of careful seeding. In: Gabow, H. (Ed.), Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, Philadelphia, Pennsylvania USA, pp. 1027-1035. · Zbl 1302.68273
[4] Avella, P.; Boccia, M.; Salerno, S.; Vasilyev, I., An aggregation heuristic for large scale \(p\)-median problem, Comput. Oper. Res., 39, 7, 1625-1632 (2012) · Zbl 1251.90234
[5] Bachem, O., Lucic, M., Hassani, S.H., Krause, A., 2016. Approximate \(k\)-means \(+ +\) in sublinear time. In: Schuurmans, D., Wellman, M. (Ed.), Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, Arizona USA, pp. 1459-1467.
[6] Baldacci, R.; Hadjiconstantinou, E.; Maniezzo, V.; Mingozzi, A., A new method for solving capacitated location problems based on a set partitioning approach, Comput. Oper. Res., 29, 4, 365-386 (2002) · Zbl 0994.90087
[7] Baumann, P., 2019. A binary linear programming-based \(K\)-means approach for the capacitated centered clustering problem. In: Wang, M., Li, J., Tsung, F., Yeung, A. (Eds.), 2019 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), Macau, pp. 382-365.
[8] Baumann, P., 2020. A binary linear programming-based \(K\)-means algorithm for clustering with must-link and cannot-link constraints. In: 2020 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), pp. 324-328.
[9] Bentley, J. L., Multidimensional binary search trees used for associative searching, Commun. ACM, 18, 9, 509-517 (1975) · Zbl 0306.68061
[10] Boccia, M.; Sforza, A.; Sterle, C.; Vasilyev, I., A cut and branch approach for the capacitated \(p\)-median problem based on Fenchel cutting planes, J. Math. Model. Algorithms, 7, 1, 43-58 (2008) · Zbl 1170.90521
[11] Brimberg, J.; Mladenović, N.; Todosijević, R.; Urošević, D., Solving the capacitated clustering problem with variable neighborhood search, Ann. Oper. Res., 272, 1-2, 289-321 (2019) · Zbl 1409.90155
[12] Carrizosa, E.; Guerrero, V.; Morales, D. R., On mathematical optimization for the visualization of frequencies and adjacencies as rectangular maps, European J. Oper. Res., 265, 1, 290-302 (2018) · Zbl 1374.90271
[13] Ceselli, A.; Righini, G., A branch-and-price algorithm for the capacitated \(p\)-median problem, Netw.: Int. J., 45, 3, 125-142 (2005) · Zbl 1101.68722
[14] Chaves, A. A.; de Assis Correa, F.; Lorena, L. A.N., Clustering search heuristic for the capacitated \(p\)-median problem, (Corchado, E.; Corchado, J.; Abraham, A., Innovations in Hybrid Intelligent Systems (2007), Springer), 136-143
[15] Chaves, A. A.; Gonçalves, J. F.; Lorena, L. A.N., Adaptive biased random-key genetic algorithm with local search for the capacitated centered clustering problem, Comput. Ind. Eng., 124, 331-346 (2018)
[16] Deng, Y.; Bard, J. F., A reactive GRASP with path relinking for capacitated clustering, J. Heuristics, 17, 2, 119-152 (2011) · Zbl 1211.90301
[17] Díaz, J. A.; Fernandez, E., Hybrid scatter search and path relinking for the capacitated \(p\)-median problem, European J. Oper. Res., 169, 2, 570-585 (2006) · Zbl 1079.90076
[18] El-Alfy, E.-S. M., Applications of genetic algorithms to optimal multilevel design of MPLS-based networks, Comput. Commun., 30, 9, 2010-2020 (2007)
[19] Erkut, E.; Bozkaya, B., Analysis of aggregation errors for the \(p\)-median problem, Comput. Oper. Res., 26, 10-11, 1075-1096 (1999) · Zbl 0940.90055
[20] Espejo, I.; Puerto, J.; Rodríguez-Chía, A., A comparative study of different formulations for the capacitated discrete ordered median problem, Comput. Oper. Res., 125, Article 105067 pp. (2021) · Zbl 1458.90429
[21] Fleszar, K.; Hindi, K. S., An effective VNS for the capacitated \(p\)-median problem, European J. Oper. Res., 191, 3, 612-622 (2008) · Zbl 1160.90496
[22] Gnägi, M.; Strub, O., Tracking and outperforming large stock-market indices, Omega, 90, Article 101999 pp. (2020)
[23] Hansen, P.; Brimberg, J.; Urošević, D.; Mladenović, N., Solving large \(p\)-median clustering problems by primal-dual variable neighborhood search, Data Min. Knowl. Discov., 19, 3, 351-375 (2009)
[24] Jánošíková, L.; Herda, M.; Haviar, M., Hybrid genetic algorithms with selective crossover for the capacitated \(p\)-median problem, CEJOR Cent. Eur. J. Oper. Res., 25, 3, 651-664 (2017) · Zbl 1397.90413
[25] Kather, J. N.; Weis, C.-A.; Bianconi, F.; Melchers, S. M.; Schad, L. R.; Gaiser, T.; Marx, A.; Zöllner, F. G., Multi-class texture analysis in colorectal cancer histology, Sci. Rep., 6, 27988 (2016)
[26] KDD cup 2004, protein homology dataset (2004), Website. https://www.kdd.org/kdd-cup/view/kdd-cup-2004/Data. (Accessed: 2020-02-27)
[27] Koskosidis, Y. A.; Powell, W. B., Clustering algorithms for consolidation of customer orders into vehicle shipments, Transp. Res. B, 26, 5, 365-379 (1992)
[28] Kramer, R.; Iori, M.; Vidal, T., Mathematical models and search algorithms for the capacitated \(p\)-center problem, INFORMS J. Comput. (2019), To appear
[29] Landa-Torres, I.; Del Ser, J.; Salcedo-Sanz, S.; Gil-Lopez, S.; Portilla-Figueras, J. A.; Alonso-Garrido, O., A comparative study of two hybrid grouping evolutionary techniques for the capacitated \(p\)-median problem, Comput. Oper. Res., 39, 9, 2214-2222 (2012) · Zbl 1251.90329
[30] LeCun, Y.; Cortes, C.; Burges, C. J.C., The MNIST database of handwritten digits (1998), Website. http://yann.lecun.com/exdb/mnist/index.html. (Accessed: 2019-12-20)
[31] Lorena, L. A.N.; Senne, E. L.F., Local search heuristics for capacitated \(p\)-median problems, Netw. Spat. Econ., 3, 4, 407-419 (2003)
[32] Lorena, L. A.; Senne, E. L., A column generation approach to capacitated \(p\)-median problems, Comput. Oper. Res., 31, 6, 863-876 (2004) · Zbl 1048.90132
[33] Mai, F.; Fry, M. J.; Ohlmann, J. W., Model-based capacitated clustering with posterior regularization, European J. Oper. Res., 271, 2, 594-605 (2018) · Zbl 1403.90519
[34] Maniezzo, V.; Mingozzi, A.; Baldacci, R., A bionomic approach to the capacitated \(p\)-median problem, J. Heuristics, 4, 3, 263-280 (1998) · Zbl 0913.90201
[35] Medaglia, A. L.; Villegas, J. G.; Rodríguez-Coca, D. M., Hybrid biobjective evolutionary algorithms for the design of a hospital waste management network, J. Heuristics, 15, 2, 153 (2009) · Zbl 1176.90662
[36] Mulvey, J. M.; Beck, M. P., Solving capacitated clustering problems, European J. Oper. Res., 18, 3, 339-348 (1984) · Zbl 0547.62039
[37] Negreiros, M.; Palhano, A., The capacitated centred clustering problem, Comput. Oper. Res., 33, 6, 1639-1663 (2006) · Zbl 1087.90088
[38] Osman, I. H.; Ahmadi, S., Guided construction search metaheuristics for the capacitated \(p\)-median problem with single source constraint, J. Oper. Res. Soc., 58, 1, 100-114 (2007) · Zbl 1152.90509
[39] Osman, I. H.; Christofides, N., Capacitated clustering problems by hybrid simulated annealing and tabu search, Int. Trans. Oper. Res., 1, 3, 317-336 (1994) · Zbl 0857.90107
[40] Pirkul, H., Efficient algorithms for the capacitated concentrator location problem, Comput. Oper. Res., 14, 3, 197-208 (1987) · Zbl 0617.90019
[41] Puerto, J.; Rodríguez-Madrena, M.; Scozzari, A., Clustering and portfolio selection problems: A unified framework, Comput. Oper. Res., 117, Article 104891 pp. (2020) · Zbl 1458.91197
[42] Reinelt, G., TSPLIB (1991), Website. http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/index.html. (Accessed: 2019-12-20)
[43] Ríos-Mercado, R. Z.; Álvarez Socarrás, A. M.; Castrillón, A.; López-Locés, M. C., A location-allocation-improvement heuristic for districting with multiple-activity balancing constraints and \(p\)-median-based dispersion minimization, Comput. Oper. Res., 126, Article 105106 pp. (2021) · Zbl 07350510
[44] Rohe, A., VLSI collection (2013), Website. http://www.math.uwaterloo.ca/tsp/vlsi/index.html. (Accessed: 2019-12-20)
[45] Scheuerer, S.; Wendolsky, R., A scatter search heuristic for the capacitated clustering problem, European J. Oper. Res., 169, 2, 533-547 (2006) · Zbl 1079.90180
[46] Senne, E. L.; Lorena, L. A., LagrangeAn/surrogate heuristics for \(p\)-median problems, (Laguna, M.; Gonzalez-Velarde, J., Computing Tools for Modeling, Optimization and Simulation (2000), Springer), 115-130
[47] Stefanello, F.; de Araújo, O. C.; Müller, F. M., Matheuristics for the capacitated \(p\)-median problem, Int. Trans. Oper. Res., 22, 1, 149-167 (2015) · Zbl 1309.90060
[48] Ushakov, A. V.; Vasilyev, I., A computational comparison of parallel and distributed \(K\)-median clustering algorithms on large-scale image data, (Bykadorov, I.; Strusevich, V.; Tchemisova, T., International Conference on Mathematical Optimization Theory and Operations Research (2019)), 119-130
[49] Yaghini, M.; Karimi, M.; Rahbar, M., A hybrid metaheuristic approach for the capacitated \(p\)-median problem, Appl. Soft Comput., 13, 9, 3922-3930 (2013)
[50] Zhou, Q.; Benlic, U.; Wu, Q.; Hao, J.-K., Heuristic search to the capacitated clustering problem, European J. Oper. Res., 273, 2, 464-487 (2019) · Zbl 1403.90661
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.