×

Complex network partitioning using label propagation. (English) Zbl 1386.68214

Summary: We present PuLP (partitioning using label propagation), a parallel and memory-efficient graph partitioning method specifically designed to partition low-diameter networks with skewed degree distributions on shared-memory multicore platforms. Graph partitioning is an important problem in scientific computing because it impacts the execution time and energy efficiency of computations on distributed-memory platforms. Partitioning determines the in-memory layout of a graph, which affects locality, intertask load balance, communication time, and overall memory utilization. A novel feature of our PuLP method is that it optimizes for multiple objective metrics simultaneously, while satisfying multiple partitioning constraints. Using our method, we are able to partition a web crawl with billions of edges on a single compute server in under a minute. For a collection of test graphs, we show that PuLP uses up to \(7.8\times\) less memory than state-of-the-art partitioners and is \(5.0\times\) faster, on average, than alternate approaches (with 16-way parallelism). We also achieve better partitioning quality results for the multiobjective scenario.

MSC:

68W10 Parallel algorithms in computer science
05C70 Edge subsets with special properties (factorization, matching, partitioning, covering and packing, etc.)
05C82 Small world graphs, complex networks (graph-theoretic aspects)
05C85 Graph algorithms (graph-theoretic aspects)
68R10 Graph theory (including graph drawing) in computer science
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] D. A. Bader, H. Meyerhenke, P. Sanders, C. Schulz, A. Kappes, and D. Wagner, {\it Benchmarking for graph clustering and partitioning}, in Encyclopedia of Social Network Analysis and Mining, Springer, NY, 2014, pp. 73-82.
[2] P. Boldi, B. Codenotti, M. Santini, and S. Vigna, {\it UbiCrawler: A scalable fully distributed web crawler}, Software Practice Experience, 34 (2004), pp. 711-726.
[3] P. Boldi, M. Rosa, M. Santini, and S. Vigna, {\it Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks}, in Proceedings of the 20th International Conference on World Wide Web, ACM Press, New York, 2011.
[4] P. Boldi and S. Vigna, {\it The WebGraph framework I: Compression techniques}, in Proceedings of the 13th International World Wide Web Conference (WWW 2004), ACM Press, New York, 2004. pp. 595-601.
[5] E. G. Boman, K. D. Devine, and S. Rajamanickam, {\it Scalable matrix computations on large scale-free graphs using 2D graph partitioning}, in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, 2013.
[6] Ü. V. Catalyürek, M. Deveci, K. Kaya, and B. Uçar, {\it UMPa: A multi-objective, multi-level partitioner for communication minimization}, Contemp. Math., 588 (2013). · Zbl 1269.05080
[7] A. Ching and C. Kunz, {\it Giraph: Large-scale graph processing infrastructure on Hadoop}, in Proceedings of the Hadoop Summit, 2011, p. 2011.
[8] T. A. Davis and Y. Hu, {\it The University of Florida sparse matrix collection}, ACM Trans. Math. Software, 38 (2011), pp. 1-25. · Zbl 1365.65123
[9] C. M. Fiduccia and R. M. Mattheyses, {\it A linear-time heuristic for improving network partitions}, in Proceedings of the Conference on Design Automation, 1982.
[10] A. George and J. W. Liu, {\it Computer Solutions of Large Sparse Positive Definite} Systems, Prentice-Hall, Englewood Cliffs, NJ, 1981. · Zbl 0516.65010
[11] J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin, {\it PowerGraph: Distributed graph-parallel computation on natural graphs}, in Proceedings of the USENIX Conference on Operating Systems Design and Implementation (OSDI), 2012.
[12] Y. Guo, M. Biczak, A. L. Varbanescu, A. Iosup, C. Martella, and T. L. Willke, {\it Towards benchmarking graph-processing platforms}, in Proceedings of Supercomputing (SC), 2013.
[13] U. Kang, C. E. Tsourakakis, and C. Faloutsos, {\it PEGASUS: A peta-scale graph mining system implementation and observations}, in Proceedings of the IEEE International Conference on Data Mining (ICDM), 2009.
[14] G. Karypis and V. Kumar, {\it MeTis: A Software Package for Partitioning Unstructured Graphs, Partitioning Meshes, and Computing Fill-Reducing Orderings of Sparse Matrices. Version} 5.1.0, (2014).
[15] G. Karypis and V. Kumar, {\it Parallel multilevel K-way partitioning scheme for irregular graphs}, in Proceedings of the ACM/IEEE Conference on Supercomputing, 1996. · Zbl 0918.68073
[16] G. Karypis and V. Kumar, {\it A fast and high quality multilevel scheme for partitioning irregular graphs}, SIAM J. Sci. Comput., 20 (1998), pp. 359-392. · Zbl 0915.68129
[17] G. Karypis and V. Kumar, {\it Multilevel algorithms for multi-constraint graph partitioning}, in Proceedings of the ACM/IEEE Conference on Supercomputing, 1998.
[18] J. Kunegis, {\it KONECT — The Koblenz Network Collection}, (2014).
[19] J. Leskovec, K. Lang, A. Dasgupta, and M. Mahoney, {\it Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters}, Internet Math., 6 (2009), pp. 29-123. · Zbl 1205.91144
[20] C. Martella, D. Logothetis, A. Loukas, and G. Siganos, {\it Spinner: Scalable Graph Partitioning in the Cloud}, preprint, arXiv:1404.3861, 2014.
[21] H. Meyerhenke, P. Sanders, and C. Schulz, {\it Partitioning complex networks via size-constrained clustering}, in Proceedings of Experimental Algorithms — 13th International Symposium, Copenhagen, Denmark, 2014, pp. 351-363. · Zbl 1360.90305
[22] H. Meyerhenke, P. Sanders, and C. Schulz, {\it Parallel graph partitioning for complex networks}, in Proceedings of the IEEE International Parallel and Distributed Processing Symposium, IPDPS 2015, Hyderabad, India, 2015, pp. 1055-1064.
[23] R. Pearce, M. Gokhale, and N. M. Amato, {\it Scaling techniques for massive scale-free graphs in distributed (external) memory}, in Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2013.
[24] A. Pinar and B. Hendrickson, {\it Partitioning for complex objectives}, in Proceedings of the 15th International Parallel and Distributed Processing Symposium, IPDPS ’01, Washington, DC, 2001.
[25] L. Quick, P. Wilkinson, and D. Hardcastle, {\it Using Pregel-like large scale graph processing frameworks for social network analysis}, in Proceedings of the International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2012.
[26] U. N. Raghavan, R. Albert, and S. Kumara, {\it Near linear time algorithm to detect community structures in large-scale networks}, Phys. Rev. E, 76 (2007), 036106.
[27] K. Schloegel, G. Karypis, and V. Kumar, {\it Parallel multilevel algorithms for multi-constraint graph partitioning}, in Proceedings of Euro-Par 2000, International European Conference on Parallel and Distributed Computing, 2000. · Zbl 1012.68146
[28] B. Shao, H. Wang, and Y. Li, {\it Trinity: A distributed graph engine on a memory cloud}, in Proceedings of the ACM International Conference on Management of Data (SIGMOD), 2013.
[29] G. M. Slota, K. Madduri, and S. Rajamanickam, {\it PULP: Scalable multi-objective multi-constraint partitioning for small-world networks}, in Proceedings of the IEEE Conference on Big Data (BigData 2014), 2014.
[30] G. M. Slota, S. Rajamanickam, and K. Madduri, {\it BFS and coloring-based parallel algorithms for strongly connected components and related problems}, in Proceedings of the IEEE International Parallel and Distributed Processing Symposiom (IPDPS), 2014.
[32] B. Uçar and C. Aykanat, {\it Encapsulating multiple communication-cost metrics in partitioning sparse rectangular matrices for parallel matrix-vector multiplies}, SIAM J. Sci. Comput., 25 (2004), pp. 1837-1859. · Zbl 1061.65036
[33] J. Ugander and L. Backstrom, {\it Balanced label propagation for partitioning massive graphs}, in Proceedings of the Web Search and Data Mining (WSDM), 2013.
[34] L. Vaquero, F. Cuadrado, D. Logothetis, and C. Martella, {\it xDGP: A Dynamic Graph Processing System with Adaptive Partitioning}, CoRR abs/1309.1049, 2013.
[35] L. Wang, Y. Xiao, B. Shao, and H. Wang, {\it How to partition a billion-node graph}, in Proceedings of the IEEE 30th International Conference on Data Engineering, 2014, pp. 568-579.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.