zbMATH — the first resource for mathematics

Beyond good partition shapes: an analysis of diffusive graph partitioning. (English) Zbl 1257.05131
Summary: In this paper we study the prevalent problem of graph partitioning by analyzing the diffusion-based partitioning heuristic Bubble-FOS/C, a key component of a practical successful graph partitioner [H. Meyerhenke, B. Monien and T. Sauerwald, J. Parallel Distrib. Comput. 69, No. 9, 750–761 (2009)].
We begin by studying the disturbed diffusion scheme FOS/C, which computes the similarity measure used in Bubble-FOS/C and is therefore the most crucial component. By relating FOS/C to random walks, we obtain precise characterizations of the behavior of FOS/C on tori and hypercubes. Besides leading to new knowledge on FOS/C (and therefore also on Bubble-FOS/C), these characterizations have been recently used for the analysis of load balancing algorithms [P. Berenbrink et al., in: Proceedings of the 22nd Annual Symposium on Discrete Algorithms, 429–439 (2011)].
We then regard Bubble-FOS/C, which has been shown in previous experiments to produce solutions with good partition shapes and other favorable properties. In this paper we prove that it computes a relaxed solution to an edge cut minimizing binary quadratic program (BQP). This result provides the first substantial theoretical insight why Bubble-FOS/C yields good experimental results in terms of graph partitioning metrics. Moreover, we show that in bisections computed by Bubble-FOS/C, at least one of the two parts is connected. Using the aforementioned relation between FOS/C and random walks, we prove that in vertex-transitive graphs both parts must be connected components.
05C70 Edge subsets with special properties (factorization, matching, partitioning, covering and packing, etc.)
05C81 Random walks on graphs
05C90 Applications of graph theory
05C85 Graph algorithms (graph-theoretic aspects)
68R10 Graph theory (including graph drawing) in computer science
Full Text: DOI
[1] Alon, N., Spencer, J.H.: The Probabilistic Method, 2nd edn. Wiley, New York (2000)
[2] Andersen, R., Chung, F.R.K., Lang, K.J.: Local graph partitioning using pagerank vectors. In: Proceedings of the 47th Annual Symposium on Foundations of Computer Science (FOCS’06), pp. 475–486 (2006)
[3] Andersen, R., Peres, Y.: Finding sparse cuts locally using evolving sets. In: Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC’09), pp. 235–244. ACM, New York (2009) · Zbl 1304.05128
[4] Andreev, K., Räcke, H.: Balanced graph partitioning. Theory Comput. Syst. 39(6), 929–939 (2006) · Zbl 1113.68069 · doi:10.1007/s00224-006-1350-7
[5] Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming. Theory and Algorithms, 2nd edn. Wiley, New York (1993) · Zbl 0774.90075
[6] Berenbrink, P., Cooper, C., Friedetzky, T., Friedrich, T., Sauerwald, T.: Randomized diffusion for indivisible loads. In: Proceedings of the 22nd Annual Symposium on Discrete Algorithms (SODA’11), pp. 429–439 (2011) · Zbl 1410.68028
[7] Biggs, N.: Algebraic Graph Theory. Cambridge University Press, Cambridge (1993) · Zbl 0805.68094
[8] Chevalier, C., Pellegrini, F.: PT-Scotch: A tool for efficient parallel graph ordering. Parallel Comput. 34(6–8), 318–331 (2008) · doi:10.1016/j.parco.2007.12.001
[9] Coifman, R.R., Lafon, S., Lee, A.B., Maggioni, M., Nadler, B., Warner, F., Zucker, S.W.: Geometric diffusions as a tool for harmonic analysis and structure definition of data. Parts I and II. Proc. Natl. Acad. Sci. USA 102(21), 7426–7437 (2005) · doi:10.1073/pnas.0500334102
[10] Cybenko, G.: Dynamic load balancing for distributed memory multiprocessors. J. Parallel Distrib. Comput. 7, 279–301 (1989) · doi:10.1016/0743-7315(89)90021-X
[11] Dhillon, I.S., Guan, Y., Kulis, B.: Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Trans. Pattern Anal. Mach. Intell. 29(11), 1944–1957 (2007) · Zbl 05341050 · doi:10.1109/TPAMI.2007.1115
[12] Diaconis, P., Graham, R.L., Morrison, J.A.: Asymptotic analysis of a random walk on a hypercube with many dimensions. Random Struct. Algorithms 1(1), 51–72 (1990) · Zbl 0723.60085 · doi:10.1002/rsa.3240010105
[13] Diekmann, R., Frommer, A., Monien, B.: Efficient schemes for nearest neighbor load balancing. Parallel Comput. 25(7), 789–812 (1999) · Zbl 0942.90023 · doi:10.1016/S0167-8191(99)00018-6
[14] Doyle, P.G., Snell, J.L.: Random Walks and Electric Networks. Math. Assoc. of America, Washington (1984) · Zbl 0583.60065
[15] Feldmann, A.E., Foschini, L.: Balanced partitions of trees and applications. In: Proceedings of the 29th International Symposium on Theoretical Aspects of Computer Science, STACS 2012, pp. 100–111 (2012) · Zbl 1245.68248
[16] Fiedler, M.: A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory. Czechoslov. Math. J. 25, 619–633 (1975) · Zbl 0437.15004 · doi:10.1007/BF01591018
[17] Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, New York (1979) · Zbl 0411.68039
[18] Godsil, C., Royle, G.: Algebraic Graph Theory. Springer, Berlin (2001) · Zbl 0968.05002
[19] Golub, G.H., Loan, C.F.V.: Matrix Computations, 3rd edn. Johns Hopkins Univ. Press, Baltimore (1996) · Zbl 0865.65009
[20] Grady, L.: Space-variant computer vision: a graph-theoretic approach. PhD thesis, Boston University, Boston, MA (2004)
[21] Grimmett, G.R., Stirzaker, D.R.: Probability and Random Processes, 3rd edn. Oxford University Press, Oxford (2001) · Zbl 1015.60002
[22] Hendrickson, B., Leland, R.: An improved spectral graph partitioning algorithm for mapping parallel computations. SIAM J. Sci. Comput. 16(2), 452–469 (1995) · Zbl 0816.68093 · doi:10.1137/0916028
[23] Karypis, G., Kumar, V.: Multilevel k-way partitioning scheme for irregular graphs. J. Parallel Distrib. Comput. 48(1), 96–129 (1998) · Zbl 0918.68073 · doi:10.1006/jpdc.1997.1404
[24] Kaufmann, H., Pape, H.: Clusteranalyse. In: Fahrmeir, L., Hamerle, A., Tutz, G. (eds.) Multivariate statistische Verfahren 2nd edn. de Gruyter, Berlin (1996)
[25] Kemeny, J.G., Snell, J.L.: Finite Markov Chains. Springer, Berlin (1976) · Zbl 0328.60035
[26] Kernighan, B.W., Lin, S.: An efficient heuristic for partitioning graphs. Bell Syst. Tech. J. 49, 291–308 (1970) · Zbl 0333.05001 · doi:10.1002/j.1538-7305.1970.tb01770.x
[27] Leighton, F.T.: Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. San Mateo, Morgan Kaufmann (1992) · Zbl 0743.68007
[28] Lloyd, S.P.: Least squares quantization in PCM. IEEE Trans. Inf. Theory 28(2), 129–136 (1982) · Zbl 0504.94015 · doi:10.1109/TIT.1982.1056489
[29] Lovász, L.: Random walks on graphs: a survey. Combinatorics 2, 1–46 (1993)
[30] Meila, M., Shi, J.: A random walks view of spectral segmentation. In: 8th International Workshop on Artificial Intelligence and Statistics (AISTATS) (2001)
[31] Meyerhenke, H.: Disturbed diffusive processes for solving partitioning problems on graphs. PhD thesis, Universität Paderborn (2008) · Zbl 1408.05003
[32] Meyerhenke, H.: Beyond good shapes: Diffusion-based graph partitioning is relaxed cut optimization. In: Proceedings of the 21st International Symposium on Algorithms and Computation (ISAAC’10), Part II. Lecture Notes in Computer Science, vol. 6507, pp. 387–398. Springer, Berlin (2010) · Zbl 1310.68167
[33] Meyerhenke, H., Monien, B., Sauerwald, T.: A new diffusion-based multilevel algorithm for computing graph partitions. J. Parallel Distrib. Comput. 69(9), 750–761 (2009) Best Paper Awards and Panel Summary: IPDPS 2008 · doi:10.1016/j.jpdc.2009.04.005
[34] Meyerhenke, H., Monien, B., Schamberger, S.: Graph partitioning and disturbed diffusion. Parallel Comput. 35(10–11), 544–569 (2009) · doi:10.1016/j.parco.2009.09.006
[35] Meyerhenke, H., Sauerwald, T.: Analyzing disturbed diffusion on networks. In: Proceedings of the 17th International Symposium on Algorithms and Computation (ISAAC’06), pp. 429–438. Springer, Berlin (2006) · Zbl 1135.68522
[36] Nadler, B., Lafon, S., Coifman, R.R., Kevrekidis, I.G.: Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators. In: Proceedings of Advances in Neural Information Processing Systems 18 (NIPS’05) (2005)
[37] Pellegrini, F.: A parallelisable multi-level banded diffusion scheme for computing balanced partitions with smooth boundaries. In: Proceedings of the 13th International Euro-Par Conference (EURO-PAR’07). Lecture Notes in Computer Science, vol. 4641, pp. 195–204. Springer, Berlin (2007)
[38] Rabani, Y., Sinclair, A., Wanka, R.: Local divergence of Markov chains and the analysis of iterative load balancing schemes. In: Proceedings of the 39th Annual Symposium on Foundations of Computer Science (FOCS’98), pp. 694–705 (1998)
[39] Räcke, H.: Optimal hierarchical decompositions for congestion minimization in networks. In: Proc. 40th Annual ACM Symposium on Theory of Computing, Victoria, British Columbia, Canada, May 17–20, 2008, pp. 255–264 (2008) · Zbl 1231.68051
[40] Saerens, M., Fouss, F., Yen, L., Dupont, P.: The principal components analysis of a graph, and its relationship to spectral clustering. In: Proceedings of the 15th European Conference on Machine Learning (ECML’04), pp. 371–383 (2004) · Zbl 1132.68589
[41] Schaeffer, S.E.: Graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007) · Zbl 1302.68237 · doi:10.1016/j.cosrev.2007.05.001
[42] Schloegel, K., Karypis, G., Kumar, V.: Graph partitioning for high performance scientific simulations. In: The Sourcebook of Parallel Computing, pp. 491–541. San Mateo, Morgan Kaufmann (2003)
[43] Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000) · Zbl 05111961 · doi:10.1109/34.868688
[44] The BlueGene/L Team: An overview of the BlueGene/L supercomputer. In: Proceedings of the 2002 ACM/IEEE Conference on Supercomputing, pp. 1–22. ACM, New York (2002)
[45] Tishby, N., Slonim, N.: Data clustering by Markovian relaxation and the information bottleneck method. In: Proceedings of Advances in Neural Information Processing Systems 13 (NIPS), pp. 640–646 (2000)
[46] Trefethen, L.N., Bau, D.: Numerical Linear Algebra. Philadelphia, SIAM (1997) · Zbl 0874.65013
[47] Trottenberg, U., Oosterlee, C.W., Schüller, A.: Multigrid. Academic Press, San Diego (2000)
[48] van Dongen, S.: Graph clustering by flow simulation. PhD thesis, University of Utrecht (2000)
[49] Walshaw, C.: The graph partitioning archive. http://staffweb.cms.gre.ac.uk/\(\sim\)c.walshaw/partition/ (2010). Last access: 31 May 2012
[50] Xu, C., Lau, F.C.M.: Load Balancing in Parallel Computers. Kluwer, Dordrecht (1997)
[51] Yen, L., Vanvyve, D., Wouters, F., Fouss, F., Verleysen, M., Saerens, M.: Clustering using a random-walk based distance measure. In: Proceedings of the 13th European Symposium on Artificial Neural Networks (ESANN’05), pp. 317–324 (2005)
[52] Zha, H., He, X., Ding, C.H.Q., Gu, M., Simon, H.D.: Spectral relaxation for k-means clustering. In: Proceedings of Advances in Neural Information Processing Systems 14 (NIPS), pp. 1057–1064. MIT Press, Cambridge (2001)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.