##
**Parallel algorithms for bipartite matching problems on distributed memory computers.**
*(English)*
Zbl 1248.65145

Summary: We present a new parallel algorithm for computing a maximum cardinality matching in a bipartite graph suitable for distributed memory computers.

The presented algorithm is based on the Push-Relabel algorithm which is known to be one of the fastest algorithms for the bipartite matching problem. Previous attempts at developing parallel implementations of it have focused on shared memory computers using only a limited number of processors.

We first present a straightforward adaptation of these shared memory algorithms to distributed memory computers. However, this is not a viable approach as it requires too much communication. We then develop our new algorithm by modifying the previous approach through a sequence of steps with the main goal being to reduce the amount of communication and to increase load balance. The first goal is achieved by changing the algorithm so that many push and relabel operations can be performed locally between communication rounds and also by selecting augmenting paths that cross processor boundaries infrequently. To achieve good load balance, we limit the speed at which global relabelings traverse the graph. In several experiments on a large number of instances, we study weak and strong scalability of our algorithm using up to 128 processors.

The algorithm can also be used to find \(\epsilon \)-approximate matchings quickly.

The presented algorithm is based on the Push-Relabel algorithm which is known to be one of the fastest algorithms for the bipartite matching problem. Previous attempts at developing parallel implementations of it have focused on shared memory computers using only a limited number of processors.

We first present a straightforward adaptation of these shared memory algorithms to distributed memory computers. However, this is not a viable approach as it requires too much communication. We then develop our new algorithm by modifying the previous approach through a sequence of steps with the main goal being to reduce the amount of communication and to increase load balance. The first goal is achieved by changing the algorithm so that many push and relabel operations can be performed locally between communication rounds and also by selecting augmenting paths that cross processor boundaries infrequently. To achieve good load balance, we limit the speed at which global relabelings traverse the graph. In several experiments on a large number of instances, we study weak and strong scalability of our algorithm using up to 128 processors.

The algorithm can also be used to find \(\epsilon \)-approximate matchings quickly.

### MSC:

65Y05 | Parallel numerical computation |

68M14 | Distributed systems |

68R10 | Graph theory (including graph drawing) in computer science |

PDF
BibTeX
XML
Cite

\textit{J. Langguth} et al., Parallel Comput. 37, No. 12, 820--845 (2011; Zbl 1248.65145)

Full Text:
DOI

### References:

[1] | Duff, I. S.: On algorithms for obtaining a maximum transversal, ACM transactions on mathematical software 7, No. 3, 315-330 (1981) |

[2] | Azad, A.; Langguth, J.; Fang, Y.; Qi, A.; Pothen, A.: Identifying rare cell populations in comparative flow cytometry, Lecture notes in computer science 6293, 162-175 (2010) |

[3] | Baxter, R. J.: Exactly solved models in statistical mechanics, (1982) · Zbl 0538.60093 |

[4] | Dias, J. R.; Milne, G. W. A.: Chemical applications of graph theory, Journal of chemical information and computer sciences 32, No. 1, 1 (1992) |

[5] | T.A. Davis, Y. Hu, The University of Florida sparse matrix collection, ACM Transactions on Mathematical Software, <http://www.cise.ufl.edu/research/sparse>, in press. · Zbl 1365.65123 |

[6] | Duff, I. S.: Algorithm 575: permutations for a zero-free diagonal [F1], ACM transactions on mathematical software 7, No. 3, 387-390 (1981) |

[7] | Edmonds, J.; Karp, R. M.: Theoretical improvements in algorithmic efficiency for network flow problems, Journal of the ACM 19, No. 2, 248-264 (1972) · Zbl 0318.90024 |

[8] | Alt, H.; Blum, N.; Mehlhorn, K.; Paul, M.: Computing a maximum cardinality matching in a bipartite graph in time \(O(n1.5m/logn)\), Information processing letters 37, No. 4, 237-240 (1991) · Zbl 0714.68036 |

[9] | Hopcroft, J. E.; Karp, R. M.: An n5/2 algorithm for maximum matchings in bipartite graphs, SIAM journal on computing 2, No. 4, 225-231 (1973) · Zbl 0266.05114 |

[10] | A.V. Goldberg, R.E. Tarjan, A new approach to the maximum flow problem, in: proceedings of the 18th Annual ACM Symposium on Theory of Computing, 1986, pp. 136 – 146. |

[11] | B.V. Cherkassky, A.V. Goldberg, P. Martin, J.C. Setubal, J. Stolfi, Augment or push: A computational study of bipartite matching and unit-capacity flow algorithms, ACM Journal of Experimental Algorithmics 3 (1999). · Zbl 1073.68903 |

[12] | I.S. Duff, K. Kaya, B. Uçar, Design, implementation, and analysis of maximum transversal algorithms, Tech. Rep. TR/PA/10/76, CERFACS, Toulouse, France, 2010. URL <http://www.cerfacs.fr/algor/reports/2010/TR_PA_10_76.pdf>. |

[13] | K. Kaya, J. Langguth, F. Manne, B. Uçar, Experiments on push-relabel based maximum cardinality matching algorithms for bipartite graphs, Tech.Rep.TR/PA/11/33, CERFACS, Toulouse, France, 2011. URL<http://www.cerfacs.fr/algor/reports/2011/TR_PA_11_33.pdf> |

[14] | D.A. Bader, V. Sachdeva, A cache-aware parallel implementation of the push-relabel network flow algorithm and experimental evaluation of the gap relabeling heuristic, in: proceedings of the 18th International Conference on Parallel and Distributed Computing Systems, ICPDCS 2005. |

[15] | Anderson, R.; Setubal, J. C.: A parallel implementation of the push-relabel algorithm for the maximum flow problem, Journal of parallel and distributed computing 29, No. 1, 17-26 (1995) |

[16] | J.C. Setubal, New experimental results for bipartite matching, in: proceedings of Network Optimization, Theory and Practice, NETFLOW 1993. |

[17] | L. Bus, P. Tvrdík, Distributed memory auction algorithms for the linear assignment problem, in: IASTED Parallel and Distributed Computing and Systems, IDCS 2001, 2002, pp. 137 – 142. |

[18] | J. Riedy, Making static pivoting scalable and dependable, Ph.D. thesis, EECS Department, University of California, Berkeley (Dec 2010). |

[19] | Schenk, O.; Manguoglu, M.; Sameh, A.; Christen, M.; Sathe, M.: Parallel scalable PDE-constrained optimization: antenna identification in hyperthermia cancer treatment planning, Computer science – research and development 23, 177-183 (2009) |

[20] | M. Manguoglu, A.H. Sameh, O. Schenk, PSPIKE: A parallel hybrid sparse linear system solver, in: proceedings of the 15th International European Conference on Parallel Processing, Euro-Par 2009. |

[21] | F. Manne, R.H. Bisseling, A parallel approximation algorithm for the weighted maximum matching problem, in: proceedings of the 7th International Conference on Parallel Processing and Applied Mathematics, vol. 4967 of PPAM 2007, Springer Berlin/Heidelberg, 2007, pp. 708 – 717. |

[22] | Ümit V. Çatalyürek, F. Dobrian, A. Gebremedhin, M. Halappanavar, A. Pothen, Distributed-memory parallel algorithms for matching and coloring, in: proceedings of IPDPS Workshop on Parallel Computing and Optimization, PCO 2011, 2011, pp. 1966 – 1975. |

[23] | Vastenhouw, B.; Bisseling, R. H.: A two-dimensional data distribution method for parallel sparse matrix-vector multiplication, SIAM review 47, No. 1, 67-95 (2005) · Zbl 1083.65044 |

[24] | Ümit V. Çatalyürek, E. Boman, K. Devine, D. Bozdağ, R. Heaphy, L. Riesen, Hypergraph-based dynamic load balancing for adaptive scientific computations, in: proceedings of the 21st International Parallel and Distributed Processing Symposium, IPDPS 2007, IEEE, 2007. |

[25] | M.M.A. Patwary, R.H. Bisseling, F. Manne, Parallel greedy graph matching using an edge partitioning approach, in: proceedings of the 4th ACM SIGPLAN Workshop on High-level Parallel Programming and Applications, HLPP 2010, 2010, pp. 45 – 54. |

[26] | Langguth, J.; Manne, F.; Sanders, P.: Heuristic initialization for bipartite matching problems, ACM journal of experimental algorithmics 15, 1.3:1-1.3:22 (2010) · Zbl 1284.68525 |

[27] | Hill, J. M. D.; Mccoll, B.; Stefanescu, D. C.; Goudreau, M. W.; Lang, K.; Rao, S. B.; Suel, T.; Tsantilas, T.; Bisseling, R. H.: Bsplib: the BSP programming library, Parallel computing 24, No. 14, 1947-1980 (1998) |

[28] | Korte, B. H.; Vygen, J.: Combinatorial optimization: theory and algorithms, (2006) · Zbl 1099.90054 |

[29] | Bisseling, R. H.: Parallel scientific computation: A structured approach using BSP and MPI, (2004) · Zbl 1059.65133 |

[30] | R.M. Karp, M. Sipser, Maximum matchings in sparse random graphs, in: proceedings of the 22nd Annual Symposium on Foundations of Computer Science, FOCS ’81, IEEE, 1981, pp. 364 – 375. |

[31] | J. Leskovec, Stanford network analysis platform. <http://snap.stanford.edu>, July 2009. |

[32] | J. Magun, Greedy matching algorithms, an experimental study, ACM Journal on Experimental Algorithmics 3 (1997). · Zbl 1073.68905 |

[33] | Feder, T.; Motwani, R.: Clique partitions, graph compression and speeding-up algorithms, Journal of computer and system sciences 51, No. 2, 261-272 (1995) · Zbl 0831.68073 |

[34] | M. Löhnertz, Algorithmen für Matchingprobleme in speziellen Graphklassen, Ph.D. thesis, Universität Bonn, 2010. |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.