Reducing communication costs for sparse matrix multiplication within algebraic multigrid. (English) Zbl 1339.65058


65F30 Other matrix algorithms (MSC2010)
65F50 Computational methods for sparse matrices
65N55 Multigrid methods; domain decomposition for boundary value problems involving PDEs
65F10 Iterative numerical methods for linear systems
Full Text: DOI


[1] K. Akbudak and C. Aykanat, Simultaneous input and output matrix partitioning for outer-product–parallel sparse matrix-matrix multiplication, SIAM J. Sci. Comput., 36 (2014), pp. C568–C590, http://dx.doi.org/10.1137/13092589X. · Zbl 1307.65050
[2] C. G. Baker and M. A. Heroux, Tpetra, and the use of generic programming in scientific computing, Sci. Program., 20 (2012), pp. 115–128, http://dx.doi.org/10.1155/2012/693861.
[3] S. Balay, S. Abhyankar, M. F. Adams, J. Brown, P. Brune, K. Buschelman, L. Dalcin, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. Knepley, L. C. McInnes, K. Rupp, B. F. Smith, S. Zampini, and H. Zhang, PETSc Web Page, http://www.mcs.anl.gov/petsc (2016).
[4] G. Ballard, A. Buluç, J. Demmel, L. Grigori, B. Lipshitz, O. Schwartz, and S. Toledo, Communication optimal parallel multiplication of sparse random matrices, in Proceedings of the 25th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA ’13), ACM, New York, 2013, pp. 222–231, http://dx.doi.org/10.1145/2486159.2486196.
[5] G. Ballard, A. Druinsky, N. Knight, and O. Schwartz, Brief announcement: Hypergraph partitioning for parallel sparse matrix-matrix multiplication, in Proceedings of the 27th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA ’15), ACM, New York, 2015, pp. 86–88, http://doi.acm.org/10.1145/2755573.2755613.
[6] N. Bell, S. Dalton, and L. N. Olson, Exposing fine-grained parallelism in algebraic multigrid methods, SIAM J. Sci. Comput., 34 (2012), pp. C123–C152, http://dx.doi.org/10.1137/110838844. · Zbl 1253.65041
[7] U. Borštnik, J. VandeVondele, V. Weber, and J. Hutter, Sparse matrix multiplication: The distributed block-compressed sparse row library, Parallel Comput., 40 (2014), pp. 47–58, http://dx.doi.org/10.1016/j.parco.2014.03.012.
[8] A. Brandt and O. E. Livne, Multigrid Techniques: 1984 Guide with Applications to Fluid Dynamics, SIAM, Philadelphia, 2011, http://dx.doi.org/10.1137/1.9781611970753. · Zbl 1227.65121
[9] W. L. Briggs, V. E. Henson, and S. F. McCormick, A Multigrid Tutorial, 2nd ed., SIAM, Philadelphia, 2000.
[10] A. Buluç and J. R. Gilbert, Parallel sparse matrix-matrix multiplication and indexing: Implementation and experiments, SIAM J. Sci. Comput., 34 (2012), pp. C170–C191, http://dx.doi.org/10.1137/110848244.
[11] L. Cannon, A Cellular Computer to Implement the Kalman Filter Algorithm, Ph.D. thesis, Montana State University, Bozeman, MN, 1969.
[12] M. Challacombe, A general parallel sparse-blocked matrix multiply for linear scaling SCF theory, Comput. Phys. Comm., 128 (2000), pp. 93–107, http://dx.doi.org/10.1016/S0010-4655(00)00074-6. · Zbl 1003.81582
[13] S. Dalton, L. Olson, and N. Bell, Optimizing sparse matrix-matrix multiplication for the GPU, ACM Trans. Math. Software, 41 (2015), 25, http://doi.acm.org/10.1145/2699470. · Zbl 1347.65085
[14] R. Falgout, T. Kolev, U. M. Yang, J. Schroder, and P. Vassilevski, Hypre Web Page, http://computation.llnl.gov/project/linear_solvers/software.php (2015).
[15] M. Gee, C. Siefert, J. Hu, R. Tuminaro, and M. Sala, ML 5.0 Smoothed Aggregation User’s Guide, Technical Report SAND2006-2649, Sandia National Laboratories, Albuquerque, NM, 2006.
[16] F. Gremse, A. Höfter, L. Schwen, F. Kiessling, and U. Naumann, GPU-accelerated sparse matrix-matrix multiplication by iterative row merging, SIAM J. Sci. Comput., 37 (2015), pp. C54–C71, http://dx.doi.org/10.1137/130948811. · Zbl 1327.65090
[17] W. Hackbusch, Multigrid Methods and Applications, Springer Ser. Comput. Math. 4, Springer-Verlag, Berlin, 1985.
[18] M. A. Heroux, R. A. Bartlett, V. E. Howle, R. J. Hoekstra, J. J. Hu, T. G. Kolda, R. B. Lehoucq, K. R. Long, R. P. Pawlowski, E. T. Phipps, A. G. Salinger, H. K. Thornquist, R. S. Tuminaro, J. M. Willenbring, A. Williams, and K. S. Stanley, An overview of the Trilinos project, ACM Trans. Math. Software, 31 (2005), pp. 397–423, http://doi.acm.org/10.1145/1089014.1089021. · Zbl 1136.65354
[19] M. A. Heroux and J. M. Willenbring, A new overview of the Trilinos project, Sci. Program., 20 (2012), pp. 83–88, http://dx.doi.org/10.1155/2012/408130.
[20] P. Lin, M. Bettencourt, S. Domino, T. Fisher, M. Hoemmen, J. Hu, E. Phipps, A. Prokopenko, S. Rajamanickam, C. Siefert, and S. Kennon, Towards extreme-scale simulations for low Mach fluids with second-generation Trilinos, Parallel Process. Lett., 24 (2014), 1442005, http://dx.doi.org/10.1142/S0129626414420055.
[21] M. McCourt, B. Smith, and H. Zhang, Sparse matrix-matrix products executed through coloring, SIAM J. Matrix Anal. Appl., 36 (2015), pp. 90–109, http://dx.doi.org/10.1137/13093426X. · Zbl 1327.65092
[22] A. Prokopenko, J. J. Hu, T. A. Wiesner, C. M. Siefert, and R. S. Tuminaro, MueLu User’s Guide 1.0, Technical Report SAND2014-18874, Sandia National Laboratories, Albuquerque, NM, 2014.
[23] U. Trottenberg, C. W. Oosterlee, and A. Schüller, Multigrid, Academic Press, San Diego, 2001.
[24] R. S. Tuminaro and C. Tong, Parallel smoothed aggregation multigrid: Aggregation strategies on massively parallel machines, in Proceedings of the ACM/IEEE 2000 Conference on Supercomputing, 2000, 5, http://doi.ieeecomputersociety.org/10.1109/SC.2000.10008.
[25] R. A. van de Geijn and J. Watts, SUMMA: Scalable universal matrix multiplication algorithm, Concurrency Pract. Exp., 9 (1997), pp. 255–274, http://dx.doi.org/10.1002/(SICI)1096-9128(199704)9:4(255::AID-CPE250)3.0.CO;2-2.
[26] P. Vaněk, J. Mandel, and M. Brezina, Algebraic multigrid based on smoothed aggregation for second and fourth order problems, Computing, 56 (1996), pp. 179–196, http://dx.doi.org/10.1007/BF02238511. · Zbl 0851.65087
[27] U. Yang, private communication, 2015.
[28] H. Zhang, private communication, 2015.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.