×

KSPHPDDM and PCHPDDM: extending PETSc with advanced Krylov methods and robust multilevel overlapping Schwarz preconditioners. (English) Zbl 1524.65135

Summary: Contemporary applications in computational science and engineering often require the solution of linear systems which may be of different sizes, shapes, and structures. The goal of this paper is to explain how two libraries, PETSc and HPDDM, have been interfaced in order to offer end-users robust overlapping Schwarz preconditioners and advanced Krylov methods featuring recycling and the ability to deal with multiple right-hand sides. The flexibility of the implementation is showcased and explained with minimalist, easy-to-run, and reproducible examples, to ease the integration of these algorithms into more advanced frameworks. The examples provided cover applications from eigenanalysis, elasticity, combustion, and electromagnetism.

MSC:

65F10 Iterative numerical methods for linear systems
65N55 Multigrid methods; domain decomposition for boundary value problems involving PDEs
65Y05 Parallel numerical computation
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Bartlett, R.; Demeshko, I.; Gamblin, T.; Hammond, G.; Heroux, M. A.; Johnson, J.; Klinvex, A.; Li, X. S.; McInnes, L. C.; Moulton, J. D.; Osei-Kuffuor, D.; Sarich, J.; Smith, B. F.; Willenbring, J.; Yang, U. M., XSDK foundations: Toward an extreme-scale scientific software development kit, Supercomput. Front. Innov., 4, 1 (2017), URL https://xsdk.info
[2] Balay, S.; Abhyankar, S.; Adams, M. F.; Brown, J.; Brune, P.; Buschelman, K.; Dalcin, L.; Dener, A.; Eijkhout, V.; Gropp, W. D.; Karpeyev, D.; Kaushik, D.; Knepley, M. G.; May, D. A.; McInnes, L. C.; Mills, R. T.; Munson, T.; Rupp, K.; Sanan, P.; Smith, B. F.; Zampini, S.; Zhang, H.; Zhang, H., PETSc web page (2020), URL http://www.mcs.anl.gov/petsc
[3] Balay, S.; Abhyankar, S.; Adams, M. F.; Brown, J.; Brune, P.; Buschelman, K.; Dalcin, L.; Dener, A.; Eijkhout, V.; Gropp, W. D.; Karpeyev, D.; Kaushik, D.; Knepley, M. G.; May, D. A.; McInnes, L. C.; Mills, R. T.; Munson, T.; Rupp, K.; Sanan, P.; Smith, B. F.; Zampini, S.; Zhang, H.; Zhang, H., PETSC Users ManualTech. Rep. ANL-95/11 - Revision 3.13 (2020), Argonne National Laboratory
[4] Falgout, R.; Yang, U. M., Hypre: A library of high performance preconditioners, (Computational Science—ICCS, 2002 (2002)), 632-641, URL https://www.llnl.gov/casc/hypre · Zbl 1056.65046
[5] Si, H., TetGen: A Quality Tetrahedral Mesh Generator and 3D DElaunay TriangulatorTech. Rep, 13 (2013), URL http://wias-berlin.de/software/tetgen
[6] Burstedde, C.; Wilcox, L. C.; Ghattas, O., : Scalable algorithms for parallel adaptive mesh refinement on forests of octrees, SIAM J. Sci. Comput., 33, 3, 1103-1133 (2011), URL http://www.p4est.org · Zbl 1230.65106
[7] Jolivet, P.; Hecht, F.; Nataf, F.; Prud’homme, C., Scalable domain decomposition preconditioners for heterogeneous elliptic problems, (Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC13 (2013), ACM)
[8] Jolivet, P.; Tournier, P.-H., Block iterative methods and recycling for improved scalability of linear solvers, (Proceedings of the 2016 International Conference for High Performance Computing, Networking, Storage and Analysis, SC16 (2016), IEEE)
[9] T. Hoefler, R. Belli, Scientific benchmarking of parallel computing systems: Twelve ways to tell the masses when reporting performance results, in: Proceedings of the 2015 International Conference for High Performance Computing, Networking, Storage and Analysis, SC15, 2015.
[10] Hernandez, V.; Roman, J. E.; Vidal, V., SLEPc: A scalable and flexible toolkit for the solution of eigenvalue problems, ACM Trans. Math. Software, 31, 3, 351-362 (2005), URL https://slepc.upv.es · Zbl 1136.65315
[11] Barrett, R.; Berry, M.; Chan, T. F.; Demmel, J.; Donato, J.; Dongarra, J.; Eijkhout, V.; Pozo, R.; Romine, C.; Van der Vorst, H., Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods (1994), SIAM
[12] Saad, Y., Iterative Methods for Sparse Linear Systems (2003), SIAM · Zbl 1002.65042
[13] Saad, Y.; Schultz, M. H., GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7, 3, 856-869 (1986) · Zbl 0599.65018
[14] Hestenes, M. R.; Stiefel, E., Methods of conjugate gradients for solving linear systems, J. Res. Nat. Bureau Standards, 49, 6, 409-436 (1952) · Zbl 0048.09901
[15] Van der Vorst, H. A., Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 13, 2, 631-644 (1992) · Zbl 0761.65023
[16] Baker, A. H.; Jessup, E. R.; Manteuffel, T., A technique for accelerating the convergence of restarted GMRES, SIAM J. Matrix Anal. Appl., 26, 4, 962-984 (2005) · Zbl 1086.65025
[17] Erhel, J.; Burrage, K.; Pohl, B., Restarted GMRES preconditioned by deflation, J. Comput. Appl. Math., 69, 2, 303-318 (1996) · Zbl 0854.65025
[18] Wakam, D. N.; Pacull, F., Memory efficient hybrid algebraic solvers for linear systems arising from compressible flows, Comput. & Fluids, 80, 158-167 (2013) · Zbl 1284.76259
[19] Saad, Y., A flexible inner-outer preconditioned GMRES algorithm, SIAM J. Sci. Comput., 14, 2, 461-469 (1993) · Zbl 0780.65022
[20] Eisenstat, S. C.; Elman, H. C.; Schultz, M. H., Variational iterative methods for nonsymmetric systems of linear equations, SIAM J. Numer. Anal., 20, 2, 345-357 (1983) · Zbl 0524.65019
[21] Parks, M. L.; de Sturler, E.; Mackey, G.; Johnson, D. D.; Maiti, S., Recycling Krylov subspaces for sequences of linear systems, SIAM J. Sci. Comput., 28, 5, 1651-1674 (2006) · Zbl 1123.65022
[22] Gutknecht, M. H., Block krylov space methods for linear systems with multiple right-hand sides: An introduction, (Siddiqui, A.; Duff, I.; Christensen, O., Modern Mathematical Models, Methods and Algorithms for Real World Systems (2006)), 420-447
[23] Frommer, A.; Lund, K.; Szyld, D. B., Block krylov subspace methods for functions of matrices, Electron. Trans. Numer. Anal., 47, 100-126 (2017) · Zbl 1372.65138
[24] Tournier, P.-H.; Aliferis, I.; Bonazzoli, M.; de Buhan, M.; Darbas, M.; Dolean, V.; Hecht, F.; Jolivet, P.; El Kanfoud, I.; Migliaccio, C.; Nataf, F.; Pichot, C.; Semenov, S., Microwave tomographic imaging of cerebrovascular accidents by using high-performance computing, Parallel Comput., 85, 88-97 (2019)
[25] Kalantzis, V.; Malossi, A. C.I.; Bekas, C.; Curioni, A.; Gallopoulos, E.; Saad, Y., A scalable iterative dense linear system solver for multiple right-hand sides in data analytics, Parallel Comput., 74, 136-153 (2018)
[26] Knyazev, A. V., Toward the optimal preconditioned eigensolver: Locally optimal block preconditioned conjugate gradient method, SIAM J. Sci. Comput., 23, 2, 517-541 (2001) · Zbl 0992.65028
[27] Calandra, H.; Gratton, S.; Langou, J.; Pinel, X.; Vasseur, X., Flexible variants of block restarted GMRES methods with application to geophysics, SIAM J. Sci. Comput., 34, 2, A714-A736 (2012) · Zbl 1248.65036
[28] Sakurai, T.; Tadano, H.; Kuramashi, Y., Application of block krylov subspace algorithms to the wilson-Dirac equation with multiple right-hand sides in lattice QCD, Comput. Phys. Comm., 181, 1, 113-117 (2010) · Zbl 1205.81131
[29] Heroux, M. A.; Bartlett, R. A.; Howle, V. E.; Hoekstra, R. J.; Hu, J. J.; Kolda, T. G.; Lehoucq, R. B.; Long, K. R.; Pawlowski, R. P.; Phipps, E. T., An overview of the trilinos project, ACM Trans. Math. Softw. (TOMS), 31, 3, 397-423 (2005), URL https://trilinos.github.io · Zbl 1136.65354
[30] Bavier, E.; Hoemmen, M.; Rajamanickam, S.; Thornquist, H., Amesos2 and Belos: Direct and iterative solvers for large sparse linear systems, Sci. Program., 20, 3, 241-255 (2012)
[31] Notay, Y., Flexible conjugate gradients, SIAM J. Sci. Comput., 22, 4, 1444-1460 (2000) · Zbl 0980.65030
[32] Carvalho, L. M.; Gratton, S.; Lago, R.; Vasseur, X., A flexible generalized conjugate residual method with inner orthogonalization and deflated restarting, SIAM J. Matrix Anal. Appl., 32, 4, 1212-1235 (2011) · Zbl 1244.65047
[33] O’Leary, D. P., The block conjugate gradient algorithm and related methods, Linear Algebra Appl., 29, 293-322 (1980) · Zbl 0426.65011
[34] Ji, H.; Li, Y., A breakdown-free block conjugate gradient method, BIT Numer. Math., 57, 2, 379-403 (2017) · Zbl 1392.65057
[35] Saad, Y.; Yeung, M.; Erhel, J.; Guyomarc’h, F., A deflated version of the conjugate gradient algorithm, SIAM J. Sci. Comput., 21, 5, 1909-1926 (2000) · Zbl 0955.65021
[36] Stathopoulos, A.; Abdel-Rehim, A.; Orginos, K., Deflation for inversion with multiple right-hand sides in QCD, (Journal of Physics: Conference Series, Vol. 180 (2009), IOP Publishing)
[37] Soodhalter, K. M.; Szyld, D. B.; Xue, F., Krylov subspace recycling for sequences of shifted linear systems, Appl. Numer. Math., 81, 105-118 (2014) · Zbl 1291.65108
[38] Amestoy, P.; Duff, I.; L’Excellent, J.-Y.; Koster, J., A fully asynchronous multifrontal solver using distributed dynamic scheduling, SIAM J. Matrix Anal. Appl., 23, 1, 15-41 (2001), URL http://mumps.enseeiht.fr · Zbl 0992.65018
[39] Stathopoulos, A.; Wu, K., A block orthogonalization procedure with constant synchronization requirements, SIAM J. Sci. Comput., 23, 6, 2165-2182 (2002) · Zbl 1018.65050
[40] Gutknecht, M. H.; Schmelzer, T., Updating the \(Q R\) decomposition of block tridiagonal and block hessenberg matrices, Appl. Numer. Math., 58, 6, 871-883 (2008) · Zbl 1143.65032
[41] NVIDIA, cuSPARSE web page (2020), URL https://docs.nvidia.com/cuda/cusparse
[42] Boukaram, W.; Turkiyyah, G.; Keyes, D., Hierarchical matrix operations on GPUs: Matrix-vector multiplication and compression, ACM Trans. Math. Software, 45 (2019) · Zbl 1471.65027
[43] Ambartsumyan, I.; Boukaram, W.; Bui-Thanh, T.; Ghattas, O.; Keyes, D.; Stadler, G.; Turkiyyah, G.; Zampini, S., Hierarchical matrix approximations of hessians arising in inverse problems governed by PDEs, SIAM J. Sci. Comput., 42, 5, A3397-A3426 (2020) · Zbl 1453.65289
[44] Davis, T. A., Algorithm 832: UMFPACK—an unsymmetric-pattern multifrontal method, ACM Trans. Math. Software, 30, 2, 196-199 (2004) · Zbl 1072.65037
[45] Intel, MKL web page (2020), URL https://software.intel.com/content/www/us/en/develop/tools/math-kernel-library.html
[46] Li, X. S., An overview of superLU: Algorithms, implementation, and user interface, ACM Trans. Math. Software, 31, 3, 302-325 (2005) · Zbl 1136.65312
[47] Li, X. S.; Demmel, J., Superlu_dist: A scalable distributed-memory sparse direct solver for unsymmetric linear systems, ACM Trans. Math. Software, 29, 2, 110-140 (2003) · Zbl 1068.90591
[48] Poulson, J.; Marker, B.; Van de Geijn, R. A.; Hammond, J. R.; Romero, N. A., Elemental: A new framework for distributed memory dense matrix computations, ACM Trans. Math. Software, 39, 2 (2013) · Zbl 1295.65137
[49] Heinecke, A.; Henry, G.; Hutchinson, M.; Pabst, H., LIBXSMM: Accelerating small matrix multiplications by runtime code generation, (Proceedings of the 2016 International Conference for High Performance Computing, Networking, Storage and Analysis, SC16 (2016), IEEE), URL https://github.com/hfp/libxsmm
[50] Hecht, F., New development in FreeFem++, J. Numer. Math., 20, 3-4, 251-266 (2012), URL http://freefem.org · Zbl 1266.68090
[51] Moulin, J.; Jolivet, P.; Marquet, O., Augmented Lagrangian preconditioner for large-scale hydrodynamic stability analysis, Comput. Methods Appl. Mech. Engrg., 351, 718-743 (2019), URL https://github.com/prj-/moulin2019al · Zbl 1441.76069
[52] Stewart, G. W., A krylov-schur algorithm for large eigenproblems, SIAM J. Matrix Anal. Appl., 23, 3, 601-614 (2002) · Zbl 1003.65045
[53] Benzi, M.; Olshanskii, M. A.; Wang, Z., Modified augmented Lagrangian preconditioners for the incompressible Navier-Stokes equations, Internat. J. Numer. Methods Fluids, 66, 4, 486-508 (2011) · Zbl 1421.76152
[54] Sakurai, T.; Sugiura, H., A projection method for generalized eigenvalue problems using numerical integration, J. Comput. Appl. Math., 159, 1, 119-128 (2003) · Zbl 1037.65040
[55] Adams, M.; Bayraktar, H. H.; Keaveny, T. M.; Papadopoulos, P., Ultrascalable implicit finite element analyses in solid mechanics with over a half a billion degrees of freedom, (Proceedings of the 2004 ACM/IEEE Conference on Supercomputing, SC04 (2004), IEEE Computer Society), 34:1-34:15
[56] Smith, B. F.; Bjørstad, P.; Gropp, W. D., Domain Decomposition: Parallel Multilevel Methods for Elliptic Partial Differential Equations (2004), Cambridge University Press
[57] Dolean, V.; Jolivet, P.; Nataf, F., An Introduction to Domain Decomposition Methods: Algorithms, Theory and Parallel Implementation (2015), SIAM · Zbl 1364.65277
[58] Toselli, A.; Widlund, O. B., (Domain Decomposition Methods: Algorithms and Theory. Domain Decomposition Methods: Algorithms and Theory, Series in Computational Mathematics, vol. 34 (2005), Springer) · Zbl 1069.65138
[59] Pechstein, C., Finite and Boundary Element Tearing and Interconnecting Solvers for Multiscale Problems, Lecture Notes in Computational Science and Engineering (2012), Springer
[60] Quarteroni, A.; Valli, A., Domain Decomposition Methods for Partial Differential Equations, Vol. 10 (1999), Clarendon Press
[61] Mathew, T., Domain Decomposition Methods for the Numerical Solution of Partial Differential Equations, Vol. 61 (2008), Springer Science & Business Media
[62] Cai, X.-C.; Sarkis, M., A restricted additive Schwarz preconditioner for general sparse linear systems, SIAM J. Sci. Comput., 21, 2, 792-797 (1999) · Zbl 0944.65031
[63] Mandel, J., Balancing domain decomposition, Commun. Numer. Methods. Eng., 9, 3, 233-241 (1993) · Zbl 0796.65126
[64] Zampini, S., PCBDDC: A class of robust dual-primal methods in PETSc, SIAM J. Sci. Comput., 38, 5, S282-S306 (2016), URL https://www.mcs.anl.gov/petsc/petsc-master/docs/manualpages/PC/PCBDDC.html · Zbl 1352.65632
[65] Pechstein, C.; Dohrmann, C. R., A unified framework for adaptive BDDC, Electron. Trans. Numer. Anal., 46, 273-336 (2017) · Zbl 1368.65043
[66] Zampini, S.; Vassilevski, P.; Dobrev, V.; Kolev, T., Balancing domain decomposition by constraints algorithms for curl-conforming spaces of arbitrary order, (International Conference on Domain Decomposition Methods (2017), Springer), 103-116 · Zbl 1442.65435
[67] Oh, D.-S.; Widlund, O. B.; Zampini, S.; Dohrmann, C., BDDC algorithms with deluxe scaling and adaptive selection of primal constraints for Raviart-Thomas vector fields, Math. Comp., 87, 310, 659-692 (2018) · Zbl 1380.65065
[68] Da Veiga, L. B.; Pavarino, L. F.; Scacchi, S.; Widlund, O. B.; Zampini, S., Isogeometric BDDC preconditioners with deluxe scaling, SIAM J. Sci. Comput., 36, 3, A1118-A1139 (2014) · Zbl 1320.65047
[69] Gee, M. W.; Siefert, C. M.; Hu, J. J.; Tuminaro, R. S.; Sala, M. G., ML 5.0 Smoothed Aggregation User’s GuideTech. Rep. SAND2006-2649 (2006), URL https://trilinos.github.io/ml.html
[70] Heinlein, A.; Klawonn, A.; Rheinbach, O., A parallel implementation of a two-level overlapping Schwarz method with energy-minimizing coarse space based on Trilinos, SIAM J. Sci. Comput., 38, 6, C713-C747 (2016) · Zbl 1355.65168
[71] Šístek, J.; Mandel, J.; Sousedík, B.; Burda, P., Parallel implementation of multilevel BDDC, (Numerical Mathematics and Advanced Applications 2011 (2013), Springer), 681-689, URL http://users.math.cas.cz/sistek/software/bddcml.html · Zbl 1311.65178
[72] Badia, S.; Martín, A. F.; Principe, J., A highly scalable parallel implementation of balancing domain decomposition by constraints, SIAM J. Sci. Comput., 36, 2, C190-C218 (2014)
[73] Gander, M. J., Optimized Schwarz methods, SIAM J. Numer. Anal., 44, 2, 699-731 (2006) · Zbl 1117.65165
[74] Gander, M. J.; Van Criekingen, S., New coarse corrections for optimized restricted additive Schwarz using PETSc, (International Conference on Domain Decomposition Methods (2019), Springer)
[75] Spillane, N.; Dolean, V.; Hauret, P.; Nataf, F.; Pechstein, C.; Scheichl, R., A robust two-level domain decomposition preconditioner for systems of PDEs, Compt. R. Math., 349, 23, 1255-1259 (2011) · Zbl 1252.65201
[76] Spillane, N.; Dolean, V.; Hauret, P.; Nataf, F.; Pechstein, C.; Scheichl, R., Abstract robust coarse spaces for systems of PDEs via generalized eigenproblems in the overlaps, Numer. Math., 126, 4 (2013), 741-770 · Zbl 1291.65109
[77] Marchand, P.; Claeys, X.; Jolivet, P.; Nataf, F.; Tournier, P.-H., Two-level preconditioning for \(h\)-version boundary element approximation of hypersingular operator with GenEO, Numer. Math., 146, 597-628 (2020) · Zbl 1451.65215
[78] Brown, J.; Knepley, M. G.; May, D. A.; Curfman McInnes, L.; Smith, B. F., Composable linear solvers for multiphysics, (2012 11th International Symposium on Parallel and Distributed Computing (2012), IEEE), 55-62
[79] Butler, R.; Dodwell, T.; Reinarz, A.; Sandhu, A.; Scheichl, R.; Seelinger, L., High-performance dune modules for solving large-scale, strongly anisotropic elliptic problems with applications to aerospace composites, Comput. Phys. Commun., 249 (2020)
[80] D.A. May, P. Sanan, K. Rupp, M.G. Knepley, B.F. Smith, Extreme-scale multigrid components within PETSc, in: Proceedings of the Platform for Advanced Scientific Computing Conference, 2016.
[81] Knepley, M. G.; Karpeev, D. A., Mesh algorithms for PDE with Sieve I: Mesh distribution, Sci. Program., 17, 3, 215-230 (2009)
[82] Tang, J.; Nabben, R.; Vuik, C.; Erlangga, Y., Comparison of two-level preconditioners derived from deflation, domain decomposition and multigrid methods, J. Sci. Comput., 39, 3, 340-370 (2009) · Zbl 1203.65073
[83] Bastian, P.; Wittum, G.; Hackbusch, W., Additive and multiplicative multi-grid—a comparison, Computing, 60, 4, 345-364 (1998) · Zbl 0908.65107
[84] Haferssas, R.; Jolivet, P.; Nataf, F., An additive Schwarz method type theory for lions’s algorithm and a symmetrized optimized restricted additive Schwarz method, J. Sci. Comput., 39, 4, A1345-A1365 (2017) · Zbl 1371.65129
[85] Conen, L.; Dolean, V.; Krause, R.; Nataf, F., A coarse space for heterogeneous Helmholtz problems based on the Dirichlet-to-Neumann operator, J. Comput. Appl. Math., 271, 83-99 (2014) · Zbl 1321.65172
[86] Al Daas, H.; Grigori, L.; Jolivet, P.; Tournier, P.-H., A multilevel Schwarz preconditioner based on a hierarchy of robust coarse spaces, J. Sci. Comput. (2019), submitted for publication. URL https://github.com/prj-/aldaas2019multi
[87] Spillane, N.; Rixen, D. J., Automatic spectral coarse spaces for robust finite element tearing and interconnecting and balanced domain decomposition algorithms, Internat. J. Numer. Methods Engrg., 95, 11, 953-990 (2013) · Zbl 1352.65553
[88] Karypis, G.; Kumar, V., A fast and high quality multilevel scheme for partitioning irregular graphs, SIAM J. Sci. Comput., 20, 1, 359-392 (1998), URL http://glaros.dtc.umn.edu/gkhome/metis/parmetis/overview · Zbl 0915.68129
[89] Anderson, R.; Andrej, J.; Barker, A.; Bramwell, J.; Camier, J.-S.; Cerveny, J.; Dobrev, V.; Dudouit, Y.; Fisher, A.; Kolev, T.; Pazner, W.; Stowell, M.; Tomov, V.; Akkerman, I.; Dahm, J.; Medina, D.; Zampini, S., MFEM: A modular finite element methods library, Comput. Math. Appl., 81, 42-74 (2021), URL http://mfem.org · Zbl 1524.65001
[90] Zhou, Y.; Saad, Y., Block Krylov-Schur method for large symmetric eigenvalue problems, Numer. Algorithms, 47, 4, 341-359 (2008) · Zbl 1153.65330
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.