zbMATH — the first resource for mathematics

High level implementation of geometric multigrid solvers for finite element problems: applications in atmospheric modelling. (English) Zbl 1422.65440
Summary: The implementation of efficient multigrid preconditioners for elliptic partial differential equations (PDEs) is a challenge due to the complexity of the resulting algorithms and corresponding computer code. For sophisticated (mixed) finite element discretisations on unstructured grids, an efficient implementation can be very time consuming and requires the programmer to have in-depth knowledge of the mathematical theory, parallel computing and optimisation techniques on manycore CPUs. In this paper, we show how the development of bespoke multigrid preconditioners can be simplified significantly by using a framework which allows the expression of the each component of the algorithm at the correct abstraction level. Our approach (1) allows the expression of the finite element problem in a language which is close to the mathematical formulation of the problem, (2) guarantees the automatic generation and efficient execution of parallel optimised low-level computer code and (3) is flexible enough to support different abstraction levels and give the programmer control over details of the preconditioner. We use the composable abstractions of the Firedrake/PyOP2 package to demonstrate the efficiency of this approach for the solution of strongly anisotropic PDEs in atmospheric modelling. The weak formulation of the PDE is expressed in unified form language (UFL) and the lower PyOP2 abstraction layer allows the manual design of computational kernels for a bespoke geometric multigrid preconditioner. We compare the performance of this preconditioner to a single-level method and hypre’s BoomerAMG algorithm. The Firedrake/PyOP2 code is inherently parallel and we present a detailed performance analysis for a single node (24 cores) on the ARCHER supercomputer. Our implementation utilises a significant fraction of the available memory bandwidth and shows very good weak scaling on up to 6,144 compute cores.

65N55 Multigrid methods; domain decomposition for boundary value problems involving PDEs
65N30 Finite element, Rayleigh-Ritz and Galerkin methods for boundary value problems involving PDEs
65Y05 Parallel numerical computation
65Y15 Packaged methods for numerical algorithms
86A10 Meteorology and atmospheric physics
Full Text: DOI
[1] Brezzi, F.; Douglas, J.; Fortin, M.; Marini, L. D., Efficient rectangular mixed finite elements in two and three space variables, ESAIM: Math. Model. Anal., 21, 4, 581-604, (1987) · Zbl 0689.65065
[2] Cotter, C. J.; Shipton, J., Mixed finite elements for numerical weather prediction, J. Comput. Phys., 231, 7076-7091, (2012) · Zbl 1284.86005
[3] McRae, A. T.T.; Cotter, C. J., Energy- and enstrophy-conserving schemes for the shallow-water equations, based on mimetic finite elements, Q. J. R. Meteorol. Soc., 140, 684, 2223-2234, (2014)
[4] Staniforth, A.; Thuburn, J., Horizontal grids for global weather and climate prediction models: a review, Q. J. R. Meteorol. Soc., 138, 662, 1-26, (2012)
[5] Börm, S.; Hiptmair, R., Analysis of tensor product multigrid, Numer. Algorithms, 26, 219-234, (2001) · Zbl 0973.65118
[6] Müller, E. H.; Scheichl, R., Massively parallel solvers for elliptic partial differential equations in numerical weather and climate prediction, Q. J. R. Meteorol. Soc., 140, 685, 2608-2624, (2014)
[7] Müller, E. H.; Scheichl, R.; Vainikko, E., Petascale elliptic solvers for anisotropic PDEs on GPU clusters, Parallel Comput., 50, 53-69, (2015)
[8] Dedner, A.; Müller, E.; Scheichl, R., Efficient multigrid preconditioners for atmospheric flow simulations at high aspect ratio, Int. J. Numer. Methods Fluids, 80, 1, 76-102, (2016)
[9] Rathgeber, F.; Markall, G. R.; Mitchell, L.; Loriant, N.; Ham, D. A.; Bertolli, C.; Kelly, P. H.J., Pyop2: a high-level framework for performance-portable simulations on unstructured meshes, (High Performance Computing, Networking Storage and Analysis, SC Companion, (2012), IEEE Computer Society), 1116-1123
[10] Rathgeber, F.; Ham, D. A.; Mitchell, L.; Lange, M.; Luporini, F.; McRae, A. T.T.; Bercea, G.-T.; Markall, G. R.; Kelly, P. H.J., Firedrake: automating the finite element method by composing abstractions, (2015), to appear in ACM TOMS · Zbl 1396.65144
[11] Bangerth, W.; Hartmann, R.; Kanschat, G., Deal.II - a general purpose object oriented finite element library, ACM Trans. Math. Softw., 33, 4, 24/1-24/27, (2007) · Zbl 1365.65248
[12] Bastian, P.; Heimann, F.; Marnach, S., Generic implementation of finite element methods in the distributed and unified numerics environment (DUNE), Kybernetika, 46, 2, 294-315, (2010) · Zbl 1195.65130
[13] Dedner, A.; Klöfkorn, R.; Nolte, M.; Ohlberger, M., A generic interface for parallel and adaptive discretization schemes: abstraction principles and the dune-FEM module, Computing, 90, 3, 165-196, (2010) · Zbl 1201.65178
[14] Bastian, P.; Blatt, M.; Scheichl, R., Algebraic multigrid for discontinuous Galerkin discretizations of heterogeneous elliptic problems, Numer. Linear Algebra Appl., 19, 2, 367-388, (2012) · Zbl 1274.65313
[15] Kanschat, G.; Mao, Y., Multigrid methods for \(H(\operatorname{div})\)-conforming discontinuous Galerkin methods for the Stokes equations, J. Numer. Math., 23, 1, 51-66, (2015) · Zbl 1330.76071
[16] (Logg, A.; Mardal, K.-A.; Wells, G. N., Automated Solution of Differential Equations by the Finite Element Method, (2012), Springer Berlin, Heidelberg) · Zbl 1247.65105
[17] Luporini, F.; Ham, D. A.; Kelly, P. H.J., An algorithm for the optimization of finite element integration loops, (2016), submitted for publication
[18] Balay, S.; Gropp, W. D.; McInnes, L. C.; Smith, B. F., Efficient management of parallelism in object oriented numerical software libraries, (Arge, E.; Bruaset, A. M.; Langtangen, H. P., Modern Software Tools in Scientific Computing, (1997), Birkhäuser Press), 163-202 · Zbl 0882.65154
[19] Balay, S.; Abhyankar, S.; Adams, M. F.; Brown, J.; Brune, P.; Buschelman, K.; Dalcin, L.; Eijkhout, V.; Gropp, W. D.; Kaushik, D.; Knepley, M. G.; McInnes, L. C.; Rupp, K.; Smith, B. F.; Zampini, S.; Zhang, H., Petsc users manual, (2015), Argonne National Laboratory, Tech. Rep. ANL-95/11 - Revision 3.6
[20] Henson, V. E.; Yang, U. M., Boomeramg: a parallel algebraic multigrid solver and preconditioner, Appl. Numer. Math., 41, 1, 155-177, (2002) · Zbl 0995.65128
[21] Falgout, R. D.; Meier-Yang, U., Hypre: a library of high performance preconditioners, (Sloot, P. M.A.; Tan, C. J.K.; Dongarra, J. J.; Hoekstra, A. G., Computational Science, ICCS 2002, Lecture Notes in Computer Science, vol. 2331, (2002), Springer), 632-641 · Zbl 1056.65046
[22] Logg, A.; Wells, G. N., DOLFIN: automated finite element computing, ACM Trans. Math. Softw., 37, 2, 20:1-20:28, (2010) · Zbl 1364.65254
[23] Brandt, A.; Livne, O., Multigrid techniques, (2011), Society for Industrial and Applied Mathematics (SIAM) Philadelphia
[24] Kirby, R. C., Algorithm 839: FIAT, a new paradigm for computing finite element basis functions, ACM Trans. Math. Softw., 30, 4, 502-516, (2004) · Zbl 1070.65571
[25] Alnæs, M. S.; Logg, A.; Ølgaard, K. B.; Rognes, M. E.; Wells, G. N., Unified form language: a domain-specific language for weak formulations of partial differential equations, ACM Trans. Math. Softw., 40, 2, 9:1-9:37, (2014) · Zbl 1308.65175
[26] Isaac, T.; Stadler, G.; Ghattas, O., Solution of nonlinear Stokes equations discretized by high-order finite elements on nonconforming and anisotropic meshes, with application to ice sheet dynamics, SIAM J. Sci. Comput., 37, 6, B804-B833, (2015) · Zbl 1327.65242
[27] Blatt, M.; Bastian, P., The iterative solver template library, (Kågström, B.; Elmroth, E.; Dongarra, J.; Wasniewski, J., Applied Parallel Computing. State of the Art in Scientific Computing, Lecture Notes in Computer Science, vol. 4699, (2007), Springer Berlin, Heidelberg), 666-675
[28] McRae, A. T.T.; Bercea, G.-T.; Mitchell, L.; Ham, D. A.; Cotter, C. J., Automated generation and symbolic manipulation of tensor product finite elements, SIAM J. Sci. Comput., (2016), in press · Zbl 1352.65615
[29] Crank, J.; Nicolson, P., A practical method for numerical evaluation of solutions of partial differential equations of the heat-conduction type, Adv. Comput. Math., 6, 207-226, (1996) · Zbl 0866.65054
[30] Raviart, P.-A.; Thomas, J. M., A mixed finite element method for 2nd order elliptic problems, (Mathematical Aspects of Finite Element Methods, Lecture Notes in Mathematics, vol. 606, (1977), Springer Berlin, Heidelberg), 292-315 · Zbl 0362.65089
[31] Davies, T.; Cullen, M. J.P.; Malcolm, A. J.; Mawson, M. H.; Staniforth, A.; White, A. A.; Wood, N., A new dynamical core for the met Office’s global and regional modelling of the atmosphere, Q. J. R. Meteorol. Soc., 131, 608, 1759-1782, (2005)
[32] Wood, N.; Staniforth, A.; White, A.; Allen, T.; Diamantakis, M.; Gross, M.; Melvin, T.; Smith, C.; Vosper, S.; Zerroukat, M.; Thuburn, J., An inherently mass-conserving semi-implicit semi-Lagrangian discretization of the deep-atmosphere global non-hydrostatic equations, Q. J. R. Meteorol. Soc., 140, 682, 1505-1520, (2014)
[33] Grote, M. J.; Huckle, T., Parallel preconditioning with sparse approximate inverses, SIAM J. Sci. Comput., 18, 3, 838-853, (1997) · Zbl 0872.65031
[34] Press, W. H.; Teukolsky, S. A.; Vetterling, W. T.; Flannery, B. P., Numerical recipes, (The Art of Scientific Computing, (2007), Cambridge University Press) · Zbl 1132.65001
[35] Saad, Y.; Schultz, M. H., GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 7, 3, 856-869, (1986) · Zbl 0599.65018
[36] Paige, C. C.; Saunders, M. A., Solution of sparse indefinite systems of linear equations, SIAM J. Numer. Anal., 12, 4, 617-629, (1975) · Zbl 0319.65025
[37] Thomas, S. J.; Hacker, J. P.; Smolarkiewicz, P. K.; Stull, R. B., Spectral preconditioners for nonhydrostatic atmospheric models, Mon. Weather Rev., 131, 10, 2464-2478, (2003)
[38] McCalpin, J. D., Memory bandwidth and machine balance in current high performance computers, IEEE Comput. Soc. Tech. Comm. Comput. Archit. Newsl., 19-25, (1995)
[39] Markall, G.; Rathgeber, F.; Mitchell, L.; Loriant, N.; Bertolli, C.; Ham, D. A.; Kelly, P. H.J., Performance-portable finite element assembly using pyop2 and fenics, Supercomputing, 7905, 279-289, (2013)
[40] Reguly, I. Z.; Mudalige, G. R.; Bertolli, C.; Giles, M. B.; Betts, A.; Kelly, P. H.J.; Radford, D., Acceleration of a full-scale industrial CFD application with OP2, IEEE Trans. Parallel Distrib. Syst., 27, 5, 1265-1278, (2016)
[41] Luporini, F.; Varbanescu, A. L.; Rathgeber, F.; Bercea, G.-T.; Ramanujam, J.; Ham, D. A.; Kelly, P. H.J., Cross-loop optimization of arithmetic intensity for finite element local assembly, ACM Trans. Archit. Code Optim., 11, 4, 57:1-57:25, (2015)
[42] Gropp, W. D.; Kaushik, D. K.; Keyes, D. E.; Smith, B. F., Towards realistic performance bounds for CFD codes, (Keyes, D.; Ecer, A.; Periaux, J.; Satofuka, N., Parallel CFD 1999, (2000), North-Holland), 241-248
[43] Brown, J., Efficient nonlinear solvers for nodal high-order finite elements in 3D, J. Sci. Comput., 45, 1-3, 48-63, (2010) · Zbl 1203.65245
[44] May, D. A.; Brown, J.; Le Pourhiet, L., Ptatin3D: high-performance methods for long-term lithospheric dynamics, (Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC’14, (2014), IEEE Press), 274-284
[45] Vos, P.; Sherwin, S.; Kirby, R., From h to p efficiently: implementing finite and spectral/hp element methods to achieve optimal performance for low-and high-order discretisations, J. Comput. Phys., 229, 13, 5161-5181, (2010) · Zbl 1194.65138
[46] Kronbichler, M.; Kormann, K., A generic interface for parallel cell-based finite element operator application, Comput. Fluids, 63, 135-147, (2012) · Zbl 1365.76121
[47] Wittum, G., Multi-grid methods for Stokes and Navier-Stokes equations, Numer. Math., 54, 5, 543-563, (1989) · Zbl 0645.76031
[48] Firedrake: an automated finite element system, Sep. 2016, http://dx.doi.org/10.5281/zenodo.153970,.
[49] PyOP2: framework for performance-portable parallel computations on unstructured meshes, Sep. 2016, http://dx.doi.org/10.5281/zenodo.153966.
[50] FIAT: the finite element automated tabulator, Sep. 2016, http://dx.doi.org/10.5281/zenodo.153967.
[51] ffc: the FEniCS form compiler, Sep. 2016, http://dx.doi.org/10.5281/zenodo.153971.
[52] COFFEE: a compiler for fast expression evaluation, Sep. 2016, http://dx.doi.org/10.5281/zenodo.153965.
[53] PETSc: portable, extensible toolkit for scientific computation, Sep. 2016, http://dx.doi.org/10.5281/zenodo.153972.
[54] petsc4py: the Python interface to PETSc, Sep. 2016, http://dx.doi.org/10.5281/zenodo.153969.
[55] UFL: the Unified Form Language, Sep. 2016, http://dx.doi.org/10.5281/zenodo.153968.
[56] Firedrake-helmholtzsolver: firedrake multigrid solvers for atmospheric modelling, Sep. 2016, http://dx.doi.org/10.5281/zenodo.153963.
[57] Data and plotting scripts, Sep. 2016, http://dx.doi.org/10.5281/zenodo.153974.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.