Comparative study of finite element methods using the time-accuracy-size (TAS) spectrum analysis. (English) Zbl 1417.65224

To compare finite element methods based on continuous and discontinuous Galerkin approaches, the authors introduce a set of performance analysis metrics. They present an extended performance spectrum model, based on the work of J. Chang et al. [“A performance spectrum for parallel computational frameworks that solve PDEs”, Concurrency Comput. Pract. Exp. 30, No. 11, e4401 (2017; doi:10.1002/cpe.4401)], which takes into account time-to-solution, accuracy of the solution, and problem size.
This makes it possible to interpret both hardware and algorithmic trade-offs. The proposed metrics are illustrated for the Poisson equation using various meshes on a two-dimensional unit square and on a unit cube, the latter using parallel computations.
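
As a rough illustration of how time-to-solution, accuracy, and problem size might be combined into comparable quantities, the following minimal sketch computes log-scale measures for a single run. The review does not state any formulas, so the function name `tas_metrics`, the dictionary keys, the definitions used (negative base-10 logarithm of the error, base-10 logarithm of the number of degrees of freedom, degrees of freedom per second, and negative base-10 logarithm of error times time), and the sample numbers are all assumptions made for illustration, not the authors' definitions or results.

```python
import math


def tas_metrics(error, dofs, seconds):
    """Return illustrative time-accuracy-size quantities for one solver run.

    All definitions here are assumptions for illustration only:
    error   -- some norm of the discretization error (must be > 0)
    dofs    -- number of degrees of freedom (must be > 0)
    seconds -- wall-clock time-to-solution (must be > 0)
    """
    if error <= 0 or dofs <= 0 or seconds <= 0:
        raise ValueError("error, dofs and seconds must all be positive")
    return {
        # Accuracy on a log scale: larger means a smaller error.
        "digits_of_accuracy": -math.log10(error),
        # Problem size on a log scale.
        "digits_of_size": math.log10(dofs),
        # Throughput: how many unknowns are processed per second.
        "dofs_per_second": dofs / seconds,
        # Accuracy per unit time on a log scale: larger is better.
        "digits_of_efficacy": -math.log10(error * seconds),
    }


if __name__ == "__main__":
    # Hypothetical runs, not data from the paper.
    runs = {
        "method A": (1.0e-4, 1_000_000, 10.0),
        "method B": (1.0e-5, 4_000_000, 50.0),
    }
    for name, (error, dofs, seconds) in runs.items():
        print(name, tas_metrics(error, dofs, seconds))
```

Plotting such quantities against each other over a sequence of refined meshes is one way to see whether a method that is slower per unknown still reaches a given accuracy sooner.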


65Y05 Parallel numerical computation
65N30 Finite element, Rayleigh-Ritz and Galerkin methods for boundary value problems involving PDEs


[1] M. F. Adams, Evaluation of three unstructured multigrid methods on 3D finite element problems in solid mechanics, Int. J. Numer. Methods Engrg., 55 (2002), pp. 519–534. · Zbl 1076.74547
[2] M. F. Adams, H. Bayraktar, T. Keaveny, and P. Papadopoulos, Ultrascalable implicit finite element analyses in solid mechanics with over a half a billion degrees of freedom, in Proceedings of the 2004 ACM/IEEE Conference on Supercomputing (SC ’04), Pittsburgh, PA, 2004, 34.
[3] R. Agelek, M. Anderson, W. Bangerth, and W. L. Barth, On orienting edges of unstructured two- and three-dimensional meshes, ACM Trans. Math. Software (TOMS), 44 (2017), 5. · Zbl 1484.65320
[4] M. Alnæs, J. Blechta, J. Hake, A. Johansson, B. Kehlet, A. Logg, C. Richardson, J. Ring, M. E. Rognes, and G. N. Wells, The FEniCS project version 1.5, Arch. Numer. Software, 3 (2015), pp. 9–23.
[5] G. M. Amdahl, Validity of the single processor approach to achieving large scale computing capabilities, in Proceedings of the April 18-20, 1967, Spring Joint Computer Conference, AFIPS ’67 (Spring), ACM, New York, 1967, pp. 483–485.
[6] D. N. Arnold, F. Brezzi, B. Cockburn, and L. D. Marini, Unified analysis of discontinuous Galerkin methods for elliptic problems, SIAM J. Numer. Anal., 39 (2002), pp. 1749–1779. · Zbl 1008.65080
[7] S. Balay, S. Abhyankar, M. F. Adams, J. Brown, P. Brune, K. Buschelman, L. Dalcin, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. Knepley, D. A. May, L. C. McInnes, R. T. Mills, T. Munson, K. Rupp, P. Sanan, B. F. Smith, S. Zampini, H. Zhang, and H. Zhang, PETSc Users Manual, Tech. report ANL-95/11—Revision 3.9, Argonne National Laboratory, Lemont, IL, 2018.
[8] S. Balay, S. Abhyankar, M. F. Adams, J. Brown, P. Brune, K. Buschelman, L. Dalcin, V. Eijkhout, W. D. Gropp, D. Kaushik, M. G. Knepley, D. A. May, L. C. McInnes, R. T. Mills, T. Munson, K. Rupp, P. Sanan, B. F. Smith, S. Zampini, H. Zhang, and H. Zhang, PETSc Web Page, 2018.
[9] W. Bangerth, D. Davydov, T. Heister, L. Heltai, G. Kanschat, M. Kronbichler, M. Maier, B. Turcksin, and D. Wells, The deal.II library, version 8.4, J. Numer. Math., 24 (2016), pp. 135–141. · Zbl 1348.65187
[10] G.-T. Bercea, A. T. T. McRae, D. A. Ham, L. Mitchell, F. Rathgeber, L. Nardi, F. Luporini, and P. H. J. Kelly, A structure-exploiting numbering algorithm for finite elements on extruded meshes, and its performance evaluation in Firedrake, Geosci. Model Dev., 9 (2016), pp. 3803–3815.
[11] S. C. Brenner and L. R. Scott, The Mathematical Theory of Finite Element Methods, Springer, New York, 2002. · Zbl 1012.65115
[12] J. Brown, Threading tradeoffs in domain decomposition, presented at the SIAM Conference on Parallel Processing for Scientific Computing as part of the Minisymposia “To Thread or Not to Thread,” Paris, 2016.
[13] J. Brown, B. Smith, and A. Ahmadia, Achieving textbook multigrid efficiency for hydrostatic ice sheet flow, SIAM J. Sci. Comput., 35 (2013), pp. B359–B375. · Zbl 1266.86001
[14] J. Chang, S. Karra, and K. B. Nakshatrala, Large-scale optimization-based non-negative computational framework for diffusion equations: Parallel implementation and performance studies, J. Sci. Comput., 70 (2017), pp. 243–271. · Zbl 1359.65250
[15] J. Chang and K. B. Nakshatrala, Variational inequality approach to enforce the non-negative constraint for advection-diffusion equations, Comput. Methods Appl. Mech. Engrg., 320 (2017), pp. 287–334.
[16] J. Chang, K. B. Nakshatrala, M. G. Knepley, and L. Johnsson, A performance spectrum for parallel computational frameworks that solve PDEs, Concurrency and Computation Practice and Experience, 30 (2017), e4401.
[17] B. Cockburn, J. Gopalakrishnan, and R. Lazarov, Unified hybridization of discontinuous Galerkin, mixed, and continuous Galerkin methods for second order elliptic problems, SIAM J. Numer. Anal., 47 (2009), pp. 1319–1365. · Zbl 1205.65312
[18] V. Eijkhout, Introduction to High Performance Scientific Computing, Texas Advanced Computing Center (TACC), The University of Texas at Austin, 2014.
[19] M. S. Fabien, M. G. Knepley, and B. M. Rivière, A hybridizable discontinuous Galerkin method for two-phase flow in heterogeneous porous media, Int. J. Numer. Methods Engrg., 116 (2018), pp. 161–177.
[20] R. D. Falgout and U. M. Yang, HYPRE: A library of high performance preconditioners, in Proceedings of the International Conference on Computational Science, Springer, New York, 2002, pp. 632–641. · Zbl 1056.65046
[21] D. Gaston, C. Newman, G. Hansen, and D. Lebrun-Grandie, MOOSE: A parallel computational framework for coupled systems of nonlinear equations, Nuclear Engrg. Des., 239 (2009), pp. 1768–1778.
[22] J. L. Gustafson, Reevaluating Amdahl’s law, Comm. ACM, 31 (1988), pp. 532–533.
[23] M. Homolya and D. A. Ham, A parallel edge orientation algorithm for quadrilateral meshes, SIAM Journal on Scientific Computing, 38 (2016), pp. S48–S61. · Zbl 1404.65266
[24] R. M. Kirby, S. J. Sherwin, and B. Cockburn, To CG or to HDG: A comparative study, J. Sci. Comput., 51 (2012), pp. 183–212. · Zbl 1244.65174
[25] B. S. Kirk, J. W. Peterson, R. H. Stogner, and G. F. Carey, libMesh: A C++ library for parallel adaptive mesh refinement/coarsening simulations, Engrg. Comput., 22 (2006), pp. 237–254.
[26] M. G. Knepley, Computational Science I, Lecture Notes for CAAM 519, Department of Computational and Applied Mathematics, William Marsh Rice University, Houston, TX, 2017.
[27] M. G. Knepley and D. A. Karpeev, Mesh algorithms for PDE with Sieve I: Mesh distribution, Sci. Programming, 17 (2009), pp. 215–230.
[28] M. Lange, M. G. Knepley, and G. J. Gorman, Flexible, scalable mesh and data management using PETSc DMPlex, in Proceedings of the 3rd International Conference on Exascale Applications and Software Conference, Edinburgh, 2015, pp. 71–76.
[29] M. Lange, L. Mitchell, M. G. Knepley, and G. J. Gorman, Efficient mesh management in Firedrake using PETSc DMPlex, SIAM J. Sci. Comput., 38 (2016), pp. S143–S155. · Zbl 1352.65613
[30] A. Logg, Efficient representation of computational meshes, Int. J. Comput. Sci. Engrg., 4 (2009), pp. 283–295.
[31] N. K. Mapakshi, J. Chang, and K. B. Nakshatrala, A scalable variational inequality approach for flow through porous media models with pressure-dependent viscosity, J. Comput. Phys., 359 (2018), pp. 137–163. · Zbl 1383.76342
[32] D. A. May, J. Brown, and L. L. Laetitia, pTatin3D: High-performance methods for long-term lithospheric dynamics, in Proceedings of the International Conference for High Performance Computing, Network, Storage and Analysis (SC ’14), IEEE Press, Piscataway, NJ, 2014, pp. 274–284.
[33] A. T. T. McRae, G.-T. Bercea, L. Mitchell, D. A. Ham, and C. J. Cotter, Automated generation and symbolic manipulation of tensor product finite elements, SIAM J. Sci. Comput., 38 (2016), pp. S25–S47. · Zbl 1352.65615
[34] H. Morgan, M. G. Knepley, P. Sanan, and L. R. Scott, A stochastic performance model for pipelined Krylov methods, Concurrency and Computation Practice and Experience, 28 (2016), pp. 4532–4542.
[35] F. Rathgeber, D. A. Ham, L. Mitchell, M. Lange, F. Luporini, A. T. McRae, G.-T. Bercea, G. R. Markall, and P. H. Kelly, Firedrake: Automating the finite element method by composing abstractions, ACM Trans. Math. Software (TOMS), 43 (2016), 24. · Zbl 1396.65144
[36] P. A. Raviart and J. M. Thomas, A mixed finite element method for 2nd order elliptic problems, in Mathematical Aspects of Finite Element Methods, Springer, New York, 1977, pp. 292–315. · Zbl 0362.65089
[37] M. Sala, J. J. Hu, and R. S. Tuminaro, ML3.1 Smoothed Aggregation User’s Guide, Tech. report SAND2004-4821, Sandia National Laboratories, Albuquerque, NM, 2004.
[38] M. Shabouei and K. Nakshatrala, Mechanics-based solution verification for porous media models, Comm. Comput. Phys., 20 (2016), pp. 1127–1162. · Zbl 1373.76315
[39] K. Shahbazi, An explicit expression for the penalty parameter of the interior penalty method, J. Comput. Phys., 205 (2005), pp. 401–407. · Zbl 1072.65149
[40] S. Williams, A. Waterman, and D. Patterson, Roofline: An insightful visual performance model for multicore architectures, Comm. ACM, 54 (2009), pp. 65–76.
[41] Zenodo/coneoproject/Coffee, coneoproject/COFFEE: A Compiler for Fast Expression Evaluation, 2017.
[42] Zenodo/FIAT, FIAT: The Finite Element Automated Tabulator, 2018.
[43] Zenodo/FInAT, FInAT: A Smarter Library of Finite Elements, 2017.
[44] Zenodo/Firedrake, Firedrake: An Automated Finite Element System, 2018.
[45] Zenodo/PETSc, PETSc: Portable, Extensible Toolkit for Scientific Computation, 2017.
[46] Zenodo/Petsc4py, Petsc4py: The Python Interface to PETSc, 2017.
[47] Zenodo/PyOP2, OP2/PyOP2: Framework for Performance-Portable Parallel Computations on Unstructured Meshes, 2018.
[48] Zenodo/TSFC, TSFC: The Two Stage Form Compiler, 2018.
[49] Zenodo/UFL, UFL: The Unified Form Language, 2018.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.