zbMATH — the first resource for mathematics

Enclave tasking for DG methods on dynamically adaptive meshes. (English) Zbl 1455.65245
65M60 Finite element, Rayleigh-Ritz and Galerkin methods for initial value and initial-boundary value problems involving PDEs
65Y20 Complexity and performance of numerical algorithms
65Y05 Parallel numerical computation
Full Text: DOI
[1] M. Bader, M. Dumbser, A.-A. Gabriel, H. Igel, L. Rezzolla, and T. Weinzierl, ExaHyPE: An Exascale Hyperbolic PDE Solver Engine, 2019, http://www.exahype.org.
[2] A. Baggag, H. Atkins, C. Özturan, and D. Keyes, Parallelization of an object-oriented unstructured aeroacoustics solver, in Proceedings of the 9th SIAM Conference on Parallel Processing for Scientific Computing, SIAM, Philadelphia, 1999, pp. 22-24.
[3] W. Bangerth, C. Burstedde, T. Heister, and M. Kronbichler, Algorithms and data structures for massively parallel generic adaptive finite element codes, ACM Trans. Math. Softw., 38 (2011), 14. · Zbl 1365.65247
[4] M. Berger and P. Colella, Local adaptive mesh refinement for shock hydrodynamics, J. Comput. Phys., 82 (1989), pp. 64-84. · Zbl 0665.76070
[5] M. J. Berger and R. J. LeVeque, Adaptive mesh refinement using wave-propagation algorithms for hyperbolic systems, SIAM J. Numer. Anal., 35 (1998), pp. 2298-2316, https://doi.org/10.1137/S0036142997315974. · Zbl 0921.65070
[6] H.-J. Bungartz, M. Mehl, and T. Weinzierl, A parallel adaptive Cartesian PDE solver using space-filling curves, in Euro-Par 2006, W. E. Nagel, W. V. Walter, and W. Lehner, eds., Lecture Notes in Comput. Sci. 4128, Springer-Verlag, Berlin, Heidelberg, 2006, pp. 1064-1074.
[7] C. Burstedde, M. Burtscher, O. Ghattas, G. Stadler, T. Tu, and L. C. Wilcox, ALPS: A framework for parallel adaptive PDE solution, J. Phys. Conf. Ser., 180 (2009), 012009.
[8] D. E. Charrier, B. Hazelwood, E. Tutlyaeva, M. Bader, M. Dumbser, A. Kudryavtsev, A. Moskovsky, and T. Weinzierl, Studies on the energy and deep memory behaviour of a cache-oblivious, task-based hyperbolic PDE solver, Internat. J. High Perform. Comput. Appl., 33 (2019), pp. 973-986, https://doi.org/10.1177/1094342019842645.
[9] D. E. Charrier and T. Weinzierl, Stop Talking to Me: A Communication-Avoiding ADER-DG Realisation, preprint, https://arxiv.org/abs/1801.08682, 2018.
[10] J. Dongarra, J. Hittinger, J. Bell, L. Chacón, R. Falgout, M. Heroux, P. Hovland, E. Ng, C. Webster, and S. Wild, Applied Mathematics Research for Exascale Computing, DOE ASCR Exascale Mathematics Working Group, 2014, http://www.netlib.org/utk/people/JackDongarra/PAPERS/doe-exascale-math-report.pdf.
[11] S. Dosopoulos, J. D. Gardiner, and J.-F. Lee, An MPI/GPU parallelization of an interior penalty discontinuous Galerkin time domain method for Maxwell’s equations: MPI/GPU for IP-DGTD, Radio Sci., 46 (2011), https://doi.org/10.1029/2011RS004689 (accessed 2018-05-21).
[12] A. Dubey, A. S. Almgren, J. B. Bell, M. Berzins, S. R. Brandt, G. Bryan, P. Colella, D. T. Graves, M. Lijewski, F. Löffler, B. O’Shea, E. Schnetter, B. van Straalen, and K. Weide, A survey of high level frameworks in block-structured adaptive mesh refinement packages, J. Parallel Distributed Comput., 74 (2014), pp. 3217-3227.
[13] M. Dumbser, F. Fambri, M. Tavelli, M. Bader, and T. Weinzierl, Efficient implementation of ADER discontinuous Galerkin schemes for a scalable hyperbolic PDE engine, Axioms, 7 (2018), 63, https://doi.org/10.3390/axioms7030063. · Zbl 1434.65179
[14] M. Dumbser and M. Käser, An arbitrary high-order discontinuous Galerkin method for elastic waves on unstructured meshes. II. The three-dimensional isotropic case, Geophys. J. Internat., 167 (2006), pp. 319-336.
[15] M. Dumbser, O. Zanotti, R. Loubère, and S. Diot, A posteriori subcell limiting of the discontinuous Galerkin finite element method for hyperbolic conservation laws, J. Comput. Phys., 278 (2014), pp. 47-75. · Zbl 1349.65448
[16] C. R. Ferreira and M. Bader, Load balancing and patch-based parallel adaptive mesh refinement for tsunami simulation on heterogeneous platforms using Xeon Phi coprocessors, in PASC ’17: Proceedings of the Platform for Advanced Scientific Computing, ACM, New York, 2017, 12, https://doi.org/10.1145/3093172.3093237 (accessed 2018-05-21).
[17] N. Gödel, N. Nunn, T. Warburton, and M. Clemens, Scalability of higher-order discontinuous Galerkin FEM computations for solving electromagnetic wave propagation problems on GPU clusters, IEEE Trans. Magnetics, 46 (2010), pp. 3469-3472, https://doi.org/10.1109/TMAG.2010.2046022 (accessed 2018-05-21).
[18] M. Griebel and G. Zumbusch, Hash-storage techniques for adaptive multilevel solvers and their domain decomposition parallelization, in Proceedings of Domain Decomposition Methods 10 (DD10) (Boulder, CO, 1997), Contemp. Math. 218, AMS, Providence, RI, 1998, pp. 271-278. · Zbl 0910.65084
[19] A. Heinecke, A. Breuer, S. Rettenberger, M. Bader, A.-A. Gabriel, C. Pelties, A. Bode, W. Barth, X.-K. Liao, K. Vaidyanathan, M. Smelyanskiy, and P. Dubey, Petascale high order dynamic rupture earthquake simulations on heterogeneous supercomputers, in SC ’14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (New Orleans, LA, 2014), IEEE, Washington, DC, 2014, pp. 3-14, https://doi.org/10.1109/SC.2014.6 (accessed 2016-03-01).
[20] F. Hindenlang, G. Gassner, C. Altmann, A. Beck, M. Staudenmaier, and C.-D. Munz, Explicit discontinuous Galerkin methods for unsteady problems, Comput. & Fluids, 61 (2012), pp. 86-93. · Zbl 1365.76117
[21] T. Hoefler and A. Lumsdaine, Message progression in parallel computing: To thread or not to thread?, in Proceedings of the 2008 IEEE International Conference on Cluster Computing, IEEE, Washington, DC, 2008, pp. 213-222, https://doi.org/10.1109/CLUSTR.2008.4663774.
[22] A. Ilic, F. Pratas, and L. Sousa, Cache-aware roofline model: Upgrading the loft, IEEE Comput. Architecture Lett., 13 (2014), pp. 21-24, https://doi.org/10.1109/L-CA.2013.6.
[23] T. Isaac, C. Burstedde, L. C. Wilcox, and O. Ghattas, Recursive algorithms for distributed forests of octrees, SIAM J. Sci. Comput., 37 (2015), pp. C497-C531, https://doi.org/10.1137/140970963. · Zbl 1323.65105
[24] A. Klöckner, T. Warburton, J. Bridge, and J. S. Hesthaven, Nodal discontinuous Galerkin methods on graphics processors, J. Comput. Phys., 228 (2009), pp. 7863-7882. · Zbl 1175.65111
[25] D. Komatitsch, G. Erlebacher, D. Göddeke, and D. Michéa, High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster, J. Comput. Phys., 229 (2010), pp. 7692-7714, https://doi.org/10.1016/j.jcp.2010.06.024 (accessed 2018-05-21). · Zbl 1194.86019
[26] K. Kormann and M. Kronbichler, Parallel finite element operator application: Graph partitioning and coloring, in Proceedings of the 2011 IEEE Seventh International Conference on eScience, ESCIENCE ’11, IEEE Computer Society, Washington, DC, 2011, pp. 332-339, https://doi.org/10.1109/eScience.2011.53 (accessed 2018-05-27).
[27] M. Kronbichler, K. Kormann, I. Pasichnyk, and M. Allalen, Fast matrix-free discontinuous Galerkin kernels on modern computer architectures, in High Performance Computing, J. M. Kunkel, R. Yokota, P. Balaji, and D. Keyes, eds., Lecture Notes in Comput. Sci. 10266, Springer, Cham, 2017, pp. 237-255, https://doi.org/10.1007/978-3-319-58667-0_13 (accessed 2018-05-27).
[28] R. J. LeVeque, Finite Volume Methods for Hyperbolic Problems, Cambridge University Press, Cambridge, UK, 2002. · Zbl 1010.65040
[29] R. J. LeVeque, D. L. George, and M. J. Berger, Tsunami modelling with adaptively refined finite volume methods, Acta Numer., 20 (2011), pp. 211-289. · Zbl 1426.76394
[30] G. Mao, D. Böhme, M.-A. Hermanns, M. Geimer, D. Lorenz, and F. Wolf, Catching idlers with ease: A lightweight wait-state profiler for MPI programs, in Proceedings of the 21st European MPI Users’ Group Meeting, EuroMPI/ASIA ’14 (Kyoto, Japan, 2014), ACM, New York, 2014, pp. 103-108, https://doi.org/10.1145/2642769.2642783 (accessed 2019-07-17).
[31] J. D. McCalpin, Memory bandwidth and machine balance in current high performance computers, in IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter, IEEE, Washington, DC, 1995, pp. 19-25.
[32] D. Mu, P. Chen, and L. Wang, Accelerating the discontinuous Galerkin method for seismic wave propagation simulations using multiple GPUs with CUDA and MPI, Earthquake Sci., 26 (2013), pp. 377-393, https://doi.org/10.1007/s11589-013-0047-7 (accessed 2018-05-21).
[33] A. Reinarz, D. E. Charrier, M. Bader, L. Bovard, M. Dumbser, K. Duru, F. Fambri, A.-A. Gabriel, J.-M. Gallard, S. Köppel, L. Krenz, L. Rannabauer, L. Rezzolla, P. Samfass, M. Tavelli, and T. Weinzierl, ExaHyPE: An Engine for Parallel Dynamically Adaptive Simulations of Wave Problems, preprint, https://arxiv.org/abs/1905.07987, 2020 (accessed 2019-05-22).
[34] J. Reinders, Intel Threading Building Blocks, O’Reilly & Associates, Sebastopol, CA, 2007.
[35] P. Samfass, T. Weinzierl, D. E. Charrier, and M. Bader, Tasks unlimited: Lightweight task offloading exploiting MPI wait times for parallel adaptive mesh refinement, Concurrency and Computation: Practice and Experience, 2020, to appear; preprint, https://arxiv.org/abs/1909.06096, 2019.
[36] P. Samfass, T. Weinzierl, B. Hazelwood, and M. Bader, TeaMPI: Replication-based resiliency without the (performance) pain, in Proceedings of ISC High Performance 2020, Lecture Notes in Comput. Sci., Springer, Berlin, to appear.
[37] A. Sasidharan and M. Snir, MiniAMR: A Miniapp for Adaptive Mesh Refinement, Tech. report, 2016, https://www.ideals.illinois.edu/handle/2142/91046.
[38] M. Schreiber, T. Weinzierl, and H. J. Bungartz, Cluster optimization and parallelization of simulations with dynamically adaptive grids, in Euro-Par 2013 Parallel Processing, F. Wolf, B. Mohr, and D. Mey, eds., Lecture Notes in Comput. Sci. 8097, Springer, Berlin, 2013, pp. 484-496.
[39] M. Sergent, M. Dagrada, P. Carribault, J. Jaeger, M. Pérache, and G. Papauré, Efficient communication/computation overlap with MPI+OpenMP runtimes collaboration, in Euro-Par 2018: Parallel Processing, M. Aldinucci, L. Padovani, and M. Torquati, eds., Lecture Notes in Comput. Sci. 11014, Springer, Berlin, 2018, pp. 560-572.
[40] H. Sundar and O. Ghattas, A nested partitioning algorithm for adaptive meshes on heterogeneous clusters, in Proceedings of the 29th ACM on International Conference on Supercomputing, ICS ’15, ACM, New York, 2015, pp. 319-328, https://doi.org/10.1145/2751205.2751246.
[41] H. Sundar, R. S. Sampath, and G. Biros, Bottom-up construction and 2:1 balance refinement of linear octrees in parallel, SIAM J. Sci. Comput., 30 (2008), pp. 2675-2708, https://doi.org/10.1137/070681727. · Zbl 1186.68554
[42] M. Tavelli, M. Dumbser, D. E. Charrier, L. Rannabauer, T. Weinzierl, and M. Bader, A simple diffuse interface approach on adaptive Cartesian grids for the linear elastic wave equations with complex topography, J. Comput. Phys., 386 (2019), pp. 158-189, https://doi.org/10.1016/j.jcp.2019.02.004.
[43] C. Uphoff, S. Rettenberger, M. Bader, E. H. Madden, T. Ulrich, S. Wollherr, and A.-A. Gabriel, Extreme scale multi-physics simulations of the tsunamigenic 2004 Sumatra megathrust earthquake, in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’17, ACM, New York, 2017, 21, https://doi.org/10.1145/3126908.3126948.
[44] M. Weinzierl and T. Weinzierl, Quasi-matrix-free hybrid multigrid on dynamically adaptive Cartesian grids, ACM Trans. Math. Softw., 44 (2018), 32. · Zbl 06920095
[45] T. Weinzierl, The Peano software: Parallel, automaton-based, dynamically adaptive grid traversals, ACM Trans. Math. Softw., 45 (2019), 14. · Zbl 07119133
[46] T. Weinzierl and M. Mehl, Peano: A traversal and storage scheme for octree-like adaptive Cartesian multiscale grids, SIAM J. Sci. Comput., 33 (2011), pp. 2732-2760, https://doi.org/10.1137/100799071. · Zbl 1245.65169
[47] S. Williams, A. Waterman, and D. Patterson, Roofline: An insightful visual performance model for multicore architectures, Commun. ACM, 52 (2009), pp. 65-76, https://doi.org/10.1145/1498765.1498785.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.