
Extreme-scale block-structured adaptive mesh refinement. (English) Zbl 06890193

Summary: In this article, we present a novel approach to block-structured adaptive mesh refinement (AMR) that is suitable for extreme-scale parallelism. All data structures are designed such that the size of the metadata in the memory of each process remains bounded independently of the total number of processes. In all stages of the AMR process, we use only distributed algorithms. No central resources such as a master process or replicated data are employed, so that, in principle, unlimited scalability can be achieved. For the dynamic load balancing in particular, we propose to exploit the hierarchical nature of the block-structured domain partitioning by creating a lightweight, temporary copy of the core data structure. This copy acts as a local and fully distributed proxy: it contains no simulation data and provides only topological information about the partitioning of the domain into blocks. Ultimately, this approach enables an inexpensive, local, diffusion-based dynamic load balancing scheme. We demonstrate the excellent performance and the full scalability of our new AMR implementation on two architecturally different petascale supercomputers. Benchmarks on an IBM Blue Gene/Q system with a mesh containing 3.7 trillion unknowns distributed to 458,752 processes confirm the applicability to future extreme-scale parallel machines. The algorithms proposed in this article operate on the blocks that result from the domain partitioning; this concept and its realization support the storage of arbitrary data. As a consequence, the software framework can be used for different simulation methods, both mesh-based and meshless. In this article, we demonstrate fluid simulations based on the lattice Boltzmann method.
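The diffusion-based load balancing mentioned in the summary operates purely on the lightweight proxy: each process repeatedly exchanges a fraction of its load difference with its neighbors in the block partitioning, a first-order scheme in the spirit of Cybenko's classic diffusive load balancing. The following is a minimal, serial Python sketch of that idea only; it is not the authors' waLBerla implementation, and the function names, example topology, and choice of edge weight are illustrative assumptions.

```python
# Minimal serial sketch of first-order diffusive load balancing on the
# proxy graph: one load value per process, topology given by adjacency
# lists. All names and parameters here are illustrative assumptions,
# not part of the waLBerla API.

def diffuse(load, neighbors, iterations=20):
    """Each process repeatedly exchanges a fixed fraction of the load
    difference with every neighbor; the total load is conserved."""
    max_deg = max(len(n) for n in neighbors)
    alpha = 1.0 / (max_deg + 1)  # symmetric edge weight, keeps the
                                 # iteration stable and convergent
    load = [float(x) for x in load]
    flow = {}  # (i, j) -> net load that should migrate from i to j
    for _ in range(iterations):
        new = list(load)
        for i, nbrs in enumerate(neighbors):
            for j in nbrs:
                delta = alpha * (load[i] - load[j])
                new[i] -= delta  # j receives the matching update when
                                 # its own edge (j, i) is processed
                flow[(i, j)] = flow.get((i, j), 0.0) + delta
        load = new
    return load, flow

# Example: a chain of four processes with all twelve blocks on process 0.
neighbors = [[1], [0, 2], [1, 3], [2]]
load, flow = diffuse([12, 0, 0, 0], neighbors)
print([round(x, 2) for x in load])  # loads approach the average of 3.0
```

In a real distributed setting each process would own one entry of `load`, the inner loop would become a neighborhood communication step on the proxy, and the accumulated per-edge flow would be rounded to whole blocks before any simulation data is migrated.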

MSC:

65Y05 Parallel numerical computation
65Y20 Complexity and performance of numerical algorithms
76P05 Rarefied gas flows, Boltzmann equation in fluid mechanics
