zbMATH — the first resource for mathematics

Experimenting task-based runtimes on a legacy computational fluid dynamics code with unstructured meshes. (English) Zbl 1410.76005
Summary: Advances in high performance computing hardware systems lead to higher levels of parallelism and optimizations in scientific applications and more specifically in computational fluid dynamics codes. To reduce the level of complexity that such architectures bring while attaining an acceptable amount of the parallelism offered by modern clusters, the task-based approach has gained a lot of popularity recently as it is expected to deliver portability and performance with a relatively simple programming model. In this paper, we review and present the process of adapting part of Code_Saturne, our legacy code at EDF R&D, into a task-based form using the PaRSEC (parallel runtime scheduling and execution control) framework. First, we show the adaptation of our prime algorithm to a simpler form to remove part of the complexity of our code and then present its task-based implementation. We compare the performance of various forms of our code and discuss the perks of task-based runtimes in terms of scalability, ease of incremental deployment in a legacy CFD code, and maintainability.
76-04 Software, source code, etc. for problems pertaining to fluid mechanics
[1] Archambeau, F.; Méchitoua, N.; Sakiz, M., Code Saturne: a finite volume code for the computation of turbulent incompressible flows - industrial applications, Int J Finite Volumes, 1, 1, (2004)
[2] Buttari, A.; Langou, J.; Kurzak, J.; Dongarra, J., Parallel tiled QR factorization for multicore architectures, Concurr Comput: Pract Experience, 20, 13, 1573-1590, (2008)
[3] Duran, A.; Ayguadé, E.; Badia, R. M.; Labarta, J.; Martinell, L.; Martorell, X., OmpSs: a proposal for programming heterogeneous multi-core architectures, Parallel Process Lett, 21, 02, 173-193, (2011)
[4] Augonnet, C.; Thibault, S.; Namyst, R.; Wacrenier, P.-A., StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, Concurr Comput: Pract Experience, 23, 2, 187-198, (2011)
[5] Chan, E.; Quintana-Orti, E. S.; Quintana-Orti, G.; Van De Geijn, R., SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures, Proceedings of the nineteenth annual ACM symposium on parallel algorithms and architectures, 116-125, (2007), ACM
[6] Budimlić, Z.; Burke, M.; Cavé, V.; Knobe, K.; Lowney, G.; Newton, R., Concurrent collections, Sci Program, 18, 3-4, 203-217, (2010)
[7] Bosilca, G.; Bouteiller, A.; Danalis, A.; Herault, T.; Lemarinier, P.; Dongarra, J., DAGuE: a generic distributed DAG engine for high performance computing, Parallel Comput, 38, 1-2, 37-51, (2012)
[8] Quintana-Ortí, G.; Igual, F. D.; Quintana-Ortí, E. S.; Van de Geijn, R. A., Solving dense linear systems on platforms with multiple hardware accelerators, ACM SIGPLAN notices, 44, 121-130, (2009), ACM
[9] Bosilca, G.; Bouteiller, A.; Danalis, A.; Faverge, M.; Haidar, A.; Herault, T., Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA, Parallel and distributed processing workshops and PhD forum (IPDPSW), 2011 IEEE international symposium on, 1432-1441, (2011), IEEE
[10] Agullo, E.; Augonnet, C.; Dongarra, J.; Faverge, M.; Langou, J.; Ltaief, H., LU factorization for accelerator-based systems, Computer systems and applications (AICCSA), 2011 9th IEEE/ACS international conference on, 217-224, (2011), IEEE
[11] Code saturne 5.0 theory guide. URL http://code-saturne.org/cms/sites/default/files/docs/5.0/theory.pdf.
[12] Fournier, Y.; Bonelle, J.; Moulinec, C.; Shang, Z.; Sunderland, A.; Uribe, J., Optimizing Code_Saturne computations on petascale systems, Comput Fluids, 45, 1, 103-108, (2011) · Zbl 1429.76014
[13] Moustafa, S.; Faverge, M.; Plagne, L.; Ramet, P., 3D Cartesian transport sweep for massively parallel architectures with PaRSEC, Parallel and distributed processing symposium (IPDPS), 2015 IEEE international, 581-590, (2015), IEEE
[14] Cosnard, M.; Loi, M., Automatic task graph generation techniques, System sciences, 1995. Proceedings of the twenty-eighth Hawaii international conference on, 2, 113-122, (1995), IEEE
[15] Cosnard, M.; Jeannot, E.; Yang, T., Compact DAG representation and its symbolic scheduling, J Parallel Distrib Comput, 64, 8, 921-935, (2004) · Zbl 1068.68031
[16] Jeannot, E., Automatic multithreaded parallel program generation for message passing multiprocessors using parameterized task graphs, International conference parallel computing, (2001)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.