A multi-platform scaling study for an OpenMP parallelization of a discontinuous Galerkin ocean model. (English) Zbl 1390.86025
Summary: We present a cross-platform scaling investigation for an OpenMP parallelization of UTBEST3D – a coastal and regional ocean code based on the discontinuous Galerkin finite element method. The study is conducted for a real-life application on an unstructured computational mesh of the Northwest Atlantic with realistic topography and a well-resolved coastline on a broad selection of current computing platforms. Four numerical setups of increasing physical and computational complexity are used for comparison: barotropic with no vertical eddy viscosity, barotropic with an algebraic eddy viscosity parametrization, baroclinic with an algebraic eddy viscosity, and baroclinic with \(k\)-\(\varepsilon\) vertical turbulence closure. In addition to Intel Xeon and IBM Power6/PowerPC architectures, we also include Intel’s new MIC processor Xeon Phi in the evaluation. Good scalability is found across all platforms, with Intel Xeon CPUs producing the best runtime results and Xeon Phi demonstrating the best parallel efficiency.
Reviewer: Reviewer (Berlin)

86A05 Hydrology, hydrography, oceanography
86-08 Computational methods for problems pertaining to geophysics
76M10 Finite element methods applied to problems in fluid mechanics
65Y05 Parallel numerical computation
65Y15 Packaged methods for numerical algorithms
[1] Wallcraft AJ, Hurlburt HE, Townsend TL, Chassignet EP. 1/25 degree Atlantic ocean simulation using HYCOM. In: Users group conference; 2005. p. 222-5. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1592146>.
[2] Worley P, Levesque J. The performance evolution of the Parallel Ocean Program on the Cray X1. In: Proceedings of the 46th Cray user group conference; 2004. p. 17-21. <http://www.csm.ornl.gov/~worley/papers/CUG04_Worley_POP.pdf>.
[3] Cowles, G. W., Parallelization of the FVCOM coastal ocean model, Int J High Perform Comput Appl, 22, 2, 177-193, (2008), <http://hpc.sagepub.com/content/22/2/177.short>
[4] Tanaka, S.; Bunya, S.; Westerink, J. J.; Dawson, C.; Luettich, R. A., Scalability of an unstructured grid continuous Galerkin based hurricane storm surge model, J Sci Comput, 46, 3, 329-358, (2011), <http://link.springer.com/10.1007/s10915-010-9402-1> · Zbl 1270.76038
[5] Dietrich, J. C.; Tanaka, S.; Westerink, J. J.; Dawson, C. N.; Luettich, R. A.; Zijlema, M., Performance of the unstructured-mesh, SWAN+ADCIRC model in computing hurricane waves and surge, J Sci Comput, 52, 2, 468-497, (2012), <http://link.springer.com/10.1007/s10915-011-9555-6> · Zbl 1254.86006
[6] Nair, R.; Choi, H.; Tufo, H., Computational aspects of a scalable high-order discontinuous Galerkin atmospheric dynamical core, Comput Fluids, 38, 2, 309-319, (2009), <http://linkinghub.elsevier.com/retrieve/pii/S004579300800056X> · Zbl 1237.76129
[7] Ringler, T.; Petersen, M.; Higdon, R., A multi-resolution approach to global ocean modeling, Ocean Model, 69, 211-232, (2013), <http://linkinghub.elsevier.com/retrieve/pii/S1463500313000760>
[8] Sannino G, Artale V, Lanucara P. An hybrid OpenMP-MPI parallelization of the Princeton ocean model. In: Parallel computing: advances and current issues; 2001. p. 222-9. <http://utmea.enea.it/staff/sannino/Papers/ParCo/ParCoPOM.pdf>.
[9] Wang, G.; Qiao, F.; Xia, C., Parallelization of a coupled wave-circulation model and its application, Ocean Dyn, 60, 2, 331-339, (2010), <http://link.springer.com/10.1007/s10236-010-0274-6>
[10] Cordoba, M. L.; Dopico, A. G.; Garcia, M. I.; Rosales, F.; Arnaiz, J.; Bermejo, R., Efficient parallelization of a regional ocean model for the western Mediterranean Sea, Int J High Perform Comput Appl, 28, 3, 368-383, (2014), <http://hpc.sagepub.com/cgi/doi/10.1177/1094342013512344>
[11] Barker K, Kerbyson D. A performance model and scalability analysis of the HYCOM ocean simulation application. In: IASTED international conference on parallel and distributed computing; 2005. <http://www.actapress.com/Abstract.aspx?paperId=22373>.
[12] Kerbyson, D. J., A performance model of the Parallel Ocean Program, Int J High Perform Comput Appl, 19, 3, 261-276, (2005), <http://hpc.sagepub.com/content/19/3/261.short>
[13] Mak J, Choboter P, Lupo C. Numerical ocean modeling and simulation with CUDA. In: OCEANS 2011; 2011. <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6107199>.
[14] Gheller C. Refactoring and Algorithm Re-engineering Guides and Reports; 2013. <http://www.prace-project.eu/IMG/pdf/d8.2.pdf>.
[15] Heinecke, A.; Klemm, M.; Bungartz, H. J., From GPGPU to many-core: NVIDIA Fermi and Intel Many Integrated Core architecture, Comput Sci Eng, 14, 2, 78-83, (2012), <http://scitation.aip.org/content/aip/journal/cise/14/2/10.1109/MCSE.2012.23>
[16] Schulz KW, Ulerich R, Malaya N, Bauman PT, Stogner R, Simmons C. Early experiences porting scientific applications to the Many Integrated Core (MIC) platform. In: TACC-Intel highly parallel computing symposium. Mic; 2012. <http://users.ices.utexas.edu/~rhys/papers/SchulzPS2cTIHCPS12.pdf>
[17] Schmidl, D.; Cramer, T.; Wienke, S., Assessing the performance of OpenMP programs on the Intel Xeon Phi, (Euro-Par 2013 parallel processing, Lecture notes in computer science, vol. 8097, (2013), Springer), 547-558, <http://link.springer.com/10.1007/978-3-642-40047-6>
[18] Guo X. Report on Application Enabling for Capability Science in the MIC Architecture Final; 2013. <http://www.prace-project.eu/IMG/pdf/d7.1.3_1ip.pdf>.
[19] Aizinger, V.; Proft, J.; Dawson, C.; Pothina, D.; Negusse, S., A three-dimensional discontinuous Galerkin model applied to the baroclinic simulation of Corpus Christi Bay, Ocean Dyn, 63, 1, 89-113, (2013), <http://link.springer.com/10.1007/s10236-012-0579-8>
[20] Klinger BA. Density of Seawater. <http://mason.gmu.edu/~bklinger/seawater.pdf>.
[21] Davies, A. M., A three-dimensional model of the northwest European continental shelf, with application to the M4 tide, J Phys Oceanogr, 16, 5, 797-813, (1986), <http://dx.doi.org/10.1175/1520-0485(1986)016<0797:ATDMOT>2.0.CO;2>
[22] Umlauf, L.; Burchard, H., A generic length-scale equation for geophysical turbulence models, J Mar Res, 61, 2, 235-265, (2003), <http://www.ingentaselect.com/rpsv/cgi-bin/cgi?ini=xref&body=linker&reqdoi=10.1357/002224003322005087>
[23] Warner, J. C.; Sherwood, C. R.; Arango, H. G.; Signell, R. P., Performance of four turbulence closure models implemented using a generic length scale method, Ocean Model, 8, 1-2, 81-113, (2005), <http://linkinghub.elsevier.com/retrieve/pii/S1463500303000702>
[24] Mellor, G. L.; Yamada, T., Development of a turbulence closure model for geophysical fluid problems, Rev Geophys, 20, 4, 851-875, (1982), <http://doi.wiley.com/10.1029/RG020i004p00851>
[25] Cockburn, B.; Shu, C. W., The local discontinuous Galerkin method for time-dependent convection-diffusion systems, SIAM J Numer Anal, 35, 6, 2440-2463, (1998), <http://epubs.siam.org/doi/abs/10.1137/S0036142997316712> · Zbl 0927.65118
[26] Cockburn, B.; Shu, C. W., TVB Runge-Kutta local projection discontinuous Galerkin finite element method for conservation laws. II. General framework, Math Comput, 52, 186, 411-435, (1989), <http://www.ams.org/jourcgi/jour-getitem?pii=S0025-5718-1989-0983311-4> · Zbl 0662.65083
[27] Kuzmin, D., A vertex-based hierarchical slope limiter for p-adaptive discontinuous Galerkin methods, J Comput Appl Math, 233, 12, 3077-3085, (2010), <http://linkinghub.elsevier.com/retrieve/pii/S0377042709003318> · Zbl 1252.76045
[28] Aizinger, V., A geometry independent slope limiter for the discontinuous Galerkin method, (Computational science and high performance computing IV, Notes on numerical fluid mechanics and multidisciplinary design, vol. 115, (2011), Springer), 207-217, <http://link.springer.com/10.1007/978-3-642-17770-5>
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.