Parallel and vectorized implementation of analytic evaluation of boundary integral operators. (English) Zbl 1403.65240

Summary: In this paper, we describe an efficient analytic evaluation of boundary integral operators. First, we concentrate on a novel approach based on the simultaneous evaluation of all three linear shape functions defined on a boundary triangle. This yields a speedup of 2.35–3.15 times over the previous approach, which evaluates each shape function separately. In the second part, we comment on the OpenMP-parallelized and vectorized implementation of the suggested formulae. The employed code optimizations include data alignment and padding, array-of-structures to structure-of-arrays data transformation, and unit-strided memory accesses. The presented scalability results, with respect to both the number of threads employed and the width of the SIMD register, were obtained on an Intel® Xeon™ processor and two generations of the Intel® Xeon Phi™ family of (co)processors. They validate the performed optimizations and show that vectorization needs to be an inherent part of modern scientific codes.
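The layout optimizations named in the summary (alignment, padding, array-of-structures to structure-of-arrays transformation, unit-strided access) can be illustrated with a minimal C++ sketch. It is not taken from the paper or from BEM4I: the names `PointAoS`, `PointsSoA`, `paddedSize`, `allocAligned` and `distancesTo`, the 64-byte alignment, and the simple point-distance kernel are all assumptions chosen for illustration, and the input is assumed to be non-empty.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Array-of-structures layout (hypothetical type): x, y, z are interleaved.
struct PointAoS { double x, y, z; };

constexpr std::size_t kAlign = 64;                       // assumed alignment in bytes
constexpr std::size_t kLanes = kAlign / sizeof(double);  // doubles per 64-byte block

// Round n up to the next multiple of kLanes so loops need no remainder part.
constexpr std::size_t paddedSize(std::size_t n) {
    return (n + kLanes - 1) / kLanes * kLanes;
}

// Allocate a zero-initialized, 64-byte aligned array of `count` doubles.
// Requires C++17 std::aligned_alloc; release with std::free.
double* allocAligned(std::size_t count) {
    assert(count > 0 && count % kLanes == 0);
    void* raw = std::aligned_alloc(kAlign, count * sizeof(double));
    assert(raw != nullptr);
    double* data = static_cast<double*>(raw);
    for (std::size_t i = 0; i < count; ++i) data[i] = 0.0;
    return data;
}

// Structure-of-arrays layout: one aligned, padded array per coordinate.
struct PointsSoA {
    std::size_t n;   // number of real points
    std::size_t np;  // padded length, a multiple of kLanes
    double* x;
    double* y;
    double* z;

    explicit PointsSoA(const std::vector<PointAoS>& pts)
        : n(pts.size()), np(paddedSize(pts.size())),
          x(allocAligned(np)), y(allocAligned(np)), z(allocAligned(np)) {
        // The AoS -> SoA transformation; padded entries stay zero.
        for (std::size_t i = 0; i < n; ++i) {
            x[i] = pts[i].x;
            y[i] = pts[i].y;
            z[i] = pts[i].z;
        }
    }
    ~PointsSoA() { std::free(x); std::free(y); std::free(z); }
    PointsSoA(const PointsSoA&) = delete;
    PointsSoA& operator=(const PointsSoA&) = delete;
};

// Distance from (qx, qy, qz) to every stored point. The loop runs over the
// padded length with unit stride; `out` must hold p.np doubles and be
// 64-byte aligned. Results in the padded tail are meaningless and ignored.
void distancesTo(const PointsSoA& p, double qx, double qy, double qz, double* out) {
    const double* x = p.x;
    const double* y = p.y;
    const double* z = p.z;
    const std::size_t np = p.np;
    #pragma omp simd aligned(x, y, z, out : 64)
    for (std::size_t i = 0; i < np; ++i) {
        const double dx = x[i] - qx;
        const double dy = y[i] - qy;
        const double dz = z[i] - qz;
        out[i] = std::sqrt(dx * dx + dy * dy + dz * dz);
    }
}
```

With GCC or Clang, the pragma takes effect when compiling with `-fopenmp` or `-fopenmp-simd`; without either flag it is ignored and the loop is left to the compiler's auto-vectorizer. Padding the arrays to a multiple of the register width trades a few wasted lanes for a loop with no scalar remainder.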


MSC:
65N38 Boundary element methods for boundary value problems involving PDEs
65Y05 Parallel numerical computation
68W10 Parallel algorithms in computer science


Software: Vc; vcl; BEM4I

