Parallel and vectorized implementation of analytic evaluation of boundary integral operators. (English) Zbl 1403.65240

Summary: In this paper, we describe an efficient analytic evaluation of boundary integral operators. First, we concentrate on a novel approach based on the simultaneous evaluation of all three linear shape functions defined on a boundary triangle. This results in a speedup by a factor of 2.35–3.15 compared to the previous approach of separate evaluations. In the second part, we comment on the OpenMP-parallelized and vectorized implementation of the suggested formulae. The employed code optimizations include techniques such as data alignment and padding, array-of-structures to structure-of-arrays data transformation, and unit-strided memory accesses. The presented scalability results, with respect to both the number of threads employed and the width of the SIMD register, were obtained on an Intel® Xeon™ processor and two generations of the Intel® Xeon Phi™ family of (co)processors. They validate the performed optimizations and show that vectorization needs to be an inherent part of modern scientific codes.
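
The data-layout optimizations named in the summary can be illustrated with a minimal C++ sketch. It is not taken from the paper or from BEM4I: the names `PointAoS`, `PointsSoA`, `padded_size`, `to_soa` and `squared_distances` are hypothetical, and the squared-distance kernel merely stands in for the actual integrands. The sketch shows an array-of-structures to structure-of-arrays conversion, padding of the array length to a multiple of an assumed SIMD width, and a unit-strided loop over the result. Memory alignment is deliberately left out, since `std::vector` does not guarantee SIMD-friendly alignment and an aligned allocator would be needed for that.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical array-of-structures layout: one record per point.
struct PointAoS {
    double x, y, z;
};

// Hypothetical structure-of-arrays layout: one contiguous array per coordinate.
struct PointsSoA {
    std::vector<double> x, y, z;
    std::size_t count = 0;  // number of real (unpadded) points
};

// Round n up to the next multiple of simd_width (number of doubles per register).
std::size_t padded_size(std::size_t n, std::size_t simd_width) {
    assert(simd_width > 0);
    return (n + simd_width - 1) / simd_width * simd_width;
}

// Convert AoS to SoA; padding entries are zero-filled so that a vectorized
// loop can run over the full padded length without a scalar remainder loop.
PointsSoA to_soa(const std::vector<PointAoS>& pts, std::size_t simd_width) {
    PointsSoA out;
    out.count = pts.size();
    const std::size_t n = padded_size(pts.size(), simd_width);
    out.x.assign(n, 0.0);
    out.y.assign(n, 0.0);
    out.z.assign(n, 0.0);
    for (std::size_t i = 0; i < pts.size(); ++i) {
        out.x[i] = pts[i].x;
        out.y[i] = pts[i].y;
        out.z[i] = pts[i].z;
    }
    return out;
}

// Squared distance of every point to (cx, cy, cz). All accesses are
// unit-strided, so a compiler can vectorize the loop; an OpenMP simd
// directive could be placed on it as well.
std::vector<double> squared_distances(const PointsSoA& p,
                                      double cx, double cy, double cz) {
    const std::size_t n = p.x.size();  // padded length
    std::vector<double> d(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = p.x[i] - cx;
        const double dy = p.y[i] - cy;
        const double dz = p.z[i] - cz;
        d[i] = dx * dx + dy * dy + dz * dz;
    }
    return d;
}
```

In the AoS layout, consecutive `x` values lie three doubles apart, so a vector load would have to gather them. In the SoA layout they are adjacent, which is what makes the unit-strided access possible. Only the first `count` entries of the result are meaningful; the rest are padding.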


MSC:
65N38 Boundary element methods for boundary value problems involving PDEs
65Y05 Parallel numerical computation
68W10 Parallel algorithms in computer science


Software: Vc; vcl; BEM4I
