GPU-acceleration of stiffness matrix calculation and efficient initialization of EFG meshless methods. (English) Zbl 1286.65162

Summary: Meshless methods have a number of virtues in problems concerning crack growth and propagation, large displacements, strain localization and complex geometries, among others. Although they do not rely on a mesh, meshless methods require a preliminary step that identifies the correlation between nodes and Gauss points before the stiffness matrix can be built. In FEM this step is performed implicitly during mesh generation, but in EFG methods it must be done explicitly and can be time-consuming. Furthermore, the resulting matrices are more densely populated, and the computational cost of formulating and solving the problem is much higher than in conventional FEM. This is mainly attributed to the vast increase in interactions between nodes and integration points due to their extended domains of influence. For these reasons, computing the stiffness matrix in EFG meshless methods is a computationally demanding task that needs special attention in order to be affordable in real-world applications. In this paper, we address the pre-processing phase, dealing with the problem of defining the necessary correlations between nodes and Gauss points and between interacting nodes, as well as the computation of the stiffness matrix. A novel approach is proposed for the formulation of the stiffness matrix which exhibits several computational merits, one of which is its amenability to parallelization, which allows graphics processing units (GPUs) to be used to accelerate the computations.
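
The correlation step described in the summary can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: it assumes a 2D problem, circular domains of influence with one common radius, and a uniform cell binning to limit the search. The function name `build_correlations` and all parameters are hypothetical. For each Gauss point it lists the nodes whose domain of influence contains that point, and from those lists it derives the interacting node pairs, that is, the positions of the nonzero blocks of the stiffness matrix.

```python
from collections import defaultdict
from itertools import combinations_with_replacement
from math import floor


def build_correlations(nodes, gauss_points, radius):
    """Return (gp_to_nodes, node_pairs) for circular influence domains.

    nodes, gauss_points: lists of (x, y) tuples.
    radius: common influence radius (an assumption of this sketch).
    """
    # Bin nodes into square cells of side `radius`, so every node within
    # `radius` of a point lies in that point's cell or one of its 8 neighbours.
    cells = defaultdict(list)
    for i, (x, y) in enumerate(nodes):
        cells[(floor(x / radius), floor(y / radius))].append(i)

    r2 = radius * radius
    gp_to_nodes = []
    node_pairs = set()
    for gx, gy in gauss_points:
        cx, cy = floor(gx / radius), floor(gy / radius)
        near = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for i in cells.get((cx + dx, cy + dy), ()):
                    nx, ny = nodes[i]
                    if (nx - gx) ** 2 + (ny - gy) ** 2 <= r2:
                        near.append(i)
        near.sort()
        gp_to_nodes.append(near)
        # Two nodes interact when at least one Gauss point lies in both of
        # their domains; pairs (i, j) are stored with i <= j.
        node_pairs.update(combinations_with_replacement(near, 2))
    return gp_to_nodes, node_pairs


nodes = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
gauss_points = [(0.5, 0.0), (2.5, 0.0)]
gp_to_nodes, node_pairs = build_correlations(nodes, gauss_points, radius=1.2)
print(gp_to_nodes)
print(sorted(node_pairs))
```

Because each Gauss point, and likewise each node pair, can be processed independently of the others, work of this kind maps naturally onto many parallel threads, which is the property the summary points to when it mentions GPUs.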

MSC:

65N30 Finite element, Rayleigh-Ritz and Galerkin methods for boundary value problems involving PDEs
65Y10 Numerical algorithms for specific classes of architectures

Software:

FastPCG; CUDA; AGGJE
