×

Algorithm 967: A distributed-memory fast multipole method for volume potentials. (English) Zbl 1371.65125


MSC:

65N38 Boundary element methods for boundary value problems involving PDEs
65Y05 Parallel numerical computation
65N80 Fundamental solutions, Green’s function methods, etc. for boundary value problems involving PDEs
35J05 Laplace operator, Helmholtz equation (reduced wave equation), Poisson equation
35Q30 Navier-Stokes equations
70F10 \(n\)-body problems
65N75 Probabilistic methods, particle methods, etc. for boundary value problems involving PDEs
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Paul Bode and Jeremiah P. Ostriker. 2008. Tree particle-mesh: An adaptive, efficient, and parallel code for collisionless cosmological simulation. The Astrophysical Journal Supplement Series 145, 1 (2008), 1.
[2] Carsten Burstedde, Georg Stadler, Laura Alisic, Lucas C. Wilcox, Eh Tan, Michael Gurnis, and Omar Ghattas. 2013. Large-scale adaptive mantle convection simulation. Geophysical Journal International 192, 3 (2013), 889–906. · doi:10.1093/gji/ggs070
[3] Carsten Burstedde, Lucas C. Wilcox, and Omar Ghattas. 2011. p4est: Scalable algorithms for parallel adaptive mesh refinement on forests of octrees. SIAM Journal on Scientific Computing 33, 3 (2011), 1103–1133. · Zbl 1230.65106 · doi:10.1137/100791634
[4] Aparna Chandramowlishwaran, Kamesh Madduri, and Richard Vuduc. 2010. Diagnosis, tuning, and redesign for multicore performance: A case study of the fast multipole method. In Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE Computer Society, 1–12. · doi:10.1109/SC.2010.19
[5] A. J. Christlieb, R. Krasny, I. D. Boyd, J. Emhoff, and J. Verboncoeur. 2005. Grid-free plasma simulations. In IEEE Conference Record-Abstracts of the IEEE International Conference on Plasma Science (ICOPS’05). IEEE, 246–246. · doi:10.1109/PLASMA.2005.359318
[6] David Colton and Rainer Kress. 1998. Inverse Acoustic and Electromagnetic Scattering Theory, 2nd Edition. Springer, Berlin, Germany. · Zbl 0893.35138 · doi:10.1007/978-3-662-03537-5
[7] G. H. Cottet and P. D. Koumoutsakos. 2000. Vortex Methods: Theory and Practice. Cambridge University Press, Cambridge, UK. · Zbl 1140.76002 · doi:10.1017/CBO9780511526442
[8] Andreas Dedner, Friedemann Kemm, Dietmar Kröner, C.-D. Munz, Thomas Schnitzer, and Matthias Wesenberg. 2002. Hyperbolic divergence cleaning for the MHD equations. Journal of Computational Physics 175, 2 (2002), 645–673. · Zbl 1059.76040 · doi:10.1006/jcph.2001.6961
[9] M. Doi and S. F. Edwards. 1988. The Theory of Polymer Dynamics. Vol. 73. Oxford University Press, Oxford, UK.
[10] Ron O. Dror, J. P. Grossman, Kenneth M. Mackenzie, Brian Towles, Edmond Chow, John K. Salmon, Cliff Young, Joseph A. Bank, Brannon Batson, Martin M. Deneroff, Jeffrey S. Kuskin, Richard H. Larson, Mark A. Moraes, and David E. Shaw. 2010. Exploiting 162-nanosecond end-to-end communication latency on anton. In 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 1–12. DOI:http://dx.doi.org/10.1109/SC.2010.23 · doi:10.1109/SC.2010.23
[11] Michael G. Duffy. 1982. Quadrature over a pyramid or cube of integrands with a singularity at a vertex. SIAM Journal on Numerical Analysis 19, 6 (1982), 1260–1262. · Zbl 0493.65011 · doi:10.1137/0719090
[12] Frank Ethridge and Leslie Greengard. 2001. A new fast-multipole accelerated Poisson solver in two dimensions. SIAM Journal on Scientific Computing 23, 3 (2001), 741–760. · Zbl 1002.65131 · doi:10.1137/S1064827500369967
[13] G. I. Fann, R. J. Harrison, G. Beylkin, J. Jia, R. Hartman-Baker, W. A. Shelton, and S. Sugiki. 2007. MADNESS applied to density functional theory in chemistry and nuclear physics. In Journal of Physics: Conference Series, Vol. 78. IOP Publishing, 012018. · doi:10.1088/1742-6596/78/1/012018
[14] A. Grama, A. Gupta, G. Karypis, and V. Kumar. 2003. An Introduction to Parallel Computing: Design and Analysis of Algorithms (second ed.). Addison Wesley, Reading, Mass., USA. · Zbl 0861.68040
[15] Max D. Gunzburger. 1989. Finite Element for Viscous Incompressible Flows. Academic Press. · Zbl 0697.76031
[16] Max D. Gunzburger and Roy A. Nicolaides. 1993. Incompressible Computational Fluid Dynamics. Cambridge University Press, Cambridge, UK. · Zbl 1190.76004 · doi:10.1017/CBO9780511574856
[17] T. Hamada, T. Narumi, R. Yokota, K. Yasuoka, K. Nitadori, and M. Taiji. 2009. 42 TFlops hierarchical N-body simulations on GPUs with applications in both astrophysics and turbulence. In Proceedings of SC09 (The SCxy Conference series). ACM/IEEE, Portland, OR.
[18] Robert J. Harrison, George I. Fann, Takeshi Yanai, Zhengting Gan, and Gregory Beylkin. 2004. Multiresolution quantum chemistry: Basic theory and initial applications. The Journal of Chemical Physics 121, 23 (2004), 11587–11598. · doi:10.1063/1.1791051
[19] Qi Hu, Nail A. Gumerov, and Ramani Duraiswami. 2011. Scalable fast multipole methods on distributed heterogeneous architectures. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 36. · doi:10.1145/2063384.2063432
[20] Tobin Isaac, Carsten Burstedde, and Omar Ghattas. 2012. Low-cost parallel algorithms for 2: 1 octree balance. In Proceedings of the IEEE 26th International Symposium on Parallel & Distributed Processing Symposium (IPDPS’12). IEEE, 426–437.
[21] Pritish Jetley, Lukasz Wesolowski, Filippo Gioachin, Laxmikant V. Kalé, and Thomas R. Quinn. 2010. Scaling hierarchical N-body simulations on GPU clusters. In Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE Computer Society, 1–11. · doi:10.1109/SC.2010.49
[22] A. V. Kravtsov, A. A. Klypin, and A. M. Khokhlov. 1997. Adaptive refinement tree: A new high-resolution N-body code for cosmological simulations. The Astrophysical Journal Supplement Series 111 (1997), 73–94. · doi:10.1086/313015
[23] Harper Langston, Leslie Greengard, and Denis Zorin. 2011. A free-space adaptive FMM-based PDE solver in three dimensions. Communications in Applied Mathematics and Computational Science 6, 1 (2011), 79–122. · Zbl 1230.65131 · doi:10.2140/camcos.2011.6.79
[24] Ilya Lashuk, Aparna Chandramowlishwaran, Harper Langston, Tuan-Anh Nguyen, Rahul Sampath, Aashay Shringarpure, Richard Vuduc, Lexing Ying, Denis Zorin, and George Biros. 2012. A massively parallel adaptive fast multipole method on heterogeneous architectures. Communications of the ACM 55, 5 (May 2012), 101–109. · doi:10.1145/2160718.2160740
[25] Keith Lindsay and Robert Krasny. 2001. A particle method and adaptive treecode for vortex sheet motion in three-dimensional flow. Journal of Computational Physics 172, 2 (2001), 879–907. · Zbl 1002.76093 · doi:10.1006/jcph.2001.6862
[26] James W. Lottes and Paul F. Fischer. 2005. Hybrid multigrid/Schwarz algorithms for the spectral element method. Journal of Scientific Computing 24, 1 (2005), 45–78. · Zbl 1078.65570 · doi:10.1007/s10915-004-4787-3
[27] Dhairya Malhotra and George Biros. 2015. PVFMM: A parallel kernel independent FMM for particle and volume potentials. Communications in Computational Physics 18, 3 (2015), 808–830. · Zbl 1388.65169 · doi:10.4208/cicp.020215.150515sw
[28] Peter McCorquodale, Phillip Colella, Gregory T. Balls, and Scott B. Baden. 2006. A local corrections algorithm for solving Poisson’s equation in three dimensions. Communications in Applied Mathematics and Computational Science 2, LBNL–62377 (2006). · Zbl 1133.65106
[29] Matthias Messner, Bérenger Bramas, Olivier Coulaud, and Eric Darve. 2012. Optimized M2L kernels for the Chebyshev interpolation based fast multipole method. arXiv preprint arXiv:1210.7292 (2012).
[30] Akira Nukada, Kento Sato, and Satoshi Matsuoka. 2012. Scalable multi-GPU 3-d FFT for TSUBAME 2.0 supercomputer. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC’12). IEEE, 1–10. · doi:10.1109/SC.2012.100
[31] James C. Phillips, Gengbin Zheng, Sameer Kumar, and Laxmikant V. Kalé. 2002. NAMD: Biomolecular simulation on thousands of processors. In Proceedings of Supercomputing (The SCxy Conference Series). ACM/IEEE, Baltimore, MD. · doi:10.1109/SC.2002.10019
[32] Joel R. Phillips and Jacob K. White. 1997. A precorrected-FFT method for electorstatic analysis of complicated 3-d structures. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 16, 10 (1997), 1059–1072. · Zbl 05449263 · doi:10.1109/43.662670
[33] Costas Pozrikidis. 2001. Interfacial dynamics for Stokes flow. Journal of Computational Physics 169 (2001), 250–301. · Zbl 1046.76012 · doi:10.1006/jcph.2000.6582
[34] I. F. Sbalzarini, J. H. Walther, M. Bergdorf, S. E. Hieber, E. M. Kotsalis, and P. Koumoutsakos. 2006. PPM–A highly efficient parallel particle–mesh library for the simulation of continuum systems. Journal of Computational Physics 215, 2 (2006), 566–588. · Zbl 1173.76398 · doi:10.1016/j.jcp.2005.11.017
[35] V. Springel, N. Yoshida, and S. D. M. White. 2001. GADGET: A code for collisionless and gasdynamical cosmological simulations. New Astronomy 6, 2 (2001), 79–117. · doi:10.1016/S1384-1076(01)00042-2
[36] Hari Sundar, George Biros, Carsten Burstedde, Johann Rudi, Omar Ghattas, and Georg Stadler. 2012. Parallel geometric-algebraic multigrid on unstructured forests of octrees. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis. IEEE Computer Society Press, 43. · doi:10.1109/SC.2012.91
[37] Hari Sundar, Dhairya Malhotra, and George Biros. 2013. HykSort: A new variant of hypercube quicksort on distributed memory architectures. In ICS’13: Proceedings of the International Conference on Supercomputing, 2013. · doi:10.1145/2464996.2465442
[38] Hari Sundar, Rahul S. Sampath, and George Biros. 2008. Bottom-up construction and 2:1 balance refinement of linear octrees in parallel. SIAM Journal on Scientific Computing 30, 5 (2008), 2675–2708. DOI:http://dx.doi.org/10.1137/070681727 · Zbl 1186.68554 · doi:10.1137/070681727
[39] Toru Takahashi, Cris Cecka, William Fong, and Eric Darve. 2012. Optimizing the multipole-to-local operator in the fast multipole method for graphical processing units. International Journal of Numerical Methods in Engineering 89, 1 (2012), 105–133. · Zbl 1242.68364 · doi:10.1002/nme.3240
[40] Michael S. Warren and John K. Salmon. 1993. A parallel hashed oct-tree n-body algorithm. In Proceedings of the 1993 ACM/IEEE Conference on Supercomputing. ACM, 12–21. · doi:10.1145/169627.169640
[41] Lexing Ying, George Biros, and Denis Zorin. 2004. A kernel-independent adaptive fast multipole method in two and three dimensions. Journal of Computatioinal Physics 196, 2 (2004), 591–626. · Zbl 1053.65095 · doi:10.1016/j.jcp.2003.11.021
[42] Lexing Ying, George Biros, and Denis Zorin. 2006. A high-order 3d boundary integral equation solver for elliptic PDEs in smooth domains. Journal of Computational Physics 219, 1 (2006), 247–275. · Zbl 1105.65115 · doi:10.1016/j.jcp.2006.03.021
[43] Rio Yokota and Lorena A. Barba. 2012. A tuned and scalable fast multipole method as a preeminent algorithm for exascale systems. International Journal of High Performance Computing Applications 26, 4 (2012), 337–346. · doi:10.1177/1094342011429952
[44] R. Yokota, J. P. Bardhan, M. G. Knepley, L. A. Barba, and T. Hamada. 2011. Biomolecular electrostatics using a fast multipole BEM on up to 512 GPUS and a billion unknowns. Computer Physics Communications (2011). · Zbl 1259.78044
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.