zbMATH — the first resource for mathematics

A flexible framework for multidimensional DFTs. (English) Zbl 1451.65243
65T50 Numerical methods for discrete and fast Fourier transforms
65Y05 Parallel numerical computation
65Y10 Numerical algorithms for specific classes of architectures
[1] Riken AICS. https://www.r-ccs.riken.jp/en/.
[2] J. Bruck, C.-T. Ho, S. Kipnis, E. Upfal, and D. Weathersby, Efficient algorithms for all-to-all communications in multiport message-passing systems, IEEE Trans. Parallel Distrib. Systems, 8 (1997), pp. 1143-1156.
[3] A. Canning, L. Wang, A. Williamson, and A. Zunger, Parallel empirical pseudopotential electronic structure calculations for million atom systems, J. Comput. Phys., 160 (2000), pp. 29-41. · Zbl 0963.65110
[4] J. Choi, J. J. Dongarra, R. Pozo, and D. W. Walker, ScaLAPACK: A scalable linear algebra library for distributed memory concurrent computers, in Proceedings of the Fourth Symposium on the Frontiers of Massively Parallel Computation, IEEE Computer Society, Los Alamitos, CA, 1992, pp. 120-127.
[5] J. W. Cooley and J. W. Tukey, An algorithm for the machine calculation of complex Fourier series, Math. Comp., 19 (1965), pp. 297-301. · Zbl 0127.09002
[6] I. T. Foster and P. H. Worley, Parallel algorithms for the spectral transform method, SIAM J. Sci. Comput., 18 (1997), pp. 806-837. · Zbl 0872.65094
[7] F. Franchetti, Y. Voronenko, and M. Püschel, FFT program generation for shared memory: SMP and multicore, in Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, ACM, New York, 2006, pp. 115-es. · Zbl 1177.94043
[8] M. Frigo and S. G. Johnson, The design and implementation of FFTW3, Proc. IEEE, 93 (2005), pp. 216-231.
[9] M. Frigo and S. G. Johnson, FFTW: Fastest Fourier transform in the West, Astrophysics Source Code Library (2012).
[10] M. A. Inda and R. H. Bisseling, A simple and efficient parallel FFT algorithm using the BSP model, Parallel Comput., 27 (2001), pp. 1847-1878. · Zbl 0983.68248
[11] J. Johnson and X. Xu, A recursive implementation of the dimensionless FFT, in Acoustics, Speech, and Signal Processing, 2003 Proceedings, ICASSP’03, Vol. 2, IEEE, Piscataway, NJ, 2003, pp. 649-652.
[12] J. Jung, C. Kobayashi, T. Imamura, and Y. Sugita, Parallel implementation of 3D FFT with volumetric decomposition schemes for efficient molecular dynamics simulations, Comput. Phys. Commun., 200 (2016), pp. 57-65. · Zbl 1352.65660
[13] R. A. Kendall, E. Aprà, D. E. Bernholdt, E. J. Bylaska, M. Dupuis, G. I. Fann, R. J. Harrison, J. Ju, J. A. Nichols, J. Nieplocha, T. P. Straatsma, T. L. Windus, and A. T. Wong, High performance computational chemistry: An overview of NWChem a distributed parallel application, Comput. Phys. Commun., 128 (2000), pp. 260-283. · Zbl 1002.81571
[14] R. A. Lebensohn, N-site modeling of a 3D viscoplastic polycrystal using fast Fourier transform, Acta Mater., 49 (2001), pp. 2723-2737.
[15] R. A. Lebensohn, A. K. Kanjarla, and P. Eisenlohr, An elasto-viscoplastic formulation based on fast Fourier transforms for the prediction of micromechanical fields in polycrystalline materials, Int. J. Plast., 32 (2012), pp. 59-69.
[16] S.-B. Lee, R. Lebensohn, and A. D. Rollett, Modeling the viscoplastic micromechanical response of two-phase materials using fast Fourier transforms, Int. J. Plast., 27 (2011), pp. 707-727. · Zbl 1405.74012
[17] OpenMP Architecture Review Board, OpenMP Application Program Interface Version 4.0, May 2018, https://www.openmp.org/.
[18] D. Pekurovsky, P3DFFT: A Framework for Parallel Computations of Fourier Transforms in Three Dimensions, SIAM J. Sci. Comput., 34 (2012), pp. C192-C209, https://doi.org/10.1137/11082748X. · Zbl 1253.65205
[19] S. Plimpton, R. Pollock, and M. Stevens, Particle-mesh Ewald and RRESPA for parallel molecular dynamics simulations, in Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, SIAM, Philadelphia, 1997.
[20] S. J. Plimpton, FFTs for (mostly) particle codes within the DOE exascale computing project, Technical report, SAND2017-127D1PE, Sandia National Laboratory, Albuquerque, NM, 2017.
[21] S. J. Plimpton and A. P. Thompson, Computational aspects of many-body potentials, MRS Bull., 37 (2012), pp. 513-521.
[22] D. T. Popovici, F. Franchetti, and T. M. Low, Mixed data layout kernels for vectorized complex arithmetic, in 2017 IEEE High Performance Extreme Computing Conference, HPEC 2017, Waltham, MA, 2017, IEEE, Piscataway, NJ, 2017, pp. 1-7.
[23] D. T. Popovici, T.-M. Low, and F. Franchetti, Large bandwidth-efficient FFTs on multicore and multi-socket systems, in IEEE International Parallel and Distributed Processing Symposium (IPDPS), IEEE, Piscataway, NJ, 2018.
[24] J. Poulson, B. Marker, R. A. Van de Geijn, J. R. Hammond, and N. A. Romero, Elemental: A new framework for distributed memory dense matrix computations, ACM Trans. Math. Software, 39 (2013), 13. · Zbl 1295.65137
[25] B. Prisacari, G. Rodriguez, C. Minkenberg, and T. Hoefler, Bandwidth-optimal all-to-all exchanges in fat tree networks, in Proceedings of the 27th International ACM Conference on Supercomputing, ACM, New York, 2013, pp. 139-148.
[26] M. D. Schatz, Distributed Tensor Computations: Formalizing Distributions, Redistributions, and Algorithm Derivation, PhD thesis, University of Texas at Austin, Austin, TX, 2015.
[27] M. Snir, S. W. Otto, S. Huss-Lederman, D. W. Walker, and J. Dongarra, MPI: The Complete Reference, The MIT Press, Cambridge, MA, 1996.
[28] T. Straatsma, E. Bylaska, H. van Dam, N. Govind, W. de Jong, K. Kowalski, and M. Valiev, Advances in scalable computational chemistry: NWChem, in Annual Reports in Computational Chemistry, Vol. 7, Elsevier, Amsterdam, 2011, pp. 151-177.
[29] P. N. Swarztrauber, Multiprocessor FFTs, Parallel Comput., 5 (1987), pp. 197-210.
[30] R. A. Sweet, W. L. Briggs, S. Oliveira, J. L. Porsche, and T. Turnbull, FFTs and three-dimensional Poisson solvers for hypercubes, Parallel Comput., 17 (1991), pp. 121-131. · Zbl 0742.65074
[31] D. Takahashi, An implementation of parallel 3-D FFT with 2-D decomposition on a massively parallel cluster of multi-core processors, in Parallel Processing and Applied Mathematics, 8th International Conference, PPAM 2009, Wroclaw, Poland, Revised Selected Papers, Part I, Czestochowa University of Technology, Czestochowa, Poland, 2009, pp. 606-614.
[32] M. Valiev, E. J. Bylaska, N. Govind, K. Kowalski, T. P. Straatsma, H. J. Van Dam, D. Wang, J. Nieplocha, E. Apra, T. L. Windus, and W. A. de Jong, NWChem: a comprehensive and scalable open-source solution for large scale molecular simulations, Comput. Phys. Commun., 181 (2010), pp. 1477-1489. · Zbl 1216.81179
[33] C. Van Loan, Computational Frameworks for the Fast Fourier Transform, SIAM, Philadelphia, 1992. · Zbl 0757.65154
[34] J.-L. Vay, A. Almgren, J. Bell, L. Ge, D. Grote, M. Hogan, O. Kononenko, R. Lehe, A. Myers, C. Ng, J. Park, R. Ryne, O. Shapoval, M. Thévenet, and W. Zhang, Warp-X: A new exascale computing platform for beam-plasma simulations, Nucl. Instrum. Methods Phys. Res. Sect. A, 909 (2018), pp. 476-479.
[35] J.-L. Vay, I. Haber, and B. B. Godfrey, A domain decomposition method for pseudo-spectral electromagnetic simulations of plasmas, J. Comput. Phys., 243 (2013), pp. 260-268. · Zbl 1349.82126