×

Performance of the 3D FFT on the 6D network torus QCDOC parallel supercomputer. (English) Zbl 1196.65209

Summary: QCDOC is a massively parallel supercomputer with tens of thousands of nodes distributed on a six-dimensional torus network. The 6D structure of the network provides the needed communication resources for many communication-intensive applications. In this paper, we present a parallel algorithm for three-dimensional Fast Fourier Transform and its implementation for a 4096-node QCDOC prototype. Two techniques have been used to increase its parallel performance: simultaneous multi-dimensional communication and communication-and-computation overlapping. Benchmarking experiments suggest that 3D FFTs of size \(128\times 128\times 128\) can scale well on such platforms up to 4096 nodes. Our performance results suggest stronger scalability on QCDOC than on IBM BlueGene/L supercomputer.

MSC:

65T50 Numerical methods for discrete and fast Fourier transforms
65Y99 Computer aspects of numerical algorithms
68W10 Parallel algorithms in computer science

Software:

FFTW; DL_POLY
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Buneman, O.; Dunn, D. A., Computer experiments in plasma physics, Sci. J. (U.K.), 2, 34 (1966)
[2] Taub, A. H.; Fernbach, S., Computers and their Role in the Physical Sciences (1970), Gordon and Breach: Gordon and Breach New York
[3] Simmerling, C.; Strockbine, B.; Roitberg, A. E., All-atom structure prediction and folding simulations of a stable protein, J. Am. Chem. Soc., 124, 11258 (2002)
[4] Kholmurodov, K.; Smith, W.; Yasuoka, K.; Darden, T.; Ebisuzaki, T., A smooth-particle mesh Ewald method for DL_POLY molecular dynamics simulation package on the Fujitsu VPP700, J. Comput. Chem., 21, 1187 (2000)
[5] Norberg, J.; Nilsson, L., On the truncation of long-range electrostatic interactions in DNA, Biophys. J., 79, 1537 (2000)
[6] Sagui, C.; Darden, T. A., Molecular dynamics simulations of biomolecules: Long-range electrostatic effects, Annu. Rev. Biophys. Biom., 28, 155 (1999)
[7] York, D. M.; Yang, W. T.; Lee, H.; Darden, T.; Pedersen, L. G., Toward the accurate modeling of DNA—the importance of long-range electrostatics, J. Am. Chem. Soc., 117, 5001 (1995)
[8] Darden, T. A.; Toukmaji, A.; Pedersen, L. G., Long-range electrostatic effects in biomolecular simulations, J. Chim. Phys. Pcb., 94, 1346 (1997)
[9] Leach, A. R., Molecular Modelling: Principles and Applications (2001), Prentice-Hall: Prentice-Hall Harlow, England; Reading, MA
[10] Hockney, R. W.; Eastwood, J. W., Computer Simulation using Particles (1981), McGraw-Hill International Book Co.: McGraw-Hill International Book Co. New York · Zbl 0662.76002
[11] Allen, M. P.; Tildesley, D. J., Computer Simulation of Liquids (1989), Clarendon Press, Oxford University Press: Clarendon Press, Oxford University Press Oxford [England] · Zbl 0703.68099
[12] Darden, T.; York, D.; Pedersen, L., Particle Mesh Ewald—an \(N \log(N)\) method for Ewald sums in large systems, J. Chem. Phys., 98, 10089 (1993)
[13] Deserno, M.; Holm, C., How to mesh up Ewald sums. I. A theoretical and numerical comparison of various particle mesh routines, Journal of Chemical Physics, 109, 7678 (1998)
[14] Deserno, M.; Holm, C., How to mesh up Ewald sums. II. An accurate error estimate for the particle-particle-particle-mesh algorithm, Journal of Chemical Physics, 109, 7694 (1998)
[15] Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G., A smooth particle mesh Ewald method, Journal of Chemical Physics, 103, 8577 (1995)
[16] Luty, B. A.; Davis, M. E.; Tironi, I. G.; Vangunsteren, W. F., A comparison of particle-particle, particle-mesh and Ewald methods for calculating electrostatic interactions in periodic molecular-systems, Molecular Simulation, 14, 11 (1994)
[17] Pollock, E. L.; Glosli, J., Comments on P(3)M, FMM, and the Ewald method for large periodic Coulombic systems, Computer Physics Communications, 95, 93 (1996) · Zbl 0921.65081
[18] Martyna, G. J.; Tuckerman, M. E., A reciprocal space based method for treating long range interactions in ab initio and force-field-based calculations in clusters, Journal of Chemical Physics, 110, 2810 (1999)
[19] Minary, P.; Morrone, J. A.; Yarne, D. A.; Tuckerman, M. E.; Martyna, G. J., Long range interactions on wires: A reciprocal space based formalism, Journal of Chemical Physics, 121, 11949 (2004)
[20] Minary, P.; Tuckerman, M. E.; Pihakari, K. A.; Martyna, G. J., A new reciprocal space based treatment of long range interactions on surfaces, Journal of Chemical Physics, 116, 5351 (2002)
[21] Bracewell, R. N., The Fourier Transform and its Applications (1986), McGraw-Hill: McGraw-Hill New York · Zbl 0068.16503
[22] Brigham, E. O., The Fast Fourier Transform (1974), Prentice-Hall: Prentice-Hall Englewood Cliffs, NJ · Zbl 0375.65052
[23] Edelman, A.; Mccorquodale, P.; Toledo, S., The future fast Fourier transform?, SIAM Journal on Scientific Computing, 20, 1094 (1999) · Zbl 0926.65145
[24] Frigo, M.; Johnson, S. G., The design and implementation of FFTW3, Proceedings of the IEEE, 93, 216 (2005)
[25] M. Frigo, S.G. Johnson, FFTW: An adaptive software architecture for the FFT, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 3, 1998, p. 1381; M. Frigo, S.G. Johnson, FFTW: An adaptive software architecture for the FFT, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 3, 1998, p. 1381
[26] Van Loan, C.; Society for Industrial and Applied Mathematics, Computational Frameworks for the Fast Fourier Transform (1992), SIAM: SIAM Philadelphia
[27] Boyle, P.; Chen, D.; Christ, N.; Clark, M.; Cohen, S.; Cristian, C.; Dong, Z.; Gara, A.; Joo, B.; Jung, C.; Kim, C.; Levkova, L.; Liao, X.; Liu, G.; Li, S.; Lin, H.; Mawhinney, R.; Ohta, S.; Petrov, K.; Wettig, T.; Yamaguchi, A., The QCDOC project, Nuclear Physics B, 140, Proc. Suppl., 169 (2005)
[28] Boyle, P. A.; Chen, D.; Christ, N. H.; Clark, M. A.; Cohen, S. D.; Cristian, C.; Dong, Z.; Gara, A.; Joo, B.; Jung, C.; Kim, C.; Levkova, L. A.; Liao, X.; Liu, G.; Mawhinney, R. D.; Ohta, S.; Petrov, K.; Wettig, T.; Yamaguchi, A., Overview of the QCDSP and QCDOC computers, IBM Journal of Research and Development, 49, 351 (2005)
[29] Chen, D.; Christ, N. H.; Cristian, C.; Dong, Z.; Gara, A.; Garg, K.; Joo, B.; Kim, C.; Levkova, L.; Liao, X.; Mawhinney, R. D.; Ohta, S.; Wettig, T., QCDOC: A 10-teraflops scale computer for lattice QCD, Nuclear Physics B, 94, Proc. Suppl., 825 (2001)
[30] P. Rissland, Y. Deng, Structure and Performance of molecular dynamics package MDoC on QCDOC, Parallel Computing, in press. Available online January 2007; P. Rissland, Y. Deng, Structure and Performance of molecular dynamics package MDoC on QCDOC, Parallel Computing, in press. Available online January 2007
[31] Haynes, P. D.; Cote, M., Parallel fast Fourier transforms for electronic structure calculations, Computer Physics Communications, 130, 130 (2000) · Zbl 0963.65151
[32] M. Frigo, S.G. Johnson, The Fastest Fourier transform in the west, Technical Report, 1997; M. Frigo, S.G. Johnson, The Fastest Fourier transform in the west, Technical Report, 1997
[33] Gara, A.; Blumrich, M. A.; Chen, D.; Chiu, G. L.T.; Coteus, P.; Giampapa, M. E.; Haring, R. A.; Heidelberger, P.; Hoenicke, D.; Kopcsay, G. V.; Liebsch, T. A.; Ohmacht, M.; Steinmacher-Burow, B. D.; Takken, T.; Vranas, P., Overview of the Blue Gene/L system architecture, IBM Journal of Research and Development, 49, 195 (2005)
[34] Adiga, N. R.; Blumrich, M. A.; Chen, D.; Coteus, P.; Gara, A.; Giampapa, M. E.; Heidelberger, P.; Singh, S.; Steinmacher-Burow, B. D.; Takken, T.; Tsao, M.; Vranas, P., Blue Gene/L torus interconnection network, IBM Journal of Research and Development, 49, 265 (2005)
[35] Eleftheriou, M.; Fitch, B. G.; Rayshubskiy, A.; Ward, T. J.C.; Germain, R. S., Scalable framework for 3D FFTs on the Blue Gene/L supercomputer: Implementation and early performance measurements, IBM Journal of Research and Development, 49, 457 (2005)
[36] Eleftheriou, M.; Moreira, J. E.; Fitch, B. G.; Germain, R. S., A volumetric FFT for BlueGene/L, High Performance Computing—HIPC 2003, 2913, 194 (2003)
[37] QMP, LQCD Message Passing API, version 2.0, 2004; QMP, LQCD Message Passing API, version 2.0, 2004
[38] Boyle, P. A.; Chen, D.; Christ, N. H.; Clark, M.; Cohen, S. D.; Cristian, C.; Dong, Z.; Gara, A.; Joo, B.; Jung, C.; Kim, C.; Levkova, L.; Liao, X.; Li, S.; Lin, H.; Liu, G.; Mawhinney, R. D.; Ohta, S.; Petrov, K.; Wettig, T.; Yamaguchi, A., The status of user software on QCDOC, Nuclear Physics B, 140, Proc. Suppl., 829 (2005)
[39] Kumar, V., Introduction to Parallel Computing: Design and Analysis of Algorithms (1994), Benjamin/Cummings Pub. Co.: Benjamin/Cummings Pub. Co. Redwood City, CA
[40] Fitch, B. G.; Germain, R. S.; Mendell, M.; Pitera, J.; Pitman, M.; Rayshubskiy, A.; Sham, Y.; Suits, F.; Swope, W.; Ward, T. J.C.; Zhestkov, Y.; Zhou, R., Blue Matter, an application framework for molecular simulation on Blue Gene, Journal of Parallel and Distributed Computing, 63, 759 (2003)
[41] B. Fang, Y. Deng, G.J. Martyna, A fine grained parallel smooth particle mesh Ewald algorithm for biophysical simulation studies: Application to the 6D torus QCDOC supercomputer, Comput. Phys. Comm. (2007), in press; B. Fang, Y. Deng, G.J. Martyna, A fine grained parallel smooth particle mesh Ewald algorithm for biophysical simulation studies: Application to the 6D torus QCDOC supercomputer, Comput. Phys. Comm. (2007), in press · Zbl 1196.82047
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.