An algorithm for the machine calculation of complex Fourier series.

*(English)* Zbl 0127.09002

The first review of 1965 simply reads “Ein Zeit und Speicherraum sparendes Verfahren zur elektronischen Berechnung komplexer Fourierreihen mit vorgegebenen Koeffizienten”, i.e., “A time- and memory-space-saving algorithm for the electronic computation of complex Fourier series with prescribed coefficients.” Forty-five years later, the result of this short note has grown into manifold applications.

The discrete Fourier transform with \(N\) degrees of freedom is equivalent to the evaluation of the polynomial

\[ f(z)=\sum\limits_{k=0}^{N-1} \alpha_k z^k \]

at the \(N\) points \(1,\omega,\omega^2, \dots, \omega^{N-1}\), where \(\omega=e^{2 \pi i / N}\) is a primitive root of \(z^N-1=0\). The straightforward computation requires \(N^2\) multiplications.
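A minimal sketch of this straightforward evaluation in plain Python (the function name is my own); it performs the full \(N^2\) multiplications described above, with \(\omega=e^{2\pi i/N}\) as in the review:

```python
import cmath

def dft_naive(alpha):
    """Evaluate f(z) = sum_k alpha_k z^k at the N points omega^l,
    l = 0, ..., N-1, with omega = exp(2*pi*i/N): N^2 multiplications."""
    n = len(alpha)
    omega = cmath.exp(2j * cmath.pi / n)
    return [sum(alpha[k] * omega**(k * l) for k in range(n))
            for l in range(n)]

res = dft_naive([1, 2, 3, 4])  # approximately [10, -2-2j, -2, -2+2j]
```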

Cooley noticed that the computing effort can be reduced to (a multiple of) \(N (r_1+r_2+ \cdots + r_n)\) if \(N=r_1\,r_2 \cdots r_n\); in particular, it drops to \(N\log N\) if \(N=2^n\). Encouraged by Garwin, who needed a fast means of computing Fourier transforms in his own research, Cooley and Tukey provided computer code for what is now the famous Cooley-Tukey algorithm.

By now there is an uncountable number of papers on the algorithm, and it is impossible to provide an exhaustive bibliography. The present review can therefore contain only a personal selection of pointers to the topic, and the same holds for the references. Today it seems unbelievable that Garwin had to put some pressure on the authors before he could convince them to publish a paper on the algorithm.

Seen from the paper by Cooley and Tukey, it is clear that Gauß already knew the procedure in 1805, and Runge in 1903 [cf. C.F. Gauß, Nachlass: “Theoria interpolationis methodo nova tractata”, Werke Band III, 265–327 (Königliche Gesellschaft der Wissenschaften, Göttingen, 1866), here p. 325, and M.T. Heideman, D.H. Johnson, C.S. Burrus, Arch. Hist. Exact Sci. 34, 265–277 (1985; Zbl 0577.01027)]. Gauß calculated a sine series describing the motion of the asteroid Juno; Runge’s discussion is also hard to read today. There are more predecessors, but in the pre-computer era there was no comparable demand for algorithms for large-scale problems.

The fast Fourier transform (FFT) is important not only for problems such as signal processing that are directly related to the Fourier transform, but for any kind of numerical treatment of convolution problems. The fast multiplication of large numbers and, in connection with this application, of polynomials of high degree belongs to the important applications.
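The link to convolution can be made concrete. The sketch below (plain Python with my own naming; a naive \(O(N^2)\) DFT stands in for the FFT for brevity) checks the convolution theorem underlying these applications: the DFT of a circular convolution equals the elementwise product of the DFTs.

```python
import cmath

def dft(a):
    """Naive DFT with the review's convention omega = exp(2*pi*i/N)."""
    n = len(a)
    omega = cmath.exp(2j * cmath.pi / n)
    return [sum(a[k] * omega**(k * l) for k in range(n)) for l in range(n)]

def circular_convolution(a, b):
    """Direct O(N^2) circular (cyclic) convolution of two sequences."""
    n = len(a)
    return [sum(a[k] * b[(l - k) % n] for k in range(n)) for l in range(n)]

a, b = [1, 2, 3, 4], [5, 6, 7, 8]
lhs = dft(circular_convolution(a, b))            # transform of the convolution
rhs = [x * y for x, y in zip(dft(a), dft(b))]    # product of the transforms
assert all(abs(x - y) < 1e-9 for x, y in zip(lhs, rhs))
```

Replacing `dft` by a fast transform turns the \(O(N^2)\) convolution into an \(O(N\log N)\) one, which is exactly how large numbers and high-degree polynomials are multiplied quickly.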

The Cooley-Tukey algorithm reduces the effort from \(N^2\) to \(N \log N\). There are fast algorithms for other numerical procedures as well, showing that the asymptotic complexity may be smaller than the knowledge of straightforward procedures suggests; we mention the fast multiplication of large matrices. Strassen and Winograd, who investigated such complexity problems, were certainly inspired by the FFT. Nevertheless, there is a difference: the fast Fourier transform is highly efficient not only for large \(N\) but also for small \(N\). Moreover, it is more robust against rounding errors than the simple summation in the straightforward evaluation. Specifically, when multiplying large numbers, the computing time is increased only by a factor depending on \(\log\log n\) in order to cope with rounding errors.

In the case \(N=2^n\) one can explain the algorithm to students via repeated use of a simple even-odd reduction. Starting from the \(N^2\) law, the solution of two problems of size \(N/2\) requires \(2 (N/2)^2=N^2/2\) operations, and the repetition eventually leads to the \(N \log N\) law. Moreover, the bit reversal that appears in the appropriate indexing is a good hint at the duality character.
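The count can be made explicit with the standard divide-and-conquer recurrence: if one reduction step costs at most \(cN\) operations and leaves two problems of size \(N/2\), then for \(N=2^n\)

\[ T(N) \le 2\,T(N/2) + cN \quad\Longrightarrow\quad T(N) \le cN\log_2 N + N\,T(1), \]

which is the \(N\log N\) law after unrolling the \(\log_2 N\) levels of the recursion.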

Indeed, defining \(f_{\ell}:=f(\omega^{\ell})\) we want to calculate

\[ f_{\ell} = \sum_{k=0}^{N-1} \alpha_k \omega^{k{\ell}}. \]

The first reduction refers to

\[ \begin{aligned} f_{2{\ell}} &= \sum_{k=0}^{N/2-1} \Bigl(\alpha_k \omega^{2k{\ell}} +\alpha_{k+N/2} \omega^{2(k+N/2){\ell}} \Bigr) \\ &= \sum_{k=0}^{N/2-1} \beta_k (\omega^2)^{k{\ell}}, \\ & \text{where ~} \beta_k = \alpha_k + \alpha_{k+N/2}\,. \\ f_{2{\ell}+1} &= \sum_{k=0}^{N/2-1} \Bigl(\alpha_k \omega^{2k{\ell}} \omega^k \\ &\qquad +\alpha_{k+N/2} \omega^{2(k+N/2){\ell}} \omega^{k+N/2}\Bigr) \\ &= \sum_{k=0}^{N/2-1} \gamma_k (\omega^2)^{k{\ell}}, \\ & \text{where ~} \gamma_k = \omega^k(\alpha_k - \alpha_{k+N/2})\,.\end{aligned} \]

Since \(\omega^2\) is a primitive root of \(z^{N/2}-1=0\), the values for even and for odd indices are indeed given in terms of a discrete Fourier transform of half the size. The computing effort for the reduction is proportional to \(N\). In the second step the procedure is repeated for two problems of size \(N/2\), and after \(\log_2 N\) steps, each with \(O(N)\) effort, we are done. The following table shows, for \(N=8\), the repeated splitting into subproblems and how the indices of the function values are reordered during the \(3=\log_2 8\) steps. The indices in the third column are written as binary numbers; if we read their bits from right to left, we obtain the numbers \(0, 1,2,\ldots,7\) in natural order and see the bit reversal.

\[ \begin{array}{ccc|c} 0 & 0 & 0 & 000\\ 1 & 2 & 4 & 100\\ \hline 2 & 4 & 2 & 010 \\ 3 & 6 & 6 & 110 \\ \hline 4 & 1 & 1 & 001 \\ 5 & 3 & 5 & 101 \\ \hline 6 & 5 & 3 & 011 \\ 7 & 7 & 7 & 111 \end{array} \]
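The reduction translates directly into a short recursive program. Below is a sketch in plain Python (function names are my own; \(N\) must be a power of two) following exactly the \(\beta_k\), \(\gamma_k\) formulas above; the last lines reproduce the bit-reversed ordering of the table for \(N=8\):

```python
import cmath

def fft(a):
    """Radix-2 FFT via the even-odd reduction:
    beta_k  = a_k + a_{k+N/2}            -> gives the f_{2l}
    gamma_k = omega^k (a_k - a_{k+N/2})  -> gives the f_{2l+1}"""
    n = len(a)
    if n == 1:
        return a[:]
    omega = cmath.exp(2j * cmath.pi / n)
    half = n // 2
    beta = [a[k] + a[k + half] for k in range(half)]
    gamma = [omega**k * (a[k] - a[k + half]) for k in range(half)]
    out = [0j] * n
    out[0::2] = fft(beta)    # transform of half the size for even indices
    out[1::2] = fft(gamma)   # transform of half the size for odd indices
    return out

# Reversing the 3-bit indices reproduces the table's third column:
rev = [int(format(i, '03b')[::-1], 2) for i in range(8)]
# rev == [0, 4, 2, 6, 1, 5, 3, 7]
```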

In the case of a decomposition \(N=r_1\,r_2 \cdots r_n\) with factors \(r_i\not=2\), considerations from group theory yield a better understanding of the mathematics behind the Cooley-Tukey algorithm. From an algebraic point of view, the cyclic discrete Fourier transform can be explained in terms of group algebras of cyclic groups and their representations. The elements \(1,z,z^2, \dots, z^{N-1}\) form a \( \mathbb{C}\)-basis of the algebra \(\mathbb{C}[z]/(z^N-1)\), and this basis is a cyclic group of order \(N\). This concept can be generalized to fast convolution procedures in arbitrary finite groups. In this context we mention that the braid group was considered in connection with the fast multiplication of large matrices and the complexity of that problem. The literature on the algebraic and group-theoretic aspects makes up only a very small portion compared to that on the algorithmic aspects and the applications in signal theory.
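The algebraic viewpoint can be stated precisely: since \(z^N-1=\prod_{\ell=0}^{N-1}(z-\omega^{\ell})\) has \(N\) distinct roots, the Chinese remainder theorem yields an isomorphism of \(\mathbb{C}\)-algebras

\[ \mathbb{C}[z]/(z^N-1) \;\xrightarrow{\;\cong\;}\; \bigoplus_{\ell=0}^{N-1} \mathbb{C}[z]/(z-\omega^{\ell}) \cong \mathbb{C}^N, \qquad f \mapsto \bigl(f(1), f(\omega), \dots, f(\omega^{N-1})\bigr), \]

and this isomorphism is exactly the discrete Fourier transform; multiplication in \(\mathbb{C}[z]/(z^N-1)\), i.e., cyclic convolution, becomes pointwise multiplication on the right-hand side.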

The Cooley-Tukey algorithm is not only important for the applications in connection with the fast Fourier transform and convolution, it also gave rise to interesting mathematics.

##### MSC:

| Code | Description |
| --- | --- |
| 42-04 | Software, source code, etc. for problems pertaining to harmonic analysis on Euclidean spaces |
| 65E05 | General theory of numerical methods in complex analysis (potential theory, etc.) |
| 68W99 | Algorithms in computer science |

##### Keywords:

numerical analysis

\textit{J. W. Cooley} and \textit{J. W. Tukey}, Math. Comput. 19, 297--301 (1965; Zbl 0127.09002)

