FFT Explained: Why the Fast Fourier Transform Changed the World
The Cooley-Tukey FFT algorithm reduces the Discrete Fourier Transform from O(N²) multiplications to O(N log N) — a speedup that made real-time audio processing, JPEG compression, wireless modulation, and radar signal analysis computationally feasible. The algorithm is 60 years old, but understanding why it works is still one of the most elegant insights in computer science.
1. The Discrete Fourier Transform
The DFT converts a sequence of N time-domain samples into N frequency-domain coefficients. Each output X[k] measures how much of the frequency f_k = k/N (in cycles per sample) is present in the input.
X[k] = Σ_{n=0}^{N−1} x[n] · e^{−j 2π k n / N} k = 0, 1, ..., N−1
Inverse DFT:
x[n] = (1/N) · Σ_{k=0}^{N−1} X[k] · e^{+j 2π k n / N}
Complex exponential: e^{−j θ} = cos θ − j sin θ
X[k] interpreted: amplitude |X[k]| at frequency k/N cycles/sample
phase arg(X[k]) = phase shift of that frequency component
The DFT is used everywhere: audio spectrum analysis, noise filtering, image compression (DCT), convolution, correlation, and modulation schemes. The problem is computing it fast.
2. Why the DFT is O(N²)
To compute each X[k], you sum N terms. There are N values of k. So the total number of complex multiplications is N × N = N².
| N (samples) | DFT ops (N²) | FFT ops (N log₂N) | Speedup |
|---|---|---|---|
| 64 | 4 096 | 384 | ~10× |
| 1 024 | 1 048 576 | 10 240 | ~100× |
| 65 536 | 4 294 967 296 | 1 048 576 | ~4 000× |
| 1 048 576 (1M) | 10¹² | 20 971 520 | ~50 000× |
3. Divide and Conquer: The Even/Odd Split
The key insight (Cooley-Tukey, 1965; discovered earlier by Gauss in ~1805): split the N-point DFT into two N/2-point DFTs — one over the even-indexed samples, one over the odd-indexed.
Split into even (n = 2m) and odd (n = 2m+1) indices:
X[k] = Σ_{m=0}^{N/2−1} x[2m] · W_N^{2km} + Σ_{m=0}^{N/2−1} x[2m+1] · W_N^{k(2m+1)}
= Σ_{m=0}^{N/2−1} x[2m] · W_{N/2}^{km} + W_N^k · Σ_{m=0}^{N/2−1} x[2m+1] · W_{N/2}^{km}
= E[k] + W_N^k · O[k]
where E[k] = DFT of even samples (N/2-point DFT)
O[k] = DFT of odd samples (N/2-point DFT)
using W_N² = W_{N/2}
This halves the problem size. Because E[k] and O[k] are periodic with period N/2, we only need to compute them once and reuse for k and k + N/2:
X[k + N/2] = E[k] − W_N^k · O[k]
(because W_N^{k+N/2} = −W_N^k by periodicity)
Recursion: T(N) = 2·T(N/2) + O(N)
Solves to: T(N) = O(N log N)
4. The Butterfly Diagram
The butterfly is the fundamental unit of the FFT — two inputs produce two outputs in one step. In an N=8 FFT, three stages of N/2 = 4 butterflies compute the full transform.
Each "X" crossing is a butterfly operation. In an N-point FFT, there are log₂N stages, each with N/2 butterflies — giving N/2 × log₂N total butterfly operations.
5. Twiddle Factors and Periodicity
The term W_N^k = e^{−j2πk/N} is called the twiddle factor (or rotation factor). The key symmetry that makes FFT efficient:
Conjugate sym: W_N^{k+N/2} = −W_N^k (180° rotation → negation)
Reduction: W_N^{2k} = W_{N/2}^k (half-size twiddle)
First 4 twiddles (N=8):
W₈⁰ = 1 (no rotation)
W₈¹ = (√2/2)(1−j) ≈ 0.707 − 0.707j (−45°)
W₈² = −j = (0−1j) (−90°)
W₈³ ≈ −0.707 − 0.707j (−135°)
The −W_N^k = W_N^{k+N/2} symmetry means each butterfly reuses the same twiddle for two outputs — halving the number of multiplications again.
6. Bit-Reversal Permutation
The Cooley-Tukey algorithm naturally produces output in a scrambled order. For in-place computation, the input must first be reordered using bit-reversal permutation: the index n is replaced by its binary digits reversed.
Index Binary Reversed New index
0 000 000 0
1 001 100 4
2 010 010 2
3 011 110 6
4 100 001 1
5 101 101 5
6 110 011 3
7 111 111 7
Swap: (0↔0), (1↔4), (2↔2), (3↔6), (4↔1)*, (5↔5), (6↔3)*, (7↔7)
(* already handled by the reverse swap — only swap when rev > i to avoid double-swap)
7. JavaScript FFT Implementation
A clean iterative (in-place) Cooley-Tukey radix-2 FFT in JavaScript:
8. Applications: Audio, Images, OFDM
Audio Spectrum and Filtering
Audio processing uses FFT continuously: take N samples (window), FFT → modify frequency bins (equaliser, noise cancellation, pitch detection) → IFFT → overlap-add to reconstruct. Real-time audio at 44.1 kHz uses N = 512–4096 point FFTs refreshed 10–100 times per second.
Fast Convolution O(N log N)
The convolution theorem: pointwise multiplication in frequency domain = convolution in time domain. Convolving two N-point signals directly: O(N²). Via FFT → multiply → IFFT: O(N log N). Used in: audio reverb (convolution reverb with impulse responses), image blurring, polynomial multiplication.
(x * h)[n] = IFFT{ FFT{x} · FFT{h} }
Complexity: 3 × O(N log N) vs O(N²) direct
Crossover: FFT faster when N > ~32 samples
JPEG: The 2D DCT
JPEG compression uses the 2D Discrete Cosine Transform (DCT-II) on 8×8 blocks. The DCT is a real-valued specialisation of the DFT (only cosine basis, no sine). Most energy concentrates in the low-frequency DCT coefficients → quantise high-freq coefficients aggressively. The 8×8 2D-DCT is computed via two 1D DCTs (row then column) — each O(8 log 8) ≈ O(24).
OFDM Wireless Modulation
OFDM (Orthogonal Frequency-Division Multiplexing) — used in Wi-Fi, 4G/5G, DAB radio — encodes data on N orthogonal subcarriers. The IFFT converts frequency-domain symbols (QAM) into a time-domain waveform for transmission. The receiver FFTs the received signal back to frequency domain. N = 64 (Wi-Fi 802.11a/g), 2048 (LTE), 4096 (5G NR).
| Application | Typical N | Domain |
|---|---|---|
| Audio spectrum display | 512–4096 | 1D FFT, real input |
| MP3 / AAC compression | 512–2048 (MDCT) | Modified DCT |
| JPEG compression | 8×8 blocks (DCT) | 2D DCT |
| MRI image reconstruction | 256²–1024² | 2D FFT (k-space) |
| Radar pulse compression | 512–65536 | 1D FFT matched filter |
| 5G NR OFDM | 128–4096 | IFFT for modulation |
| Polynomial multiplication | Any power of 2 | 1D FFT (NTT for exact) |