Wavelet Transform — The Language of Time and Frequency
The Fourier transform is one of the most powerful tools in mathematics — but it has a fundamental limitation: it tells you which frequencies a signal contains, but not when they occur. A symphony and a randomly scrambled version of that symphony have identical Fourier magnitude spectra. Wavelet transforms solve this by using oscillations that are localised in time, giving a picture of how frequency content evolves — from earthquake seismograms and financial time series to medical imaging and the JPEG2000 compression standard.
1. Heisenberg Uncertainty in Time-Frequency Analysis
The uncertainty principle is not only a statement about quantum mechanics — it is a fundamental mathematical theorem about any signal. For a signal f(t) with Fourier transform F(ω), define the time spread Δt and frequency spread Δω as the standard deviations of |f(t)|² and |F(ω)|²:
The equality holds only for Gaussian signals (the "optimal" time-frequency localisation). This has a direct implication for analysis: you cannot achieve perfect time resolution and perfect frequency resolution simultaneously.
The Short-Time Fourier Transform (STFT) addresses this by multiplying the signal by a sliding window g(t−τ) and computing the Fourier transform at each window position:
But the STFT has a crucial limitation: the window size is fixed. A short window gives good time resolution but poor frequency resolution; a long window gives good frequency resolution but poor time resolution. The same trade-off applies at all frequencies. Wavelet transforms escape this prison by using a variable window size that automatically adapts to the frequency being analysed.
2. What Is a Wavelet?
A wavelet is a function ψ(t) that satisfies two conditions:
The zero-mean condition (1) means the wavelet oscillates — it has positive and negative parts that cancel out. This is what makes it a "wavelet" (small wave) rather than a general function. The admissibility condition (3) ensures the transform is invertible.
From a mother wavelet ψ(t), a family of daughter wavelets is generated by scaling and translating:
The 1/√|a| factor ensures each daughter wavelet has the same energy as the mother. As a increases, the wavelet stretches out, becomes wider, and captures lower-frequency content. As a decreases, the wavelet compresses, becomes narrower, and captures higher-frequency content. This is the key: the analysis window automatically adapts to the frequency scale being studied.
3. The Continuous Wavelet Transform (CWT)
The Continuous Wavelet Transform of a signal f(t) is defined as the inner product of f with each daughter wavelet:
The CWT produces a 2D map W_f(a, b) — a scalogram — showing how much of the signal's energy at time b is at scale a (which corresponds inversely to frequency). A large |W_f(a, b)|² means the signal looks like ψ at scale a near time b.
The signal can be perfectly reconstructed from its CWT via the inverse CWT:
The CWT is highly redundant: a 1D signal of N samples is mapped to a 2D surface with many more values. This redundancy is useful for analysis and visualisation but wasteful for compression. For compression and fast computation, the Discrete Wavelet Transform is preferred.
Commonly Used Mother Wavelets for CWT
- Morlet wavelet: ψ(t) = π^(−1/4) e^(iω₀t) e^(−t²/2). A complex sinusoid modulated by a Gaussian. Optimal time-frequency localisation. Widely used in geophysics and neuroscience.
- Mexican hat (Ricker wavelet): ψ(t) = (2/√3)π^(−1/4)(1−t²)e^(−t²/2). The second derivative of a Gaussian. Real-valued, good for detecting ridges and edges.
- Complex Gaussian derivatives: Used in edge detection and image analysis.
4. The Haar Wavelet — Simplicity and Its Price
The Haar wavelet, first described by Alfred Haar in 1909, is the simplest possible orthonormal wavelet:
The corresponding scaling function (father wavelet) is the box function: φ(t) = 1 for 0 ≤ t < 1, 0 otherwise. The Haar transform of a discrete signal [x₁, x₂, x₃, x₄, ...] is computed recursively by taking pairwise averages (approximation coefficients) and differences (detail coefficients):
The Haar transform has O(N) computational complexity and exact integer arithmetic is possible. However, it has a significant weakness: Haar wavelets have only one vanishing moment, meaning they can represent constant functions exactly but not any polynomial of higher degree. The blocky Haar approximation introduces visible artefacts in image compression — the "staircase" effect at boundaries.
5. Daubechies Wavelets and Vanishing Moments
Ingrid Daubechies solved a central problem in 1988: how to construct compactly supported (finite-length) orthonormal wavelets with N vanishing moments. A wavelet ψ has N vanishing moments if:
This means the wavelet is "blind" to polynomial signals of degree up to N−1 — the wavelet coefficients of a polynomial signal of degree < N are exactly zero. More vanishing moments means:
- Smoother regions of a signal produce very small wavelet coefficients (good compression).
- Only discontinuities and fine texture produce large coefficients.
- The scaling function and wavelet are smoother (better frequency selectivity).
The Daubechies dbN wavelet has N vanishing moments and minimum support width 2N−1. Key members of the family:
- db1 (= Haar): 1 vanishing moment, support [0,1]. Discontinuous.
- db2 (= D4): 2 vanishing moments, support [0,3]. Slightly smooth.
- db4: 4 vanishing moments, support [0,7]. Used in JPEG2000 lossless.
- db8: 8 vanishing moments, support [0,15]. Excellent for seismic data.
- db20: 20 vanishing moments. Near-ideal frequency selectivity but support of length 39.
Daubechies wavelets are constructed by finding a polynomial in z whose coefficients are the lowpass filter coefficients h[n] satisfying orthogonality and vanishing moment conditions simultaneously. The filter coefficients have no closed-form expression for N ≥ 2 and must be computed numerically.
6. Multiresolution Analysis and the Fast Wavelet Transform
The theoretical foundation of the DWT is multiresolution analysis (MRA), developed by Stéphane Mallat and Yves Meyer around 1989. MRA defines a sequence of nested subspaces of L²(ℝ):
At each level j, the signal is split into a coarse approximation (projection onto V_j) and fine detail (projection onto W_j). This decomposition is implemented by the filter bank interpretation of the wavelet transform:
This recursion is the Mallat algorithm (fast wavelet transform). Each level halves the number of samples, so the total computation is: N + N/2 + N/4 + ... = 2N operations — O(N), faster than the FFT's O(N log N).
7. The Discrete Wavelet Transform (DWT)
The Discrete Wavelet Transform samples the CWT at a dyadic grid (a = 2^j, b = k · 2^j) and applies the Mallat algorithm recursively to J levels of decomposition. For a signal of length N = 2^J, the output is:
The DWT is perfectly invertible: the original signal is recovered exactly (in exact arithmetic) by applying the inverse filter bank recursively from the coarsest level. The perfect reconstruction property requires that the analysis filters h[n], g[n] and their synthesis counterparts h̃[n], g̃[n] satisfy specific conditions. For orthonormal wavelets (like Daubechies), the analysis and synthesis filters are identical.
Biorthogonal Wavelets
Orthonormal wavelets must be either symmetric (like Haar) or have very long support (a mathematical theorem). For image compression, symmetric wavelets are preferred because they produce symmetric boundary effects. Biorthogonal wavelets relax the orthogonality requirement to allow both symmetry and compact support: the analysis and synthesis filter pairs are different but still satisfy perfect reconstruction. The CDF 9/7 wavelet (9-tap analysis filter, 7-tap synthesis filter) used in JPEG2000 lossy compression is a biorthogonal wavelet.
8. JPEG2000 and Image Compression
JPEG2000 (ISO/IEC 15444-1, 2000) replaced the DCT-based JPEG standard with a wavelet-based compression pipeline that achieves superior quality, especially at high compression ratios. The encoding pipeline is:
- Colour space conversion: RGB → YCbCr (luminance + chrominance).
- 2D DWT: Apply separable 2D wavelet transform (rows then columns) to J = 5 levels, producing 3J + 1 = 16 subbands (LL₅, LH₅, HL₅, HH₅, LH₄, HL₄, HH₄, ..., LH₁, HL₁, HH₁).
- Quantisation: Divide each subband coefficient by a step size Δ = 2^(R−ε) (lossy only). Large step sizes remove fine detail; small step sizes preserve it.
- EBCOT entropy coding: Embedded Block Coding with Optimised Truncation — a context-adaptive arithmetic coder that produces an embedded bitstream: any prefix of the bitstream is a valid compressed image.
Key advantages of JPEG2000 over JPEG:
- No blocking artefacts: Wavelet basis functions span the whole image; there are no visible grid lines.
- Progressive transmission: The embedded bitstream allows the image to improve continuously as more data arrives — the same bitstream serves thumbnails and full resolution.
- Lossless and lossy in one format: The same codec handles both, enabling workflows where lossless archiving and lossy delivery come from the same file.
- Region of Interest (ROI) coding: A region of the image can be compressed at much higher quality than the background — critical for medical imaging.
- Scalable resolution: Decoders can stop early to reconstruct the image at half or quarter resolution without re-encoding.
JPEG2000 is the mandatory format for digital cinema (DCI), the standard in DICOM medical imaging, and the basis for the JPEG XS and High Throughput JPEG2000 (HTJ2K) standards used in broadcast production.
Frequently Asked Questions
What is a wavelet transform?
A wavelet transform decomposes a signal into components at different scales (frequencies) and different times simultaneously. Unlike the Fourier transform, which uses infinite sinusoids and gives global frequency information with no time localisation, wavelets use short oscillating functions (wavelets) that are concentrated in time, providing both time and frequency information about the signal.
Why can't the Fourier transform tell us when a frequency occurs?
The Fourier transform represents a signal as a sum of infinite sinusoids that extend across all time. A single sinusoidal component contributes to every time point equally. So while the Fourier transform tells you which frequencies are present in the entire signal, it gives no information about when those frequencies occur. A chirp (frequency increasing over time) and a random mix of the same frequencies would have the same Fourier magnitude spectrum.
What is the Heisenberg uncertainty principle in signal processing?
In signal processing, the Heisenberg uncertainty principle states that a signal cannot be simultaneously perfectly localised in both time and frequency. Mathematically: Δt × Δω ≥ 1/2, where Δt is the time spread and Δω is the frequency spread (standard deviations). Making a signal shorter in time (better time resolution) necessarily spreads it wider in frequency (worse frequency resolution) and vice versa. Wavelets navigate this trade-off by using adaptive resolution: wide at low frequencies, narrow at high frequencies.
What is the difference between CWT and DWT?
The Continuous Wavelet Transform (CWT) computes the correlation of a signal with a wavelet at every possible scale and every possible time position, producing a highly redundant 2D time-scale map. The Discrete Wavelet Transform (DWT) samples this map at a dyadic grid (scales and positions that are powers of 2), producing a compact, non-redundant representation that is perfectly invertible and computationally efficient via the fast lifting algorithm.
What is the Haar wavelet?
The Haar wavelet is the simplest possible wavelet: a square wave that is +1 on [0, 0.5), −1 on [0.5, 1), and 0 elsewhere. The Haar transform computes the difference and average of adjacent pairs of samples at each scale, building a hierarchical decomposition. It was first described by Alfred Haar in 1909 and is notable for having perfect time localisation but poor frequency localisation (the spectrum decays slowly).
What are Daubechies wavelets?
Daubechies wavelets (db1 = Haar, db2, db4, ... dbN) are a family of orthogonal wavelets constructed by Ingrid Daubechies in 1988. They are characterised by N vanishing moments, meaning they are exactly zero for polynomial signals up to degree N-1. Higher N gives smoother wavelets and better frequency localisation but wider time support. db4 is widely used in image processing; db8–db20 appear in seismology and biomedical applications.
What is multiresolution analysis?
Multiresolution analysis (MRA), formalised by Stéphane Mallat and Yves Meyer in 1989, is a mathematical framework that underlies the DWT. It decomposes a function space L²(R) into a sequence of nested approximation spaces at different scales. Each level of the DWT produces a coarse approximation (scaling coefficients) and the detail that was removed (wavelet coefficients). The fast wavelet transform exploits this structure to compute the DWT in O(N) operations.
How are wavelets used in image compression?
In image compression, a 2D wavelet transform (applied separately along rows and columns) decomposes an image into subbands at multiple scales. Low-frequency subbands contain the coarse image content; high-frequency subbands contain edge and texture details. Since most image energy is in low-frequency subbands, the high-frequency coefficients can be heavily quantised or discarded with little perceptual impact. JPEG2000 uses the Daubechies 9/7 (lossy) and 5/3 (lossless) biorthogonal wavelets.
How is the wavelet transform used in medical imaging?
Wavelet transforms are used in medical imaging for denoising (separating signal from noise in MRI and CT scans), compression (JPEG2000 is the standard in DICOM medical imaging), and feature extraction (ECG beat detection, epileptic spike detection in EEG, microcalcification detection in mammography). The time-frequency localisation of wavelets makes them ideal for biomedical signals that are non-stationary.
What makes wavelets better than the short-time Fourier transform?
The Short-Time Fourier Transform (SFTF) uses a fixed-length window and therefore has the same time and frequency resolution at all scales. Wavelets use a variable window: wide at low frequencies (good frequency resolution, coarse time resolution) and narrow at high frequencies (good time resolution, coarse frequency resolution). This matches the way most natural signals behave — low-frequency events tend to be long and slowly changing, while high-frequency events are transient.