The central observation of statistical field theory is that the critical behaviour of many-body systems is governed not by microscopic details but by symmetry and dimensionality. A uniaxial magnet near its Curie point, a fluid near the liquid-gas critical point, and a binary alloy near its ordering transition all share the same set of critical exponents because they belong to the same universality class. The tool that makes this precise, the renormalisation group, is one of the most beautiful constructs in theoretical physics, and its ideas now reappear in unexpected form inside modern machine learning architectures.
1. From Statistical Mechanics to Field Theory
The Ising model is the canonical starting point. Originally proposed by Lenz (1920) and solved by Ising (1925) in 1D (no transition) and by Onsager (1944) in 2D (exact solution), the model distils the competition between ferromagnetic ordering and thermal disorder into the simplest possible Hamiltonian.
Ising Model, Landau-Ginzburg & Spontaneous Symmetry Breaking
Ising Hamiltonian:
H = −J Σ_{⟨ij⟩} σᵢσⱼ − h Σᵢ σᵢ (⟨ij⟩ = nearest-neighbour pairs)
σᵢ ∈ {−1, +1}; J > 0 (ferromagnetic); h = external field
Z = Σ_{σ} exp(−βH) with β = 1/(k_B T)
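As a concrete check of the definitions above, the sketch below evaluates Z by brute force for a small periodic 1D chain and compares it with the standard transfer-matrix result Z = λ₊ᴺ + λ₋ᴺ. The choice of a 1D ring, the function names, and the units (k_B = 1) are assumptions made for illustration; they are not taken from the text.

```python
import itertools
import math


def ising_ring_energy(spins, J=1.0, h=0.0):
    """H = -J * sum_<ij> s_i s_j - h * sum_i s_i on a periodic 1D chain."""
    n = len(spins)
    bonds = sum(spins[i] * spins[(i + 1) % n] for i in range(n))
    return -J * bonds - h * sum(spins)


def partition_function_brute(n, beta, J=1.0, h=0.0):
    """Sum exp(-beta * H) over all 2^n spin configurations."""
    return sum(
        math.exp(-beta * ising_ring_energy(s, J, h))
        for s in itertools.product((-1, 1), repeat=n)
    )


def partition_function_transfer(n, beta, J=1.0, h=0.0):
    """Z = lambda_plus^n + lambda_minus^n from the 2x2 transfer matrix."""
    K, B = beta * J, beta * h
    root = math.sqrt(math.exp(2 * K) * math.sinh(B) ** 2 + math.exp(-2 * K))
    lam_plus = math.exp(K) * math.cosh(B) + root
    lam_minus = math.exp(K) * math.cosh(B) - root
    return lam_plus ** n + lam_minus ** n


if __name__ == "__main__":
    for n in (4, 6, 8):
        print(n, partition_function_brute(n, 0.7), partition_function_transfer(n, 0.7))
```

The brute-force sum costs 2ᴺ terms, which is why approximations such as mean-field theory or Monte Carlo sampling are needed for larger lattices.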
Mean-field approximation:
Replace σⱼ → ⟨m⟩ (mean magnetisation), each site has z neighbours:
H_eff = −(zJ⟨m⟩ + h) Σᵢ σᵢ
Self-consistency: ⟨m⟩ = tanh[β(zJ⟨m⟩ + h)]
Critical temperature: k_B T_c = zJ
Near T_c: ⟨m⟩ ≈ ±√(3(T_c−T)/T_c) (h = 0, T < T_c)
Below T_c: Z₂ symmetry (σ → −σ) spontaneously broken
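The self-consistency condition can be solved numerically by simple fixed-point iteration. The sketch below assumes units with k_B = 1 and a square-lattice coordination number z = 4 (so T_c = zJ = 4); the function names and tolerances are illustrative choices.

```python
import math


def mean_field_magnetisation(T, z=4, J=1.0, h=0.0, m0=1.0, tol=1e-12, max_iter=200000):
    """Iterate m -> tanh[(z*J*m + h) / T] until it stops changing (k_B = 1)."""
    m = m0
    for _ in range(max_iter):
        m_new = math.tanh((z * J * m + h) / T)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m


def near_tc_estimate(T, z=4, J=1.0):
    """Leading-order result m = sqrt(3 (T_c - T) / T_c), valid just below T_c."""
    Tc = z * J
    return math.sqrt(3.0 * (Tc - T) / Tc) if T < Tc else 0.0


if __name__ == "__main__":
    for T in (1.0, 3.0, 3.9, 3.96, 4.5):
        print(T, mean_field_magnetisation(T), near_tc_estimate(T))
```

Starting from m0 = 1 selects the positive branch; starting from m0 = −1 gives the symmetry-related solution. Convergence slows as T approaches T_c, a simple instance of critical slowing down.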
Landau-Ginzburg (LG) free energy functional:
F[φ] = ∫d^d x [½(∇φ)² + a(T)φ² + b φ⁴ + ...]
φ(x): continuous order parameter field (continuum limit of ⟨σᵢ⟩)
a(T) = a₀(T − T_c): changes sign at T_c
Saddle-point (mean-field): δF/δφ = 0 → −∇²φ + 2aφ + 4bφ³ = 0
Uniform solution: φ = 0 (T > T_c); φ = ±√(−a/2b) (T < T_c)
Double-well potential at T < T_c → two degenerate minima (the "Mexican hat" is its analogue for a continuous symmetry)
Beyond mean-field — Gaussian fluctuations:
F[φ] = ∫(d^d k)/(2π)^d [½k²|φ_k|² + a|φ_k|²] (Fourier, quadratic part)
Propagator: G(k) = ⟨φ_k φ_{-k}⟩ = 1/(k² + 2a)
Correlation length: ξ = (2a)^{−½} ∝ |T − T_c|^{−½} (mean-field: ν = ½)
2. Path Integrals in Field Theory
Feynman's path integral formulation replaces the sum over discrete microstates with an integral over all configurations of a field. For a scalar field φ(x) in d-dimensional Euclidean space, the partition function becomes Z = ∫Dφ exp(−S[φ]), where S is the Euclidean action. This formalism unifies quantum field theory and statistical mechanics through an analytic continuation: imaginary time τ = it maps quantum amplitudes to statistical Boltzmann weights.
Euclidean Path Integral & Quantum-Statistical Correspondence
Euclidean partition function:
Z = ∫Dφ exp(−S_E[φ]/ħ)
S_E[φ] = ∫d^d x [½(∂_μ φ)² + V(φ)] (positive definite → well-posed)
μ = 1..d; sum over Euclidean indices (no minus signs from Minkowski metric)
Quantum mechanics in imaginary time:
Minkowski path integral: ⟨x_f|e^{−iHt/ħ}|x_i⟩ = ∫Dx exp(iS_M/ħ)
Wick rotation t → −iτ:
e^{−iHt/ħ} → e^{−Hτ/ħ} (Boltzmann weight!)
Thermal partition function: Z = Tr(e^{−βH}) = ∫Dx exp(−S_E[x]/ħ) over paths periodic in τ, x(0) = x(ħβ)
Correspondence: inverse temperature β ↔ Euclidean time extent ħβ
Gaussian path integral (free field, V = ½m²φ²):
Z_0 = ∫Dφ exp(−½∫φ(−∇² + m²)φ)
= (det(−∇² + m²))^{−½}
= exp(−½ Tr ln(−∇² + m²))
Evaluated exactly: Z_0 = exp(−½ Σ_k ln(k² + m²))
Free energy: F_0 = ½ Σ_k ln(k² + m²) = ½ (L/2π)^d ∫d^d k ln(k² + m²)
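The identity ½ ln det(−∇² + m²) = ½ Σ_k ln(k² + m²) can be checked on a finite lattice. The sketch below assumes a periodic 1D lattice with unit spacing, where the lattice momenta are k̂² = 4 sin²(πj/N); this discretisation and the helper names are illustrative assumptions, not part of the text.

```python
import math


def lattice_operator(n, m2):
    """Matrix of (-laplacian + m^2) on a periodic 1D lattice, unit spacing."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] += 2.0 + m2
        A[i][(i + 1) % n] -= 1.0
        A[i][(i - 1) % n] -= 1.0
    return A


def log_det_spd(A):
    """ln det of a symmetric positive-definite matrix by Gaussian elimination.

    No pivoting is needed because all pivots of an SPD matrix are positive.
    """
    M = [row[:] for row in A]
    n = len(M)
    total = 0.0
    for c in range(n):
        pivot = M[c][c]
        total += math.log(pivot)
        for r in range(c + 1, n):
            f = M[r][c] / pivot
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return total


def free_energy_modes(n, m2):
    """F_0 = (1/2) * sum_k ln(khat^2 + m^2) with khat^2 = 4 sin^2(pi j / n)."""
    return 0.5 * sum(
        math.log(4.0 * math.sin(math.pi * j / n) ** 2 + m2) for j in range(n)
    )


if __name__ == "__main__":
    n, m2 = 8, 0.5
    print(0.5 * log_det_spd(lattice_operator(n, m2)), free_energy_modes(n, m2))
```

Both routes give the same number because the plane waves diagonalise the lattice Laplacian, which is exactly the step from the determinant to the mode sum.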
Perturbation theory (φ⁴ theory):
S[φ] = ∫d^d x [½(∂φ)² + ½m²φ² + (λ/4!)φ⁴]
Expand exp(−S_int[φ]) in powers of λ, with S_int = (λ/4!)∫d^d x φ⁴ → Feynman diagrams
4D φ⁴: renormalisable; counterterms δm², δZ, δλ absorb UV divergences
Upper critical dimension d_c = 4 (above d_c: mean-field exponents are exact)
3. The Renormalisation Group
Wilson’s renormalisation group (RG) is a framework for understanding how physical descriptions change with the scale at which we observe a system. The key insight is that near a critical point, the physics becomes scale-invariant — the correlation length diverges, and the system looks the same on all length scales. Fixed points of the RG flow correspond to such scale-invariant theories.
Wilson RG, Fixed Points & Scaling Operators
Kadanoff block-spin construction (d=2 Ising):
Divide lattice into blocks of b × b sites (linear size b); replace each block by a single effective spin, e.g. by majority rule
Effective Hamiltonian H'(σ') on the coarse lattice with spacing b·a → repeat → RG flow in coupling space
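One common way to define the block spin is the majority rule, sketched below for a square lattice of ±1 spins. The majority rule itself, the block size b = 3, and the function name are assumptions for illustration; the sketch only performs the coarse-graining step and does not compute the effective Hamiltonian H'.

```python
import random


def block_spin_majority(lattice, b=3):
    """Coarse-grain an L x L array of +/-1 spins into (L/b) x (L/b) block spins.

    Each b x b block is replaced by the sign of its total spin. b must be odd
    so that ties cannot occur.
    """
    size = len(lattice)
    if b % 2 == 0 or size % b != 0:
        raise ValueError("need odd b that divides the lattice size")
    coarse = []
    for bi in range(0, size, b):
        row = []
        for bj in range(0, size, b):
            total = sum(
                lattice[i][j]
                for i in range(bi, bi + b)
                for j in range(bj, bj + b)
            )
            row.append(1 if total > 0 else -1)
        coarse.append(row)
    return coarse


if __name__ == "__main__":
    random.seed(0)
    # Uncorrelated random spins (infinite temperature) on a 27 x 27 lattice.
    spins = [[random.choice((-1, 1)) for _ in range(27)] for _ in range(27)]
    once = block_spin_majority(spins)   # 9 x 9
    twice = block_spin_majority(once)   # 3 x 3
    print(len(once), len(twice))
```

Applying the map repeatedly to configurations sampled at different temperatures shows the flow: ordered configurations stay ordered, disordered ones stay disordered, and only near T_c do the coarse-grained pictures look statistically similar to the original.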
Momentum-shell RG (Wilson 1971):
Mode decomposition: φ(k) = φ_< (|k|<Λ/b) + φ_> (Λ/b<|k|<Λ)
Step 1 — integrate out fast modes φ_>: ∫Dφ_> e^{−S[φ_<,φ_>]}
Step 2 — rescale: k → bk, φ_< → b^{d/2−1+η/2} φ (field rescaling)
Result: new effective action S'[φ'_<] with shifted couplings
RG flow equations (φ⁴ theory near d=4):
da/dl = 2a + c₁ λ (l = ln b)
dλ/dl = (4−d)λ − c₂ λ²
Wilson-Fisher fixed point: λ* = (4−d)/c₂ + O(ε²) (ε = 4−d)
d=3: ε=1 is not small, so the ε-expansion is only asymptotic; the Wilson-Fisher point is the non-trivial fixed point (λ* > 0), distinct from the Gaussian one (λ = 0)
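The flow of the quartic coupling can be integrated numerically to see the fixed point emerge. The sketch below uses forward Euler steps on dλ/dl = ελ − c₂λ²; the value c₂ = 1 is a placeholder (the true coefficient depends on normalisation conventions), and the function names are illustrative.

```python
def beta_lambda(lam, eps, c2=1.0):
    """Right-hand side of d(lambda)/dl = eps * lambda - c2 * lambda^2."""
    return eps * lam - c2 * lam ** 2


def flow_lambda(lam0, eps=1.0, c2=1.0, l_max=30.0, dl=0.01):
    """Integrate the flow from l = 0 to l_max with forward Euler steps."""
    lam = lam0
    for _ in range(int(round(l_max / dl))):
        lam += dl * beta_lambda(lam, eps, c2)
    return lam


def stability_exponent(eps, c2=1.0):
    """Derivative of the flow at lambda* = eps / c2; negative means attractive."""
    lam_star = eps / c2
    return eps - 2.0 * c2 * lam_star


if __name__ == "__main__":
    for lam0 in (0.05, 0.5, 3.0):
        print("d = 3:", lam0, "->", flow_lambda(lam0, eps=1.0))
    print("d = 5:", 0.5, "->", flow_lambda(0.5, eps=-1.0))
    print("slope at lambda*:", stability_exponent(1.0))
```

For ε > 0 every positive starting value flows to λ* = ε/c₂, where the linearised flow has exponent −ε, so λ is an irrelevant direction at the Wilson-Fisher point. For ε < 0 (d > 4) the coupling flows back to the Gaussian fixed point.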
Fixed points and scaling:
Gaussian: a*=0, λ*=0 (free field, mean-field exponents exact for d>4)
Wilson-Fisher: controls 3D Ising critical point
At fixed point H*: scaling operators O_i with dimensions Δᵢ
Relevant: Δᵢ < d → grows under RG → leaves fixed point
Irrelevant: Δᵢ > d → flows to zero → "wash out" microscopic details → Universality
Marginal: Δᵢ = d → flow determined by higher-order terms
Operators at 3D Ising Wilson-Fisher fixed point:
φ² (temperature deformation, ε): relevant, Δ = d − 1/ν ≈ 3 − 1.587 ≈ 1.413 (RG eigenvalue y_t = 1/ν ≈ 1.587)
φ (field deformation, σ): relevant, Δ = β/ν = (d−2+η)/2 ≈ 0.518 (RG eigenvalue y_h = d − Δ ≈ 2.482)
All other Z₂-even scalar operators have Δ > d = 3: irrelevant → universality
4. Critical Exponents and Universality Classes
Universality is the empirical observation that systems with different microscopic structure exhibit identical power-law behaviour as they approach a critical point. The set of critical exponents (α, β, γ, δ, ν, η) characterises a universality class, which depends only on the dimensionality of space and the symmetry of the order parameter.
Critical Exponents, Scaling Relations & Universality Classes
Definitions (t = (T−T_c)/T_c, h = external field):
ξ ~ |t|^{−ν} (correlation length)
C_h ~ |t|^{−α} (specific heat)
⟨m⟩ ~ |t|^β (t < 0) (order parameter)
χ = ∂m/∂h ~ |t|^{−γ} (susceptibility)
⟨m(h)⟩ ~ h^{1/δ} (t = 0) (equation of state)
G(r) ~ r^{−(d−2+η)} exp(−r/ξ) (correlation function)
Scaling relations (follow from single diverging length scale ξ):
α + 2β + γ = 2 (Rushbrooke, follows from scaling)
γ = ν(2−η) (Fisher; ν and η are independent)
dν = 2−α (hyperscaling; holds for d ≤ d_c)
δ = (d+2−η)/(d−2+η) (Widom + Fisher)
Only 2 independent exponents needed to determine all 6
Exponent values:
Universality class    d     ν       β       γ       α        η       Note
Mean-field            any   0.50    0.50    1.00    0        0       [valid for d > 4]
2D Ising              2     1.00    0.125   1.75    0        0.25    [exact]
3D Ising              3     0.629   0.326   1.237   0.110    0.036   [conformal bootstrap]
3D Heisenberg         3     0.711   0.366   1.397   −0.133   0.035
3D XY (λ-point)       3     0.671   0.348   1.316   −0.014   0.038
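The claim that two exponents fix the rest can be tested directly against the tabulated values. The sketch below takes ν and η as inputs and derives γ, α, β, and δ from the Fisher, hyperscaling, Rushbrooke, and Widom-Fisher relations; the function name and dictionary layout are illustrative.

```python
def exponents_from_nu_eta(nu, eta, d):
    """Derive the remaining exponents from nu and eta via the scaling relations."""
    gamma = nu * (2.0 - eta)                 # Fisher: gamma = nu (2 - eta)
    alpha = 2.0 - d * nu                     # hyperscaling: d nu = 2 - alpha
    beta = (2.0 - alpha - gamma) / 2.0       # Rushbrooke: alpha + 2 beta + gamma = 2
    delta = (d + 2.0 - eta) / (d - 2.0 + eta)
    return {"gamma": gamma, "alpha": alpha, "beta": beta, "delta": delta}


if __name__ == "__main__":
    print("2D Ising:", exponents_from_nu_eta(nu=1.0, eta=0.25, d=2))
    print("3D Ising:", exponents_from_nu_eta(nu=0.629, eta=0.036, d=3))
```

For the 2D Ising inputs the relations reproduce the exact values γ = 7/4, α = 0, β = 1/8 and give δ = 15. For the 3D Ising inputs the derived values agree with the table to within the rounding of ν and η.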
Examples of equivalent critical systems:
3D Ising class: uniaxial magnets, liquid-gas critical point, binary alloys
3D XY class: superfluid He⁴ (λ-point at 2.17 K), superconductors
Percolation: own class; ν=0.876, β=0.417 (3D)
2D exact solution (Onsager 1944):
k_B T_c = 2J/ln(1+√2) ≈ 2.269 J
C ∝ −ln|t| (logarithmic divergence → α = 0 exactly)
⟨m⟩ = (1 − sinh^{−4}(2βJ))^{1/8} → β = 1/8
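The exact formulas are easy to evaluate, and the exponent β = 1/8 can be read off numerically from the slope of ln⟨m⟩ against ln|t|. The sketch below assumes k_B = 1; the function names and the two reduced temperatures used for the slope are illustrative choices.

```python
import math


def onsager_tc(J=1.0):
    """Critical temperature of the square-lattice Ising model (k_B = 1)."""
    return 2.0 * J / math.log(1.0 + math.sqrt(2.0))


def onsager_magnetisation(T, J=1.0):
    """Spontaneous magnetisation (1 - sinh^-4(2J/T))^(1/8); zero for T >= T_c."""
    if T >= onsager_tc(J):
        return 0.0
    return (1.0 - math.sinh(2.0 * J / T) ** -4) ** 0.125


def effective_beta(t1=1e-4, t2=1e-5, J=1.0):
    """Slope of ln m versus ln|t| between two reduced temperatures below T_c."""
    Tc = onsager_tc(J)
    m1 = onsager_magnetisation(Tc * (1.0 - t1), J)
    m2 = onsager_magnetisation(Tc * (1.0 - t2), J)
    return math.log(m1 / m2) / math.log(t1 / t2)


if __name__ == "__main__":
    print("T_c / J =", onsager_tc())
    print("m(T = 1) =", onsager_magnetisation(1.0))
    print("effective beta =", effective_beta())
```

The magnetisation vanishes at T_c because sinh(2J/T_c) = sinh(ln(1+√2)) = 1.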
5. Conformal Field Theory in Two Dimensions
At a critical point, rotational and translational symmetry enhance to the full conformal group, which includes local angle-preserving transformations. In two dimensions the conformal group is infinite-dimensional, making 2D CFT a nearly exactly solvable theory. This power was exploited by Belavin, Polyakov, and Zamolodchikov (BPZ) in 1984 to classify 2D critical models by their central charge c and operator content.
Conformal Group, Virasoro Algebra & Minimal Models
Conformal transformations (in R^d):
Preserve angles: g_μν(x) → Ω(x) g_μν(x)
d ≥ 3: finite-dimensional group SO(d+1,1) with (d+2)(d+1)/2 generators
d = 2: local conformal maps = analytic functions z → f(z) on ℂ
Infinite-dimensional; generators L_n, n ∈ ℤ
Virasoro algebra (2D CFT):
[L_m, L_n] = (m−n)L_{m+n} + (c/12) m(m²−1) δ_{m+n,0}
c = central charge: characterises the CFT
L_{−1}, L_0, L_1 (with L̄_{−1}, L̄_0, L̄_1): generate the global conformal group SL(2,ℂ)
L_0 + L̄_0: dilatation generator; eigenvalue of L_0 = h = conformal weight of operator
Full Virasoro highest-weight state |h⟩: L_0|h⟩ = h|h⟩, L_n|h⟩ = 0 (n > 0)
Primary operators and OPE:
T(z)O(w,w̄) = (h/(z−w)²)O(w,w̄) + (1/(z−w))∂O(w,w̄) + regular
Operator product expansion (OPE):
Oᵢ(z,z̄)Oⱼ(0) = Σ_k C_{ij}^k |z|^{Δ_k−Δᵢ−Δⱼ} O_k(0) (spinless operators, Δ = h + h̄ = 2h)
C_{ij}^k = OPE coefficients (determine all n-point functions)
Minimal models M(p,q), with p > q > 1 and gcd(p,q) = 1; unitary only for p = q + 1:
c = 1 − 6(p−q)²/(pq)
Primary operator dimensions:
h_{r,s} = [(pr−qs)² − (p−q)²] / (4pq) (1 ≤ r ≤ q−1, 1 ≤ s ≤ p−1)
Physically realised minimal models:
M(4,3): c=1/2, Ising model (h=0, 1/16, 1/2 for I, σ, ε operators)
M(5,4): c=7/10, tricritical Ising model
M(6,5): c=4/5, tetracritical Ising; also the 3-state Potts model (non-diagonal modular invariant)
M(7,6): c=6/7, tricritical 3-state Potts model (non-diagonal modular invariant)
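The central charge and Kac table follow from the two formulas above by exact rational arithmetic. The sketch below uses the p > q labelling of the h_{r,s} formula, in which the Ising model is (p, q) = (4, 3); the function names are illustrative.

```python
from fractions import Fraction


def central_charge(p, q):
    """c = 1 - 6 (p - q)^2 / (p q)."""
    return 1 - Fraction(6 * (p - q) ** 2, p * q)


def kac_weights(p, q):
    """h_{r,s} = [(p r - q s)^2 - (p - q)^2] / (4 p q), 1 <= r < q, 1 <= s < p."""
    return {
        (r, s): Fraction((p * r - q * s) ** 2 - (p - q) ** 2, 4 * p * q)
        for r in range(1, q)
        for s in range(1, p)
    }


def distinct_weights(p, q):
    """Sorted list of distinct conformal weights in the Kac table."""
    return sorted(set(kac_weights(p, q).values()))


if __name__ == "__main__":
    for p, q in ((4, 3), (5, 4), (6, 5)):
        print((p, q), "c =", central_charge(p, q), "h =", distinct_weights(p, q))
```

Each weight appears twice in the table because h_{r,s} = h_{q−r,p−s}, so the Ising table with six entries contains only the three distinct weights 0, 1/16, and 1/2.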
State-operator correspondence:
Every operator O in CFT corresponds to a state |O⟩ in the Hilbert space on S^1
Partition function on torus: Z = Tr q^{L_0−c/24} q̄^{L̄_0−c/24} (q = e^{2πiτ})
Modular invariance → constraints on spectrum (Verlinde formula for fusion)
6. Connections to Machine Learning
Over the past few decades, researchers have found that several deep-learning models are, at their mathematical core, statistical mechanics systems. The Boltzmann machine is an Ising model with learnable couplings at temperature T. Diffusion generative models discretise a stochastic, Langevin-type noising process and learn to run it in reverse; in the continuous-time limit, score-based models solve Anderson’s time-reversed SDE. These correspondences are not merely aesthetic: they yield practical algorithms and physics-derived tools such as contrastive divergence, annealing, and score matching.
Boltzmann Machines, Diffusion Models & Energy-Based Learning
Boltzmann machine:
Energy: E(v, h) = −Σ_{ij} W_{ij} vᵢhⱼ − Σᵢ bᵢvᵢ − Σⱼ cⱼhⱼ
Partition function: Z = Σ_{v,h} exp(−E/T) → an Ising model with hidden units and learnable couplings W (the general machine also allows v-v and h-h couplings)
Learning goal: maximise log P(v) = log Σ_h exp(−E/T) − log Z
∇_θ log P(v) = −⟨∂E/∂θ⟩_{data} + ⟨∂E/∂θ⟩_{model}
Contrastive divergence (Hinton 2002): approximate model expectation via k Gibbs steps
Restricted Boltzmann Machine (RBM):
No v-v or h-h connections → tractable:
P(v) = Σ_h e^{−E(v,h)}/Z = e^{Σᵢ bᵢvᵢ} ∏ⱼ 2cosh(cⱼ + Σᵢ W_{ij}vᵢ) / Z (for hⱼ ∈ {−1,+1}, T = 1)
Hidden units h marginalised analytically
Connection to PCA: in linear limit W^T W → singular value decomposition
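The analytic marginalisation over hidden units can be verified by brute force on a tiny machine. The sketch below assumes spins in {−1, +1} and T = 1, so that each hidden unit contributes a factor 2cosh; the weights in the demo and all function names are made up for illustration.

```python
import itertools
import math


def rbm_energy(v, h, W, b, c):
    """E(v, h) = -sum_ij W_ij v_i h_j - sum_i b_i v_i - sum_j c_j h_j."""
    coupling = sum(W[i][j] * v[i] * h[j] for i in range(len(v)) for j in range(len(h)))
    return -coupling - sum(bi * vi for bi, vi in zip(b, v)) - sum(cj * hj for cj, hj in zip(c, h))


def marginal_brute(v, W, b, c):
    """Unnormalised P(v): explicit sum of exp(-E) over all hidden states."""
    return sum(
        math.exp(-rbm_energy(v, h, W, b, c))
        for h in itertools.product((-1, 1), repeat=len(c))
    )


def marginal_analytic(v, W, b, c):
    """Unnormalised P(v) = exp(b.v) * prod_j 2 cosh(c_j + sum_i W_ij v_i)."""
    result = math.exp(sum(bi * vi for bi, vi in zip(b, v)))
    for j in range(len(c)):
        field = c[j] + sum(W[i][j] * v[i] for i in range(len(v)))
        result *= 2.0 * math.cosh(field)
    return result


def visible_distribution(W, b, c):
    """Normalised P(v) over all visible states, using the analytic marginal."""
    weights = {
        v: marginal_analytic(v, W, b, c)
        for v in itertools.product((-1, 1), repeat=len(b))
    }
    Z = sum(weights.values())
    return {v: w / Z for v, w in weights.items()}


if __name__ == "__main__":
    W = [[0.5, -0.3], [0.1, 0.8], [-0.7, 0.2]]  # 3 visible x 2 hidden, arbitrary values
    b = [0.1, -0.2, 0.3]
    c = [0.05, -0.4]
    for v, p in visible_distribution(W, b, c).items():
        print(v, round(p, 4))
```

The analytic form costs one cosh per hidden unit instead of a sum over 2ᴹ hidden states. The normalisation Z still requires a sum over visible states, which is what contrastive divergence avoids in practice.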
Score-based diffusion models (Song & Ermon 2019):
Forward process: q(xₜ|x_{t-1}) = N(xₜ; √(1−β_t)x_{t-1}, β_t I) [noise addition]
Reverse process: p_θ(x_{t-1}|xₜ) = N(x_{t-1}; μ_θ(xₜ,t), Σₜ) [neural denoiser]
Score network: s_θ(xₜ, t) ≈ ∇_{xₜ} log q(xₜ) [Stein score]
Physics: forward = Ornstein-Uhlenbeck; reverse = time-reversed SDE (Anderson 1982)
Connection to Langevin MCMC: dx = −∇V(x)dt + √(2/β)dW (target ∝ e^{−βV})
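Because each forward step is Gaussian, the distribution of xₜ given x₀ stays Gaussian, with mean √ᾱₜ x₀ and variance 1 − ᾱₜ, where ᾱₜ = ∏ₛ(1 − βₛ). The sketch below checks this closed form against the step-by-step recursion for a scalar x. The linear noise schedule, its endpoints, and the function names are assumptions chosen for illustration.

```python
import math


def linear_beta_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_t rising linearly over T steps (assumed schedule)."""
    return [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]


def alpha_bar(betas):
    """Cumulative products alpha_bar_t = prod_{s <= t} (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def propagate_moments(x0, betas):
    """Track mean and variance of x_t through the Gaussian forward steps."""
    mean, var = x0, 0.0
    for beta in betas:
        mean = math.sqrt(1.0 - beta) * mean   # mean shrinks by sqrt(1 - beta_t)
        var = (1.0 - beta) * var + beta       # old noise shrinks, new noise is added
    return mean, var


if __name__ == "__main__":
    betas = linear_beta_schedule()
    a_bar = alpha_bar(betas)[-1]
    mean, var = propagate_moments(2.0, betas)
    print("closed form:", math.sqrt(a_bar) * 2.0, 1.0 - a_bar)
    print("recursion:  ", mean, var)
```

The mean decays towards zero while the variance relaxes to one, which is the discrete-time behaviour of an Ornstein-Uhlenbeck process approaching its stationary unit Gaussian.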
Neural tangent kernel (NTK) and Gaussian processes:
Wide neural network (width → ∞): kernel K_{NTK}(x,x') = ∇θ f(x)·∇θ f(x')
In this limit: at initialisation the network output is a Gaussian process (NNGP kernel); gradient-descent training reduces to kernel regression with K_NTK
Connects to stat-mech: mean-field theory of infinite-width networks
RG perspective: network depth = RG flow; hidden layer activations = block spins
(Mehta & Schwab 2014: deep learning ↔ variational renormalisation group)
Replica method and generalisation:
Spin glass theory (Parisi RSB): replica partition function
⟨Z^n⟩ = ⟨Σ_{σ^1..σ^n} exp(−βH(σ^1)−...−βH(σ^n))⟩ (disorder average); then ⟨log Z⟩ = lim_{n→0} ∂⟨Z^n⟩/∂n = lim_{n→0} (⟨Z^n⟩ − 1)/n
Applied to perceptron learning: Gardner-Derrida theory 1988
Modern revival: analysis of SAT-UNSAT transition in neural network loss landscape,
random matrix theory for singular value spectra of weight matrices
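The replica identity ⟨log Z⟩ = lim_{n→0} (⟨Z^n⟩ − 1)/n can be checked numerically on a toy disordered system. The sketch below assumes two Ising spins coupled by a random bond J drawn from a small discrete set, so that Z(J) = 4cosh(βJ); the coupling values and function names are made up for illustration. A small but finite n stands in for the limit, whereas the real method continues analytically from integer n.

```python
import math


def partition_function(J, beta=1.0):
    """Two Ising spins with coupling J: Z = sum over s1, s2 of exp(beta J s1 s2)."""
    return sum(math.exp(beta * J * s1 * s2) for s1 in (-1, 1) for s2 in (-1, 1))


def quenched_direct(couplings, beta=1.0):
    """Disorder average of log Z, computed directly."""
    return sum(math.log(partition_function(J, beta)) for J in couplings) / len(couplings)


def quenched_replica(couplings, n=1e-6, beta=1.0):
    """(<Z^n> - 1) / n at small n, approximating the n -> 0 limit."""
    mean_zn = sum(partition_function(J, beta) ** n for J in couplings) / len(couplings)
    return (mean_zn - 1.0) / n


def annealed(couplings, beta=1.0):
    """log of the disorder-averaged Z; an upper bound on <log Z> by Jensen."""
    return math.log(sum(partition_function(J, beta) for J in couplings) / len(couplings))


if __name__ == "__main__":
    couplings = [-1.0, 0.5, 2.0]  # equally likely bond values (illustrative)
    print("direct  :", quenched_direct(couplings))
    print("replica :", quenched_replica(couplings))
    print("annealed:", annealed(couplings))
```

The gap between the annealed and quenched averages is the reason the replica construction is needed: averaging Z itself is easy but gives the wrong free energy for frozen disorder.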
Thirty posts in: from Newtonian mechanics and orbital dynamics in Learning #1, through quantum mechanics, electrodynamics, special and general relativity, chaos theory, statistical mechanics, and now statistical field theory — the Learning series has built a continuous thread from classical physics to the frontier where theoretical physics and deep learning research overlap.