ReSTIR: Reservoir-based Spatiotemporal Importance Resampling

1. The Many-Light Problem

The rendering equation requires integrating incoming radiance over all directions at each surface point:

L_o(x, ω_o) = L_e(x, ω_o) + \int_Ω f_r(x, ω_i, ω_o) \cdot L_i(x, ω_i) \cdot |cos θᵢ| dω_i

In scenes with thousands of dynamic emissive meshes (area lights, emissive particles, car headlights, windows in a city night scene), naively sampling one light uniformly from the entire set produces variance proportional to N_lights. For N = 10,000 lights and 1 sample per pixel, almost every sample will land on a light that contributes negligible illumination to the current pixel — producing extremely noisy results.

The ideal would be to sample lights in proportion to their contribution p̂(x) = f_r · L_e · |cos θ| / r² — but computing this for every light × every pixel is O(N_pixels × N_lights) per frame, far beyond real-time budget.

Prior Art: Light Trees & Virtual Point Lights

Light trees (Lightcuts, 2005): hierarchical clustering of lights; traverse tree per pixel picking O(log N) representatives. Works for static scenes but has O(log N) per-pixel cost.
Virtual point lights (VPL/Instant Radiosity): RSM-generated secondary lights. Does not handle glossy surfaces; clamping introduces bias.
Reservoirs (ReSTIR, 2020): O(1) per-pixel cost with effective sample count growing quadratically via neighbour reuse.

2. Weighted Reservoir Sampling

Weighted Reservoir Sampling (WRS) (Vitter, 1985) produces a single sample from a stream of candidates where each item xᵢ has weight wᵢ — without needing to store all items:

Algorithm WRS (stream x₁, w₁), (x₂, w₂), ..., (xₙ, wₙ): reservoir.y = null reservoir.w_sum = 0 reservoir.M = 0 // number of candidates seen for each (xᵢ, wᵢ): reservoir.w_sum += wᵢ reservoir.M += 1 // Replace current sample with probability wᵢ / w_sum: if rand() < wᵢ / reservoir.w_sum: reservoir.y = xᵢ // Result: reservoir.y is distributed proportional to weights wᵢ // Unbiased estimator of target integral \propto f̂(y) = f(y)/p(y) \cdot w_sum/M

The key property: after processing n candidates, the reservoir holds exactly 1 sample, and P(y = xᵢ) = wᵢ / Σwⱼ. The reservoir can be updated incrementally — new candidates can be merged in O(1). This is what makes ReSTIR GPU-friendly: each pixel maintains one reservoir in a texture.

Merging Two Reservoirs

Two reservoirs R₁ and R₂ (each having processed M₁ and M₂ candidates) can be merged into a single unbiased reservoir:

Merge(R₁, R₂) \to R_out: R_out = new reservoir Update(R_out, R₁.y, R₁.w_sum) // treat R₁'s sample as one candidate Update(R_out, R₂.y, R₂.w_sum) // with weight equal to its accumulated w_sum R_out.M = R₁.M + R₂.M // R_out.y is a valid sample distributed \propto w_sum of combined stream

This merge operation is the foundation of spatial reuse: neighbouring pixels' reservoirs are merged into the current pixel's reservoir, effectively augmenting its candidate set without re-evaluating any BRDFs.

3. Resampled Importance Sampling (RIS)

Resampled Importance Sampling (RIS) connects WRS to Monte Carlo rendering. We want to estimate:

I = \int f(x) dx = \int p̂(x) / p(x) \cdot p(x) dx // where p̂(x) \propto f_r(x) \cdot L_e(x) \cdot G(x) (the ideal target PDF) // p(x) = uniform light selection (the source PDF)

RIS generates M candidates {xᵢ} from source pdf p(x), assigns weight wᵢ = p̂(xᵢ)/p(xᵢ), runs WRS to select one sample z, then estimates I as:

I \approx (1/M) \cdot (f(z) / p̂(z)) \cdot Σᵢ wᵢ = f(z)/p̂(z) \cdot W_sum/M // This is an unbiased estimator of I when: // 1. p̂(x) > 0 wherever f(x) > 0 // 2. z is selected with probability proportional to wᵢ (i.e., via WRS)

Critically, the target PDF p̂ can approximate the ideal importance proportional to f but need not be exactly p ∝ f. Even a rough approximation (e.g. p̂ = L_e/r² ignoring BRDF) dramatically reduces variance compared to uniform sampling. The BRDF is evaluated only once for the selected candidate z, not for all M candidates — keeping the per-pixel cost O(M_initial + 1 BRDF eval).

4. Temporal Reuse

In a real-time renderer, frames are correlated: the scene changes little between frame n and frame n−1. ReSTIR exploits this by storing one reservoir per pixel and reusing it across frames:

Temporal reuse algorithm (per pixel q): 1. Generate M_new candidates for pixel q, build reservoir R_new 2. Find corresponding pixel q' in previous frame (via reprojection: project q's world position using prev frame's view matrix) 3. Retrieve R_prev from the temporal reservoir buffer at q' 4. Merge: R_temporal = Merge(R_new, min(R_prev, 20\cdotM_new) candidates) // Cap R_prev.M at 20\timesM_new to prevent temporal lag when scene changes 5. Store R_temporal back into the temporal buffer 6. Evaluate the selected sample from R_temporal for this frame's pixel colour

With M_new = 32 initial candidates and temporal accumulation capped at M = 32 × 20 = 640 effective candidates, each pixel has the sampling benefit of 640 candidates at the cost of generating only 32 — a 20× variance reduction for free.

Reprojection & Disocclusion

Reprojection fails when: (1) the geometry under a pixel changes (disocclusion), (2) the surface normal or material changes (geometry instability), or (3) a bright light suddenly appears (temporal lag). All three are detected by:

Depth test: |depth_prev − depth_curr| / depth_curr > threshold → reset history
Normal test: cos(n_prev · n_curr) < 0.9 → reset history
Motion vectors from the G-buffer for accurate subpixel reprojection

5. Spatial Reuse & Bias

After temporal reuse, each pixel's reservoir is further merged with k randomly chosen neighbours (k = 5 is typical):

Spatial reuse (per pixel q): R_q = temporal reservoir of q for j = 1..k: q_j = random neighbour within radius r (typically r = 30 pixels) if similar_surface(q, q_j): // depth + normal check R_q = Merge(R_q, R_q_j) Output: R_q.y evaluated for pixel q's shading

Spatial reuse effectively multiplies the sample budget by k+1 — but it introduces bias unless corrected. The bias arises because a neighbour pixel q_j may have a different visible surface than q: the light sample y chosen at q_j may be occluded or geometrically inconsistent at q's surface point. Using R_q_j's w_sum (computed at q_j's surface) as if it were valid at q overestimates contribution.

Informal bias intuition: Suppose q is in a shadow and q_j is in direct light. q_j selected a bright light sample with high weight. Merging that reservoir into q and using q_j's high w_sum as the MIS weight incorrectly assigns high probability to a sample that actually contributes zero illumination to q. Without correction, the estimator is positively biased — the image appears too bright near shadow boundaries.

6. Bias Correction & MIS Weights

The original ReSTIR DI paper (Bitterli et al., SIGGRAPH 2020) proposed two bias correction approaches:

Method 1: Biased (Approximate)

Accept the bias and observe that it mostly manifests near shadow boundaries. For many applications (games, real-time previsualization) the bias is visually acceptable and yields the simplest implementation.

Method 2: MIS-based Unbiased Correction

Replace each reservoir's w_sum with an MIS-weighted version that accounts for the probability that the selected sample would have been chosen by all participating reservoirs:

// Unbiased combination using MIS weights (Talbot et al., 2005 / Bitterli 2020) // When merging k+1 reservoirs {R₀, R₁, ..., Rₖ} from pixels {q₀, ..., qₖ}: W_q = (1 / p̂_q(y)) · (1 / Z) where Z is chosen so the estimator is unbiased: Z = Σⱼ mⱼ · p̂_qⱼ(y) // sum of target PDF evaluations at all neighbours mⱼ = Rⱼ.M if y is a "valid" sample at pixel qⱼ (visibility check + surface test) = 0 otherwise // One shadow ray per merge is needed to check visibility at each neighbour — // which is O(k) rays per pixel, acceptable for k ≤ 5

An alternative unbiased approach (Wyman & Panteleev, 2021 — "Rearchitecting Spatiotemporal Resampling for Production") avoids extra shadow rays by using only a geometric similarity test instead of full visibility. This introduces mild bias at occlusion boundaries but is far faster.

Confidence Weights

A practical complication: after temporal accumulation, R_prev.M can be very large (thousands), dominating the spatial merge. Bitterli et al. recommend capping M to prevent a single reservoir from dominating all neighbours:

R_temporal.M = min(R_temporal.M, C \cdot M_new) C = 20 is the standard cap (20\times the initial candidate count)

7. ReSTIR DI, GI & PT

The ReSTIR framework has been extended from direct illumination to full global illumination:

ReSTIR DI (Direct Illumination)

The original 2020 paper. Samples: one area light per reservoir. Target PDF: p̂ = L_e · G · |cos θ| / r² (unshadowed). Shadow ray evaluated only for the final selected candidate. Achieves equivalent quality to 128 light candidates per pixel at the cost of 1 shadow ray per pixel with spatial+temporal reuse.

Shipped in NVIDIA's RTX Direct Illumination SDK (RTXDI) and widely used in games and offline previsualization tools since 2021.

ReSTIR GI (Global Illumination)

Ouyang et al. (SIGGRAPH 2021). Extends ReSTIR to path segments: rather than sampling a direct light, each reservoir stores a complete first-bounce path (x_v → x_s, where x_s is the secondary hit point with its radiance L_i(x_s)). The target PDF is p̂(x_s) = L_i(x_s) · f_r(x_v → x_s) · |cos θ|.

Reservoir sample y = (x_s, L_i, n_s) // secondary hit + incoming radiance + normal Target PDF: p̂(y | x_v) = ||L_i|| \cdot f_r(ω_i, ω_o) \cdot |cos θᵢ| // Spatial reuse: neighbour q_j evaluates p̂(y | x_v_j) — does the shared // secondary point x_s contribute well from q_j's surface? // If geometry test passes, merge the reservoir.

ReSTIR GI achieves 1-bounce GI quality comparable to 16 spp in a single bounce at real-time rates. It powers the "Lumen Hardware Lumen" mode in Unreal Engine 5.2+.

ReSTIR PT (Path Tracing)

Lin et al., SIGGRAPH 2022. Extends to full paths of arbitrary depth. Each reservoir stores an entire light path as a sequence of vertices {x₁, x₂, ..., xₖ}. The reconnection shift mapping (inspired by MCMC path tracing) allows neighbouring pixels to share path prefixes while correctly shifting the path suffix to the new origin.

ReSTIR PT achieves coherent global illumination (caustics, interreflections, specular chains) at 1–4 spp equivalent, enabling offline-quality path tracing on consumer GPUs at near-real-time rates (1–10 fps for production scenes).

Variant	Year	Reservoir sample	Effective spp equiv.	Key cost
ReSTIR DI	2020	1 area light candidate	128–512 direct light spp	1 shadow ray/pixel
ReSTIR GI	2021	1 secondary path hit	8–32 bounce-1 GI spp	1 secondary ray/pixel
ReSTIR PT	2022	Full k-bounce path	4–16 full path spp	1 primary + k reconnect rays

8. WebGPU Implementation Notes

A minimal ReSTIR DI implementation in WebGPU requires the following render passes per frame:

Pipeline (per frame): Pass 1: G-Buffer (rasterise \to albedo, normal, depth, motion vector textures) Pass 2: Initial candidates (compute shader, 1 workgroup per pixel) \to Sample M_init lights from uniform distribution \to Build reservoir per pixel using WRS \to Write to reservoirs_initial[] Pass 3: Temporal reuse (compute shader) \to Reproject using motion vectors \to Merge reservoirs_initial with reservoirs_prev_temporal[] \to Write to reservoirs_temporal[] (also becomes next frame's prev) Pass 4: Spatial reuse (compute shader, possibly 2+ passes) \to For each pixel, sample k neighbours \to Merge after geometry validation \to Write to reservoirs_spatial[] Pass 5: Shade (compute or fragment shader) \to For each pixel, evaluate selected reservoir sample y: colour = f_r(y) \cdot L_e(y) \cdot G(y) \cdot reservoir.W_sum / M \to Write to output texture Pass 6: Denoise + TAA (optional)

WGSL Reservoir Structure

// WGSL reservoir buffer entry
struct Reservoir {
  y_light_index : u32,    // selected light candidate index
  y_uv          : vec2f,  // UV on the light surface (for area lights)
  w_sum         : f32,    // accumulated sum of weights
  M             : u32,    // number of candidates seen
  W             : f32,    // unbiased weight: w_sum / (M · p̂(y))
};

@group(0) @binding(0) var<storage, read_write> reservoirs : array<Reservoir>;
@group(0) @binding(1) var<storage, read>       lights      : array<Light>;
@group(0) @binding(2) var                        g_depth     : texture_2d<f32>;
@group(0) @binding(3) var                        g_normal    : texture_2d<f32>;

// WRS update (call for each candidate xᵢ with weight wᵢ)
fn wrs_update(r: ptr<function, Reservoir>, x: u32, w: f32, rng: f32) {
  (*r).w_sum += w;
  (*r).M     += 1u;
  if rng < w / (*r).w_sum {
    (*r).y_light_index = x;
  }
}

// Merge two reservoirs (for spatial/temporal combination)
fn merge(dst: ptr<function, Reservoir>, src: Reservoir,
         p_hat_src: f32, rng: f32) {
  let w = p_hat_src * src.W * f32(src.M);
  wrs_update(dst, src.y_light_index, w, rng);
  (*dst).M += src.M;
}

// Compute shader: initial candidate generation
@compute @workgroup_size(8, 8)
fn generate_initial(@builtin(global_invocation_id) gid: vec3u) {
  let px = gid.xy;
  let res_idx = px.y * u32(screen_size.x) + px.x;

  var r : Reservoir;
  r.w_sum = 0.0; r.M = 0u; r.W = 0.0;

  // Read G-buffer surface at this pixel
  let depth  = textureLoad(g_depth,  px, 0).r;
  let normal = textureLoad(g_normal, px, 0).xyz * 2.0 - 1.0;
  let world_pos = reconstruct_world(px, depth);

  var rng = pcg_hash(res_idx ^ (frame_index * 1013904223u));

  for (var i = 0u; i < INITIAL_CANDIDATES; i++) {
    let light_idx = rand_u32(&rng) % num_lights;
    let light = lights[light_idx];
    // Unshadowed target PDF: L_e · |cos θ| / r²
    let to_light = light.position - world_pos;
    let dist2    = dot(to_light, to_light);
    let n_dot_l  = max(0.0, dot(normal, normalize(to_light)));
    let p_hat    = length(light.emission) * n_dot_l / dist2;
    let w        = p_hat / (1.0 / f32(num_lights));   // weight = p̂ / source_pdf
    wrs_update(&r, light_idx, w, fract(rand_f32(&rng)));
  }

  // Compute unbiased weight W = w_sum / (M · p̂(y))
  let sel = lights[r.y_light_index];
  let to_sel  = sel.position - world_pos;
  let dist2s  = dot(to_sel, to_sel);
  let p_hat_y = length(sel.emission) * max(0.0, dot(normal, normalize(to_sel))) / dist2s;
  r.W = select(0.0, r.w_sum / (f32(r.M) * p_hat_y), p_hat_y > 0.0);

  reservoirs[res_idx] = r;
}

Production status: ReSTIR DI ships in NVIDIA RTXDI (open-source, GitHub), Unreal Engine 5 (Hardware Lumen uses ReSTIR GI), and is the core sampling algorithm in games like Cyberpunk 2077 Overdrive Mode and Alan Wake 2. The original SIGGRAPH 2020 paper is freely available on NVIDIA Research's website and is one of the most-cited real-time rendering papers of the decade.