Path Tracing in WebGL (GLSL): A Real-Time Ray Tracer From Scratch

Rasterization approximates light transport with tricks — shadow maps, cubemaps, screen-space reflections. Path tracing instead simulates light transport directly by solving the rendering equation with Monte Carlo integration. This tutorial builds a physically-based path tracer that runs entirely in a single GLSL fragment shader, accumulating samples in real time in the browser.

1. The Rendering Equation

Every physically-based renderer is, at its core, an attempt to solve Kajiya's rendering equation. It states that the outgoing radiance L_o from a point x in direction ω_o equals emitted radiance plus the integral of incoming radiance weighted by the surface's BRDF and a cosine term:

$$L_o(x, \omega_o) = L_e(x, \omega_o) + \int_{\Omega} f_r(x, \omega_i, \omega_o) \, L_i(x, \omega_i) \, (\omega_i \cdot n) \, d\omega_i$$

L_e = emitted radiance (light sources)
f_r = BRDF (bidirectional reflectance distribution function)
L_i = incoming radiance from direction ω_i
Ω = hemisphere above the surface normal n
(ω_i · n) = Lambert's cosine term (foreshortening)

This integral has no closed-form solution for arbitrary scenes — L_i itself depends recursively on the same equation evaluated at whatever surface the ray from x hits next. We approximate it with Monte Carlo integration: trace many random light paths, average the results, and let the noise cancel out over time. That averaging is exactly what a real-time path tracer does frame after frame.
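As a rough illustration of the Monte Carlo idea, the sketch below estimates a simple hemisphere integral, ∫_Ω cos θ dω = π, by averaging uniformly sampled directions on the CPU. The `mulberry32` generator and the function names are assumptions of this sketch, chosen only to make runs repeatable; they are not part of the tutorial's shader.

```javascript
// Small seeded PRNG (mulberry32) so runs are repeatable; an assumption of this sketch.
function mulberry32(a) {
  return function () {
    let t = (a += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Estimate the hemisphere integral of cos(theta) with n uniform samples.
// Under uniform hemisphere sampling cos(theta) is itself uniform in [0, 1)
// and the PDF is 1 / (2*PI), so each sample contributes cos(theta) * 2*PI.
function estimateCosineIntegral(n, seed) {
  const rand = mulberry32(seed);
  let sum = 0;
  for (let i = 0; i < n; i++) {
    const cosTheta = rand();        // one uniformly sampled direction
    sum += cosTheta * 2 * Math.PI;  // f(x) / p(x)
  }
  return sum / n;
}

// Print the absolute error for growing sample counts.
for (const n of [16, 256, 4096, 65536]) {
  const err = Math.abs(estimateCosineIntegral(n, 1) - Math.PI);
  console.log(n, err.toFixed(4));
}
```

The error should shrink roughly in proportion to 1/√N, which is why a path tracer needs many accumulated frames before the image looks clean.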

Why GLSL? A fragment shader runs once per pixel, in parallel, on the GPU. That maps almost perfectly onto path tracing: each pixel independently traces its own ray path through the scene and writes one color sample. No BVH-on-CPU, no readback — the whole loop lives on the GPU.

2. Full-Screen Quad and Accumulation Buffer

The path tracer itself is a fragment shader painted onto a full-screen triangle. Because a single frame at 1 sample per pixel is extremely noisy, we render into a floating-point texture and accumulate samples across frames, then divide by the frame count when displaying:

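const gl = canvas.getContext("webgl2");
// Rendering into RGBA32F textures requires this extension; without it the FBO is incomplete
if (!gl.getExtension("EXT_color_buffer_float")) {
  throw new Error("EXT_color_buffer_float is not supported on this device");
}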

function createAccumTarget(w, h) {
  const tex = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, tex);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA32F, w, h, 0, gl.RGBA, gl.FLOAT, null);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);

  const fbo = gl.createFramebuffer();
  gl.bindFramebuffer(gl.FRAMEBUFFER, fbo);
  gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, tex, 0);
  return { tex, fbo };
}

// Ping-pong: read previous accumulation while writing the new one
let targets = [createAccumTarget(width, height), createAccumTarget(width, height)];
let frameIndex = 0;

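function render() {
  const src = targets[frameIndex % 2];
  const dst = targets[(frameIndex + 1) % 2];

  // Pass 1: trace one new sample per pixel and add it to the previous sum
  gl.bindFramebuffer(gl.FRAMEBUFFER, dst.fbo);
  gl.useProgram(pathTraceProgram);
  gl.activeTexture(gl.TEXTURE0);
  gl.bindTexture(gl.TEXTURE_2D, src.tex); // previous accumulation on texture unit 0
  gl.uniform1i(uPrevSample, 0);
  gl.uniform1i(uFrame, frameIndex);
  gl.drawArrays(gl.TRIANGLES, 0, 3); // full-screen triangle, no VBO needed

  // Pass 2: divide by frameIndex + 1 and tonemap to the visible canvas
  gl.bindFramebuffer(gl.FRAMEBUFFER, null);
  gl.useProgram(displayProgram);
  gl.bindTexture(gl.TEXTURE_2D, dst.tex); // the sum just written, still on unit 0
  // uDisplayFrame: hypothetical location of the display program's frame-count uniform
  gl.uniform1i(uDisplayFrame, frameIndex);
  gl.drawArrays(gl.TRIANGLES, 0, 3);

  frameIndex++;
  requestAnimationFrame(render);
}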

Full-screen triangle trick: Instead of a quad (two triangles), draw a single oversized triangle that covers the viewport, generating its three vertices from gl_VertexID in the vertex shader. No vertex buffer is required, and because there is no shared diagonal edge, the GPU does not shade the 2×2 pixel blocks along that edge twice.
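One possible vertex shader for this trick is sketched below as a JavaScript template string, together with a CPU mirror of the same bit arithmetic so the coordinates can be checked outside the browser. The names `FULLSCREEN_TRIANGLE_VS` and `fullscreenTriangleVertex` are hypothetical, and other bit tricks work equally well.

```javascript
// GLSL ES 3.00 vertex shader: derives clip-space positions from gl_VertexID alone.
const FULLSCREEN_TRIANGLE_VS = `#version 300 es
void main() {
  // id 0 -> (-1,-1), id 1 -> (3,-1), id 2 -> (-1,3)
  vec2 pos = vec2(float((gl_VertexID << 1) & 2), float(gl_VertexID & 2)) * 2.0 - 1.0;
  gl_Position = vec4(pos, 0.0, 1.0);
}`;

// CPU mirror of the same bit trick, useful for checking the coordinates.
function fullscreenTriangleVertex(id) {
  const x = ((id << 1) & 2) * 2 - 1;
  const y = (id & 2) * 2 - 1;
  return [x, y];
}

// The three vertices enclose the whole [-1, 1] x [-1, 1] clip-space square.
console.log([0, 1, 2].map(fullscreenTriangleVertex));
```

The parts of the triangle outside the clip square are discarded by the rasterizer, so every visible pixel is shaded exactly once.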

3. Camera Ray Generation

Each fragment corresponds to one pixel. We map its screen-space coordinate to a normalized device coordinate in [-1, 1], then unproject it into a ray direction using the camera's field of view and aspect ratio:

struct Ray { vec3 origin; vec3 dir; };

Ray getCameraRay(vec2 uv, vec3 camPos, vec3 camTarget, float fovY) {
  vec2 ndc = uv * 2.0 - 1.0;               // [0,1] -> [-1,1]
  ndc.x *= uResolution.x / uResolution.y;    // correct for aspect ratio

  vec3 fwd   = normalize(camTarget - camPos);
  vec3 right = normalize(cross(fwd, vec3(0.0, 1.0, 0.0)));
  vec3 up    = cross(right, fwd);

  float tanHalfFov = tan(radians(fovY) * 0.5);
  vec3 dir = normalize(fwd + ndc.x * tanHalfFov * right + ndc.y * tanHalfFov * up);
  return Ray(camPos, dir);
}

For anti-aliasing and to feed the Monte Carlo estimator, jitter the pixel coordinate by a random subpixel offset (rand() - 0.5) before converting to NDC — each accumulated frame samples a slightly different point inside the pixel footprint, so over hundreds of frames the edges naturally anti-alias for free.
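To sanity-check the camera math outside the shader, a CPU port along the following lines can help. It is a sketch under assumptions: vectors are plain arrays, the resolution is passed as a parameter instead of being read from the uResolution uniform, and `jitteredUv` is a hypothetical helper that mirrors the jitter described above.

```javascript
const sub = (a, b) => [a[0] - b[0], a[1] - b[1], a[2] - b[2]];
const cross = (a, b) => [
  a[1] * b[2] - a[2] * b[1],
  a[2] * b[0] - a[0] * b[2],
  a[0] * b[1] - a[1] * b[0],
];
const normalize = (v) => {
  const len = Math.hypot(v[0], v[1], v[2]);
  return v.map((c) => c / len);
};

// CPU port of getCameraRay; resolution is [width, height].
function getCameraRay(uv, camPos, camTarget, fovYDegrees, resolution) {
  const ndcX = (uv[0] * 2 - 1) * (resolution[0] / resolution[1]); // aspect-corrected
  const ndcY = uv[1] * 2 - 1;

  const fwd = normalize(sub(camTarget, camPos));
  const right = normalize(cross(fwd, [0, 1, 0]));
  const up = cross(right, fwd);

  const tanHalfFov = Math.tan(((fovYDegrees * Math.PI) / 180) * 0.5);
  const dir = normalize(
    [0, 1, 2].map((i) => fwd[i] + ndcX * tanHalfFov * right[i] + ndcY * tanHalfFov * up[i])
  );
  return { origin: camPos, dir };
}

// Pixel center plus a sub-pixel offset in [-0.5, 0.5), converted to [0, 1] uv.
function jitteredUv(px, py, resolution, rand) {
  return [
    (px + 0.5 + rand() - 0.5) / resolution[0],
    (py + 0.5 + rand() - 0.5) / resolution[1],
  ];
}
```

With a 45° vertical field of view, the ray through the top-center of the image makes a 22.5° angle with the forward axis, which is a convenient value to verify.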

4. Scene Representation: Analytic Spheres

A minimal path tracer scene is a list of spheres (and a ground plane) with material properties. Ray-sphere intersection has a closed-form solution from the quadratic formula:

$$|O + tD - C|^2 = r^2$$

expands to:

$$t^2 (D \cdot D) + 2t \, D \cdot (O - C) + (O - C) \cdot (O - C) - r^2 = 0$$

$$a = 1 \ \text{(D is normalized)}, \qquad b = 2 \, D \cdot (O - C), \qquad c = (O - C) \cdot (O - C) - r^2$$

$$\text{discriminant} = b^2 - 4ac, \qquad t = \frac{-b - \sqrt{\text{discriminant}}}{2} \quad \text{(nearest root)}$$
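The intersection code that follows does not compute b exactly as defined above; it appears to use the common "half-b" simplification, which removes the factors of 2 and 4. A short derivation, writing h for D · (O − C) (a symbol introduced here only for illustration):

```latex
\begin{aligned}
h &= D \cdot (O - C) = \tfrac{b}{2}, \qquad a = 1 \\
b^2 - 4ac &= 4h^2 - 4c = 4\,(h^2 - c) \\
t &= \frac{-b \mp \sqrt{b^2 - 4ac}}{2}
   = \frac{-2h \mp 2\sqrt{h^2 - c}}{2}
   = -h \mp \sqrt{h^2 - c}
\end{aligned}
```

In intersectSphere, the variable b therefore holds h, disc holds h² − c, and the two roots are -b - sq and -b + sq.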

struct Sphere { vec3 center; float radius; vec3 albedo; vec3 emissive; };

bool intersectSphere(Ray ray, Sphere s, out float t, out vec3 n) {
  vec3 oc = ray.origin - s.center;
  float b = dot(oc, ray.dir);
  float c = dot(oc, oc) - s.radius * s.radius;
  float disc = b * b - c;
  if (disc < 0.0) return false;
  float sq = sqrt(disc);
  t = -b - sq;
  if (t < 0.001) t = -b + sq; // try far root if near one is behind origin
  if (t < 0.001) return false;
  n = normalize(ray.origin + t * ray.dir - s.center);
  return true;
}

// Nearest-hit scene traversal: loop every sphere, keep the closest t
bool intersectScene(Ray ray, out float tHit, out vec3 nHit, out Sphere hit) {
  tHit = 1e9;
  bool found = false;
  for (int i = 0; i < NUM_SPHERES; i++) {
    float t; vec3 n;
    if (intersectSphere(ray, spheres[i], t, n) && t < tHit) {
      tHit = t; nHit = n; hit = spheres[i]; found = true;
    }
  }
  return found;
}
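For testing the intersection logic without a GPU, a CPU port might look like the sketch below. It assumes rays and spheres are plain objects with array-valued vectors, returns null instead of using out parameters, and, like the shader, requires ray.dir to be unit length.

```javascript
const dot = (a, b) => a[0] * b[0] + a[1] * b[1] + a[2] * b[2];

// CPU port of intersectSphere (half-b form). Returns { t, n } or null on a miss.
function intersectSphere(ray, sphere) {
  const oc = ray.origin.map((o, i) => o - sphere.center[i]);
  const b = dot(oc, ray.dir);
  const c = dot(oc, oc) - sphere.radius * sphere.radius;
  const disc = b * b - c;
  if (disc < 0) return null;          // ray misses the sphere entirely
  const sq = Math.sqrt(disc);
  let t = -b - sq;                    // near root
  if (t < 0.001) t = -b + sq;         // origin is inside: use the far root
  if (t < 0.001) return null;         // sphere is behind the ray
  // A point on the surface minus the center has length radius, so this is unit length.
  const n = ray.origin.map((o, i) => (o + t * ray.dir[i] - sphere.center[i]) / sphere.radius);
  return { t, n };
}

// Nearest-hit traversal: test every sphere and keep the smallest t.
function intersectScene(ray, spheres) {
  let best = null;
  for (const s of spheres) {
    const hit = intersectSphere(ray, s);
    if (hit && (!best || hit.t < best.t)) best = { t: hit.t, n: hit.n, sphere: s };
  }
  return best;
}
```

A ray starting at (0, 0, -3) and pointing along +z should hit a unit sphere at the origin at t = 2 with normal (0, 0, -1).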

5. Cosine-Weighted Hemisphere Sampling

For diffuse (Lambertian) surfaces the BRDF is constant, so the contribution of light arriving from a direction ω_i is weighted only by cos θ, the cosine of the angle between ω_i and the normal. If we sample bounce directions uniformly over the hemisphere, many samples land near the horizon, where cos θ is small and they contribute little. Instead we importance sample with a density proportional to cos θ, which cancels the cosine term in the estimator entirely:

// PRNG: cheap, deterministic per-pixel hash-based random (PCG-style)
float rand(inout uint seed) {
  seed = seed * 747796405u + 2891336453u;
  uint word = ((seed >> ((seed >> 28u) + 4u)) ^ seed) * 277803737u;
  return float((word >> 22u) ^ word) / 4294967295.0;
}

// Maps two uniform randoms to a cosine-weighted point on the hemisphere
// around normal n, via Malley's method: uniform disk + project up
vec3 cosineSampleHemisphere(vec3 n, inout uint seed) {
  float u1 = rand(seed);
  float u2 = rand(seed);
  float r     = sqrt(u1);
  float theta = 6.2831853 * u2; // 2*PI
  float x = r * cos(theta);
  float y = r * sin(theta);
  float z = sqrt(max(0.0, 1.0 - u1)); // height above the tangent plane

  // Build an orthonormal basis (tangent, bitangent) around n
  vec3 up = abs(n.z) < 0.999 ? vec3(0.0,0.0,1.0) : vec3(1.0,0.0,0.0);
  vec3 tangent   = normalize(cross(up, n));
  vec3 bitangent = cross(n, tangent);
  return normalize(x * tangent + y * bitangent + z * n);
}

Cosine-weighted PDF: $p(\omega) = \cos\theta / \pi$
Lambertian BRDF: $f_r = \text{albedo} / \pi$

Estimator per bounce simplifies to:

$$\text{throughput} \mathrel{*}= \frac{f_r \cdot \cos\theta}{p(\omega)} = \frac{(\text{albedo}/\pi) \cdot \cos\theta}{\cos\theta/\pi} = \text{albedo}$$

No cosine or π term ever appears in the running code: they cancel analytically when the sampling matches the BRDF shape.
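One way to convince yourself that the sampler really follows p(ω) = cos θ / π is to check a known moment numerically. Under that density the expected value of cos θ is 2/3, whereas uniform hemisphere sampling would give 1/2. The sketch below does this on the CPU in the local frame where the normal is the z axis; `mulberry32` and the function names are assumptions of this example.

```javascript
// Small seeded PRNG (mulberry32) so runs are repeatable; an assumption of this sketch.
function mulberry32(a) {
  return function () {
    let t = (a += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Malley's method in the local frame: uniform point on the unit disk, projected up.
function cosineSampleLocal(rand) {
  const u1 = rand();
  const u2 = rand();
  const r = Math.sqrt(u1);
  const theta = 2 * Math.PI * u2;
  return [r * Math.cos(theta), r * Math.sin(theta), Math.sqrt(Math.max(0, 1 - u1))];
}

// Average cos(theta), i.e. the z component, over n samples.
function meanCosine(n, seed) {
  const rand = mulberry32(seed);
  let sum = 0;
  for (let i = 0; i < n; i++) sum += cosineSampleLocal(rand)[2];
  return sum / n;
}

console.log(meanCosine(100000, 7)); // should land close to 2/3
```

The same check can be repeated for any other sampler: pick a quantity whose expectation under the target density is known in closed form and compare.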

6. The Bounce Loop and Russian Roulette

A path tracer follows the ray through multiple bounces, multiplying a running throughput term by each surface's albedo and adding any emissive light encountered along the way. GLSL forbids recursion in every version, so the bounce chain is an explicit loop with a fixed maximum depth:

vec3 tracePath(Ray ray, inout uint seed) {
  vec3 radiance   = vec3(0.0);
  vec3 throughput = vec3(1.0);

  for (int bounce = 0; bounce < MAX_BOUNCES; bounce++) {
    float t; vec3 n; Sphere hit;
    if (!intersectScene(ray, t, n, hit)) {
      radiance += throughput * skyColor(ray.dir); // escaped to environment
      break;
    }

    radiance += throughput * hit.emissive;     // hit a light source
    throughput *= hit.albedo;              // diffuse BRDF, cosine already cancelled

    vec3 hitPoint = ray.origin + t * ray.dir;
    vec3 newDir   = cosineSampleHemisphere(n, seed);
    ray = Ray(hitPoint + n * 0.001, newDir); // epsilon offset avoids self-intersection

    // Russian roulette: stochastically terminate low-contribution paths
    if (bounce > 3) {
      float p = max(throughput.r, max(throughput.g, throughput.b));
      if (rand(seed) > p) break;
      throughput /= p; // unbiased: rescale surviving paths
    }
  }
  return radiance;
}

Russian roulette keeps the estimator unbiased while reducing average cost: a path with throughput 0.1 survives with 10% probability, but if it does, its contribution is divided by 0.1 (×10). The expected value is unchanged, yet 90% of the time we save the work of tracing the path further. The hard MAX_BOUNCES cut-off, by contrast, does discard a small amount of energy from very long paths.
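The claim that the expected value is unchanged can be checked with a few lines of simulation. The sketch below applies the same survive-or-terminate rule to a fixed contribution and averages many trials; `mulberry32`, `rouletteEstimate`, and `averageRoulette` are names introduced for this illustration.

```javascript
// Small seeded PRNG (mulberry32) so runs are repeatable; an assumption of this sketch.
function mulberry32(a) {
  return function () {
    let t = (a += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Russian roulette on a path carrying `value`, with survival probability p.
function rouletteEstimate(value, p, rand) {
  if (rand() > p) return 0; // terminated: contributes nothing
  return value / p;         // survivor: rescaled to compensate
}

// Average the estimator over n trials and record how often the path survived.
function averageRoulette(value, p, n, seed) {
  const rand = mulberry32(seed);
  let sum = 0;
  let survivors = 0;
  for (let i = 0; i < n; i++) {
    const e = rouletteEstimate(value, p, rand);
    if (e > 0) survivors++;
    sum += e;
  }
  return { mean: sum / n, survivalRate: survivors / n };
}

console.log(averageRoulette(0.1, 0.1, 200000, 3));
```

The mean should stay close to the original value of 0.1 even though only about one path in ten is traced further. The price is higher variance per sample, which accumulation averages away.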

7. Progressive Accumulation and Denoising

A single sample per pixel is far too noisy to look acceptable — each pixel is an independent random variable with high variance. Because Monte Carlo error decreases as 1/√N for N samples, running the same shader every frame and averaging results converges toward the correct image:

out vec4 fragColor;

void main() {
  // Per-pixel, per-frame seed; the final | 1u keeps it nonzero
  uint seed = (uint(gl_FragCoord.x) * 1973u + uint(gl_FragCoord.y) * 9277u + uint(uFrame) * 26699u) | 1u;

  vec2 jitter = vec2(rand(seed), rand(seed)) - 0.5;
  vec2 uv = (gl_FragCoord.xy + jitter) / uResolution;

  Ray ray = getCameraRay(uv, uCamPos, uCamTarget, 45.0);
  vec3 col = tracePath(ray, seed); // not named "sample": that is a reserved word in GLSL ES 3.00

  // Frame 0 starts from zero so stale buffer contents are ignored after a reset
  vec3 prev = (uFrame == 0) ? vec3(0.0) : texelFetch(uPrevSample, ivec2(gl_FragCoord.xy), 0).rgb;
  vec3 accumulated = prev + col; // running sum, not running average
  fragColor = vec4(accumulated, 1.0);
}

// Display pass divides by (uFrame + 1) and applies Reinhard tonemap + gamma:
// vec3 color = accumulated / float(uFrame + 1);
// color = color / (1.0 + color);              // Reinhard
// fragColor = vec4(pow(color, vec3(1.0/2.2)), 1.0); // gamma correction
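Written out in full, the display pass could look something like the sketch below. The shader is embedded as a JavaScript template string, and the uniform names `uAccum` and `uFrame` as well as the `displayColor` helper are assumptions of this example, not names taken from the tutorial's code. The CPU function mirrors the shader math for a single RGB triple so the numbers can be checked without a GPU.

```javascript
// Hypothetical display-pass fragment shader; uAccum and uFrame are names assumed here.
const DISPLAY_FS = `#version 300 es
precision highp float;
precision highp sampler2D;
uniform sampler2D uAccum; // running sum written by the path-trace pass
uniform int uFrame;       // index of the frame just rendered
out vec4 fragColor;
void main() {
  vec3 color = texelFetch(uAccum, ivec2(gl_FragCoord.xy), 0).rgb / float(uFrame + 1);
  color = color / (1.0 + color);                      // Reinhard
  fragColor = vec4(pow(color, vec3(1.0 / 2.2)), 1.0); // gamma correction
}`;

// CPU mirror of the same math: average, Reinhard tonemap, then gamma.
function displayColor(sum, frameIndex) {
  return sum.map((s) => {
    const avg = s / (frameIndex + 1);
    return Math.pow(avg / (1 + avg), 1 / 2.2);
  });
}

// A sum of 3.0 after three frames averages to 1.0, which Reinhard maps to 0.5.
console.log(displayColor([3, 3, 3], 2));
```

Because Reinhard maps any non-negative radiance into [0, 1), the bright area light never clips to a flat white disc, at the cost of some contrast in the highlights.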

Any camera movement invalidates the accumulation — reset frameIndex to 0 whenever the camera or scene changes, then let it climb back up while the view is static. At 200+ accumulated frames a simple scene of a few spheres looks essentially noise-free at 1080p on a mid-range GPU.
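The reset rule can be isolated in a small helper, sketched below under assumptions: the camera is represented as an object with `pos` and `target` arrays, and `createAccumulationState` is a hypothetical name. It returns the frame index to upload as uFrame and restarts from 0 whenever the camera differs from the previous call.

```javascript
// Hypothetical helper: remembers the last camera and hands out frame indices.
function createAccumulationState() {
  let frameIndex = 0;
  let lastKey = null;
  return {
    // camera: { pos: [x, y, z], target: [x, y, z] } (shape assumed for this sketch)
    nextFrame(camera) {
      const key = JSON.stringify([camera.pos, camera.target]);
      if (key !== lastKey) {
        frameIndex = 0; // view changed: discard the accumulated history
        lastKey = key;
      }
      return frameIndex++; // value to upload as uFrame for this frame
    },
  };
}

const accum = createAccumulationState();
const cam = { pos: [0, 1, 4], target: [0, 0, 0] };
console.log(accum.nextFrame(cam), accum.nextFrame(cam)); // static view keeps counting
cam.pos = [1, 1, 4];
console.log(accum.nextFrame(cam)); // moved camera restarts the count
```

Scene edits (moving a sphere, changing a material) would need to feed into the same comparison key, since they invalidate the history just as a camera move does.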

Beyond brute accumulation: Production real-time path tracers (and this one, if extended) pair a few samples per pixel with an SVGF-style spatio-temporal denoiser that reuses geometry buffers (normal, depth, albedo) to smooth noise without blurring edges — accumulation alone is the simplest and most robust starting point.

8. The Complete Fragment Shader

Putting every piece together (scene, camera rays, cosine sampling, bounce loop, and accumulation) gives a self-contained GLSL ES 3.0 fragment shader. The resolution, camera, and frame index uniforms are set from JavaScript each frame; the sphere list is hard-coded in the shader:

#version 300 es
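precision highp float;
precision highp int;       // fragment-shader ints default to mediump; the 32-bit hash needs highp
precision highp sampler2D; // samplers default to lowp; the accumulation texture is RGBA32F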

uniform vec2  uResolution;
uniform vec3  uCamPos;
uniform vec3  uCamTarget;
uniform int   uFrame;
uniform sampler2D uPrevSample;

out vec4 fragColor;

#define NUM_SPHERES 4
#define MAX_BOUNCES 6

struct Ray    { vec3 origin; vec3 dir; };
struct Sphere { vec3 center; float radius; vec3 albedo; vec3 emissive; };

Sphere spheres[NUM_SPHERES] = Sphere[](
  Sphere(vec3(0.0, -100.5, 0.0), 100.0, vec3(0.6), vec3(0.0)),  // ground
  Sphere(vec3(-1.0, 0.0, 0.0),   0.5,   vec3(0.9,0.2,0.2), vec3(0.0)),
  Sphere(vec3( 1.0, 0.0, 0.0),   0.5,   vec3(0.2,0.5,0.9), vec3(0.0)),
  Sphere(vec3( 0.0, 5.0, 0.0),   1.5,   vec3(1.0),         vec3(4.0,3.8,3.5)) // area light
);

// Declared before tracePath, which calls it (GLSL requires declaration before use)
vec3 skyColor(vec3 dir) {
  float t = 0.5 * (dir.y + 1.0);
  return mix(vec3(1.0), vec3(0.5,0.7,1.0), t) * 0.3; // dim gradient, light does the work
}

// rand / cosineSampleHemisphere / intersectSphere / intersectScene
// / getCameraRay / tracePath go here, as defined in sections 3-6 above

void main() {
  uint seed = (uint(gl_FragCoord.x) * 1973u + uint(gl_FragCoord.y) * 9277u + uint(uFrame) * 26699u) | 1u;
  vec2 jitter = vec2(rand(seed), rand(seed)) - 0.5;
  vec2 uv = (gl_FragCoord.xy + jitter) / uResolution;

  Ray ray = getCameraRay(uv, uCamPos, uCamTarget, 45.0);
  vec3 col = tracePath(ray, seed);

  // Running sum: frame 0 starts from zero, later frames add to the previous sum
  vec3 prev = (uFrame == 0) ? vec3(0.0) : texelFetch(uPrevSample, ivec2(gl_FragCoord.xy), 0).rgb;
  fragColor = vec4(prev + col, 1.0);
}

This entire scene, three diffuse spheres (one of them the ground) and one emissive area light, needs no textures, no acceleration structure, and no external assets. Swap the analytic sphere list for a small BVH over triangles and you have the core of a path tracer capable of arbitrary meshes, whether it runs in a WebGL fragment shader or a WebGPU compute shader.
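The uniforms declared in the shader above still have to be uploaded from JavaScript before each draw. A minimal sketch of that step follows; `setPathTraceUniforms` and the shape of the `state` object are assumptions made for illustration, while the `gl` calls are standard WebGL 2 methods. Real code would normally look up the uniform locations once after linking and cache them.

```javascript
// Hypothetical helper: uploads the uniforms declared by the path-tracing shader.
// gl is a WebGL2RenderingContext and program is the linked path-tracing program.
function setPathTraceUniforms(gl, program, state) {
  gl.useProgram(program);
  gl.uniform2f(gl.getUniformLocation(program, "uResolution"), state.width, state.height);
  gl.uniform3fv(gl.getUniformLocation(program, "uCamPos"), state.camPos);
  gl.uniform3fv(gl.getUniformLocation(program, "uCamTarget"), state.camTarget);
  gl.uniform1i(gl.getUniformLocation(program, "uFrame"), state.frameIndex);
  gl.uniform1i(gl.getUniformLocation(program, "uPrevSample"), 0); // texture unit 0
}

// Example state object (field names are assumptions of this sketch).
const exampleState = {
  width: 1280,
  height: 720,
  camPos: [0, 1, 4],
  camTarget: [0, 0, 0],
  frameIndex: 0,
};
```

Calling `setPathTraceUniforms(gl, pathTraceProgram, exampleState)` just before the first draw call of each frame keeps the shader in sync with the camera and the accumulation counter.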