⚡ Tutorial · WebGL · Shaders
📅 March 2026 ⏱ 25 min 🎓 Intermediate-Advanced

Gray-Scott Reaction Diffusion on the GPU

Turing patterns, spots, stripes, coral, coral growth, mitosis — all produced by two coupled PDEs running in a WebGL fragment shader. This tutorial walks from the math to a full GPU implementation with ping-pong framebuffers, achieving millions of cells at 60 fps in the browser.

1. The Gray-Scott Equations

The Gray-Scott model (1983, by Peter Gray and Stephen Scott) describes a continuously stirred tank reaction between two chemical species U and V:

∂u/∂t = D_u·∇²u − u·v² + F·(1−u)
∂v/∂t = D_v·∇²v + u·v² − (F+k)·v

u, v ∈ [0, 1] — normalised concentrations
D_u — diffusion coefficient of U (typically 2× D_v)
D_v — diffusion coefficient of V
F — feed rate (replenishment of U): 0.01–0.1
k — kill rate of V (in addition to feed outflow): 0.04–0.08

Reaction term u·v² implements autocatalysis:
each V molecule needs two V neighbours to catalyse conversion.
This is the Schlögl mechanism — the quadratic term creates instability.

The three terms in the U equation are:

What makes Turing patterns: The key is that V diffuses slower than U (D_v < D_u). V activates itself but U diffuses away faster, creating long-range inhibition of V. This "short-range activation + long-range inhibition" is Alan Turing's 1952 morphogenesis instability.

2. Discrete Laplacian on a Grid

On a 2D grid with spacing h = 1 pixel, the standard 5-point stencil Laplacian is:

∇²u[i,j] ≈ u[i−1,j] + u[i+1,j] + u[i,j−1] + u[i,j+1] − 4·u[i,j]

9-point stencil (more isotropic, better for patterns):
∇²u ≈ 0.25·(u[i±1,j] + u[i,j±1]) + 0.05·(diagonals) − 1.0·u[i,j]

Weighted: [0.05 0.2 0.05] ← centre weight = −1.0
[0.2 -1 0.2 ]
[0.05 0.2 0.05]
(sum of all weights = 0, as Laplacian must be)

The 9-point stencil produces more isotropic patterns — spots and stripes without visible grid-axis bias. Use this in the shader.

3. Pattern Parameter Space

The F-k parameter space is a zoo of qualitatively different patterns. Classic named regions:

Spots (pearls)

F=0.035, k=0.065
Isolated V-rich spots surrounded by U

Stripes (zebrafish)

F=0.060, k=0.062
Labyrinthine stripe patterns

Mitosis

F=0.028, k=0.062
Spots split as they grow

Coral

F=0.082, k=0.059
Branching structures

Worms

F=0.078, k=0.061
Moving worm-like patterns

Maze

F=0.029, k=0.057
Stationary maze-like labyrinths

Parameter Typical range Effect
D_u 0.16–0.22 U diffusivity — higher → larger scale structures
D_v 0.04–0.10 V diffusivity — usually D_v ≈ 0.5·D_u
F 0.010–0.085 Feed rate — lower F → sparser, higher F → denser
k 0.040–0.075 Kill rate — determines which pattern regime
dt 0.5–1.5 Timestep — higher = faster but can destabilise
Starting conditions: Initialise the entire grid to (u=1, v=0) with a few small seed regions of (u=0.5, v=0.25 + noise). The system evolves from these seeds after about 500–2000 timesteps.

4. GPU Architecture: Ping-Pong Framebuffers

The problem: each step of the simulation reads the current state and writes the next state. You can't read from and write to the same texture simultaneously on a GPU.

The solution: ping-pong — allocate two framebuffers (A and B). Each step:

// Ping-pong state
let frontBuffer = texA;  // GPU reads from this
let backFBO     = fboB;  // GPU writes to this

function swap() {
  [frontBuffer, backFBO] = [backBuffer.texture, frontFBO];
  // After swap: front ← what was just written; back ← old buffer to overwrite next step
}

The render pass is separate: a full-screen quad that reads from frontBuffer and maps concentration values to colours for display. It does NOT write to the simulation textures.

Each frame (typically N=4–16 sim steps per display frame):
1. For i in range(N): sim pass (read front → write back, then swap)
2. Render pass (read front → screen)

5. WebGL 2 Setup

// Create two pairs of (texture + framebuffer) for ping-pong
function createPingPongBuffers(gl, width, height) {
  function createTex() {
    const tex = gl.createTexture();
    gl.bindTexture(gl.TEXTURE_2D, tex);
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RG32F, width, height, 0,
                  gl.RG, gl.FLOAT, null);  // 2-channel float: [u, v]
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.REPEAT);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.REPEAT);
    return tex;
  }
  function createFBO(tex) {
    const fbo = gl.createFramebuffer();
    gl.bindFramebuffer(gl.FRAMEBUFFER, fbo);
    gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                            gl.TEXTURE_2D, tex, 0);
    gl.bindFramebuffer(gl.FRAMEBUFFER, null);
    return fbo;
  }
  const texA = createTex(), texB = createTex();
  return {
    texA, texB,
    fboA: createFBO(texA), fboB: createFBO(texB),
    front: texA, back: { fbo: createFBO(texB), tex: texB },
  };
}
RG32F: We use a two-channel 32-bit float texture — red channel = U concentration, green channel = V concentration. In WebGL 2 you must enable the EXT_color_buffer_float extension to render to float textures:
gl.getExtension('EXT_color_buffer_float');

6. Simulation Fragment Shader

#version 300 es
precision highp float;

uniform sampler2D uState;   // RG = [u, v]
uniform vec2  uResolution;
uniform float uF;           // feed rate
uniform float uK;           // kill rate
uniform float uDu;          // diffusion U
uniform float uDv;          // diffusion V
uniform float uDt;          // timestep

out vec4 outColor;

// 9-point isotropic Laplacian
vec2 laplacian(vec2 pos) {
  vec2 d = 1.0 / uResolution;
  vec2 c  = texture(uState, pos).rg;
  vec2 N  = texture(uState, pos + vec2( 0, d.y)).rg;
  vec2 S  = texture(uState, pos + vec2( 0,-d.y)).rg;
  vec2 E  = texture(uState, pos + vec2( d.x, 0)).rg;
  vec2 W  = texture(uState, pos + vec2(-d.x, 0)).rg;
  vec2 NE = texture(uState, pos + vec2( d.x, d.y)).rg;
  vec2 NW = texture(uState, pos + vec2(-d.x, d.y)).rg;
  vec2 SE = texture(uState, pos + vec2( d.x,-d.y)).rg;
  vec2 SW = texture(uState, pos + vec2(-d.x,-d.y)).rg;
  return 0.2*(N+S+E+W) + 0.05*(NE+NW+SE+SW) - 1.0*c;
}

void main() {
  vec2 uv = gl_FragCoord.xy / uResolution;
  vec2 state = texture(uState, uv).rg;
  float u = state.r;
  float v = state.g;

  vec2 lap = laplacian(uv);
  float reaction = u * v * v;

  float du = uDu * lap.r - reaction + uF * (1.0 - u);
  float dv = uDv * lap.g + reaction - (uF + uK) * v;

  float newU = clamp(u + du * uDt, 0.0, 1.0);
  float newV = clamp(v + dv * uDt, 0.0, 1.0);

  outColor = vec4(newU, newV, 0.0, 1.0);
}
Why clamp? Without clamping, floating point errors can push concentrations outside [0,1] after thousands of timesteps. The reaction term is proportional to u·v² — out-of-range values produce runaway growth. Always clamp.

7. Colour Mapping Shader

#version 300 es
precision highp float;
uniform sampler2D uState;
out vec4 fragColor;

// Inferno-inspired colour ramp for V concentration
vec3 inferno(float t) {
  const vec3 c0 = vec3(0.0002, 0.0016, 0.0139);
  const vec3 c1 = vec3(0.1140, 0.0631, 0.4149);
  const vec3 c2 = vec3(0.5419, 0.1773, 0.3773);
  const vec3 c3 = vec3(0.8785, 0.3835, 0.1551);
  const vec3 c4 = vec3(0.9891, 0.7945, 0.2824);
  t = clamp(t, 0.0, 1.0);
  if (t < 0.25) return mix(c0, c1, t * 4.0);
  if (t < 0.50) return mix(c1, c2, (t - 0.25) * 4.0);
  if (t < 0.75) return mix(c2, c3, (t - 0.50) * 4.0);
  return mix(c3, c4, (t - 0.75) * 4.0);
}

void main() {
  vec2 uv = gl_FragCoord.xy / vec2(textureSize(uState, 0));
  float v = texture(uState, uv).g;
  fragColor = vec4(inferno(v), 1.0);
}

8. Full JavaScript Driver

const canvas = document.getElementById('canvas');
const gl = canvas.getContext('webgl2');
gl.getExtension('EXT_color_buffer_float');

const W = 512, H = 512;
canvas.width  = W;
canvas.height = H;

// --- Compile shaders (helpers omitted for brevity) ---
const simProgram    = compileProgram(gl, simVertSrc, simFragSrc);
const renderProgram = compileProgram(gl, renderVertSrc, renderFragSrc);

// --- Full-screen quad VAO ---
const quadVAO = makeQuadVAO(gl);

// --- Ping-pong buffers ---
let { texA, texB, fboA, fboB } = createPingPongBuffers(gl, W, H);
let front = texA, frontFBO = fboA;
let back  = texB, backFBO  = fboB;

function swap() {
  [front, back]       = [back,  front];
  [frontFBO, backFBO] = [backFBO, frontFBO];
}

// --- Seed initial state ---
const seed = new Float32Array(W * H * 2).fill(0);
for (let i = 0; i < W * H; i++) seed[i * 2] = 1.0; // u=1 everywhere
for (let cy = 220; cy < 260; cy++) {          // seed patch
  for (let cx = 220; cx < 260; cx++) {
    const i = cy * W + cx;
    seed[i * 2]     = 0.50;
    seed[i * 2 + 1] = 0.25 + 0.01 * (Math.random() - 0.5);
  }
}
uploadTexture(gl, front, W, H, seed);

// --- Simulation parameters ---
const params = { F: 0.035, k: 0.065, Du: 0.16, Dv: 0.08, dt: 1.0 };
const STEPS_PER_FRAME = 8;

function simStep() {
  gl.useProgram(simProgram);
  gl.viewport(0, 0, W, H);
  gl.bindFramebuffer(gl.FRAMEBUFFER, backFBO);
  gl.activeTexture(gl.TEXTURE0);
  gl.bindTexture(gl.TEXTURE_2D, front);
  gl.uniform1i(gl.getUniformLocation(simProgram, 'uState'), 0);
  gl.uniform2f(gl.getUniformLocation(simProgram, 'uResolution'), W, H);
  gl.uniform1f(gl.getUniformLocation(simProgram, 'uF'), params.F);
  gl.uniform1f(gl.getUniformLocation(simProgram, 'uK'), params.k);
  gl.uniform1f(gl.getUniformLocation(simProgram, 'uDu'), params.Du);
  gl.uniform1f(gl.getUniformLocation(simProgram, 'uDv'), params.Dv);
  gl.uniform1f(gl.getUniformLocation(simProgram, 'uDt'), params.dt);
  drawQuad(gl, quadVAO);
  swap();
}

function render() {
  gl.useProgram(renderProgram);
  gl.viewport(0, 0, canvas.width, canvas.height);
  gl.bindFramebuffer(gl.FRAMEBUFFER, null);
  gl.activeTexture(gl.TEXTURE0);
  gl.bindTexture(gl.TEXTURE_2D, front);
  gl.uniform1i(gl.getUniformLocation(renderProgram, 'uState'), 0);
  drawQuad(gl, quadVAO);
}

function loop() {
  for (let i = 0; i < STEPS_PER_FRAME; i++) simStep();
  render();
  requestAnimationFrame(loop);
}
loop();
Performance: On a mid-range desktop GPU, a 512×512 grid running 8 sim steps per frame completes at 60 fps. For 1024×1024 you may need to reduce to 4 steps/frame. The bottleneck is usually texture bandwidth, not compute — so maximising texture cache efficiency (powers-of-two textures, NEAREST filtering) matters.

Next Steps