1. Why WebGPU? From WebGL to a Modern API

WebGL is a JavaScript binding to OpenGL ES, an API designed in the era of implicit global state — bind a texture, bind a buffer, issue a draw call, and hope nothing else in your codebase changed the bound state in between. WebGPU throws that model out. It is built around explicit, immutable objects: you describe a pipeline once (shaders, vertex layout, blend state, depth test) and reuse it, rather than re-issuing dozens of state-setting calls every frame.

Three practical advantages fall out of this design:

Lower CPU overhead: command encoding is cheaper because most state validation happens once, at pipeline creation, rather than on every draw call.
First-class compute: WebGPU exposes general-purpose compute shaders as a core feature, not an extension — useful for particle systems, physics, and GPGPU workloads directly in the browser.
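Explicit command recording: work is recorded into command buffers and submitted to a queue later, and the API is also exposed inside Web Workers, so recording and submission can run off the main thread. WebGL's single implicit context could never safely support that style of recording.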

Browser support: at the time of writing, WebGPU ships by default in recent versions of Chrome, Edge, and Firefox, and is behind a flag in Safari Technology Preview. Support still varies by platform, so always feature-detect with if (!navigator.gpu) and fall back to WebGL (for example, Three.js's WebGL renderer) on browsers without it.
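
The feature-detection advice above can be wrapped in a small helper. This is a minimal sketch, not a definitive implementation: chooseBackend and the strings it returns are hypothetical names, and only navigator.gpu and requestAdapter() come from the WebGPU API. It treats a null adapter as unsupported, because navigator.gpu can exist on machines where no adapter is available.

```javascript
// Hypothetical helper: decide which rendering backend to use.
// `nav` defaults to the global navigator but can be injected for testing.
async function chooseBackend(nav = globalThis.navigator) {
  // No navigator.gpu at all: the browser does not expose WebGPU.
  if (!nav || !nav.gpu) return "webgl";

  try {
    // requestAdapter() resolves to null when no suitable GPU is available.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "webgl";
  } catch {
    // Treat any unexpected failure as "not supported" and fall back.
    return "webgl";
  }
}
```

A caller would await chooseBackend() once at startup and construct either the WebGPU or the WebGL renderer from the result.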

2. Adapters and Devices

WebGPU splits "the GPU" into two objects. A GPUAdapter represents a physical or software GPU available to the browser — you request one and inspect its limits and features. A GPUDevice is the logical connection you actually create resources and submit work through.

async function initWebGPU() {
  if (!navigator.gpu) {
    throw new Error("WebGPU is not supported in this browser.");
  }

  // Request a physical adapter — you can hint high-performance vs low-power
  const adapter = await navigator.gpu.requestAdapter({
    powerPreference: "high-performance",
  });
  if (!adapter) throw new Error("No suitable GPUAdapter found.");

  // Request the logical device — the object you actually use
  const device = await adapter.requestDevice();

  // Surface uncaptured errors instead of failing silently
  device.addEventListener("uncapturederror", (event) => {
    console.error("WebGPU error:", event.error.message);
  });

  return { adapter, device };
}
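
Inspecting the adapter before requesting a device might look like the sketch below. The helper name requestDeviceWithOptionalFeatures and its wantedFeatures parameter are assumptions made for illustration; adapter.features.has(), adapter.limits, and the requiredFeatures / requiredLimits fields of requestDevice() are part of the WebGPU API. Raising maxStorageBufferBindingSize is just one example of a limit you may want above the default.

```javascript
// Hypothetical helper: enable optional features only when the adapter has them.
async function requestDeviceWithOptionalFeatures(adapter, wantedFeatures = []) {
  // adapter.features is set-like; has() reports support for a feature name.
  const requiredFeatures = wantedFeatures.filter((name) => adapter.features.has(name));

  // Limits not listed here stay at the spec defaults, so a limit has to be
  // requested explicitly when the adapter offers more than the default.
  const requiredLimits = {
    maxStorageBufferBindingSize: adapter.limits.maxStorageBufferBindingSize,
  };

  const device = await adapter.requestDevice({ requiredFeatures, requiredLimits });
  return { device, enabledFeatures: requiredFeatures };
}
```

For example, passing ["timestamp-query"] enables GPU timing where the adapter supports it and silently skips it elsewhere, so the rest of the code can check enabledFeatures instead of failing at device creation.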

Devices can be lost. A device can become unusable if the GPU driver crashes or, on some platforms, when the tab is backgrounded. In production code, watch device.lost (a Promise that resolves when the device is lost) and be ready to request a new device and recreate your buffers and pipelines.
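
One way to structure that recovery is sketched below. The names initWithRecovery, init, and onReady are hypothetical; the sketch assumes init is an async function that returns an object with a device property. The device.lost promise and the reason / message fields of the value it resolves to are part of the WebGPU API.

```javascript
// Hypothetical helper: re-run `init` whenever the current device is lost.
// `init` is assumed to be an async function that returns { device, ... }.
async function initWithRecovery(init, onReady) {
  const state = await init();
  onReady(state);

  // device.lost resolves (it never rejects) with a GPUDeviceLostInfo.
  state.device.lost.then((info) => {
    console.warn(`WebGPU device lost (${info.reason}): ${info.message}`);
    // "destroyed" means the app called device.destroy() itself; do not restart.
    if (info.reason !== "destroyed") {
      initWithRecovery(init, onReady).catch((err) => {
        console.error("Reinitialization failed:", err);
      });
    }
  });
}
```

Everything created from the old device (buffers, textures, pipelines, bind groups) is invalid after a loss, so onReady should rebuild all of it from the new state rather than patching individual objects.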

3. Configuring the Canvas Context

To draw into a <canvas>, request a "webgpu" context and configure it with the device and a texture format. Unlike WebGL, you must explicitly tell WebGPU which pixel format to use — usually whatever the browser's preferred swap-chain format is, for best performance.

const canvas = document.querySelector("canvas");
const context = canvas.getContext("webgpu");

const format = navigator.gpu.getPreferredCanvasFormat();

context.configure({
  device,
  format,
  alphaMode: "opaque",
});

Every frame, you'll call context.getCurrentTexture() to grab the texture the browser wants you to render into, and wrap it in a GPUTextureView for the render pass.
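
The texture returned by getCurrentTexture() has the canvas's width and height, so those attributes need to track the displayed size. The helper below is a sketch of one common approach; resizeCanvasToDisplaySize and its parameters are hypothetical names, and it only relies on the standard clientWidth, clientHeight, width, and height properties of a canvas element.

```javascript
// Hypothetical helper: match the canvas's drawing-buffer size to its CSS size.
// Returns true when the size changed (e.g. so a depth texture can be recreated).
function resizeCanvasToDisplaySize(canvas, maxDimension, pixelRatio = 1) {
  // Clamp to at least 1 pixel and at most the given texture size limit.
  const clamp = (value) => Math.max(1, Math.min(Math.floor(value), maxDimension));
  const width = clamp(canvas.clientWidth * pixelRatio);
  const height = clamp(canvas.clientHeight * pixelRatio);

  if (canvas.width === width && canvas.height === height) return false;
  canvas.width = width;
  canvas.height = height;
  return true;
}
```

In a page you would call it at the top of each frame, before getCurrentTexture(), for example as resizeCanvasToDisplaySize(canvas, device.limits.maxTextureDimension2D, window.devicePixelRatio).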

4. WGSL: The WebGPU Shading Language

WebGPU does not use GLSL. It defines its own shading language, WGSL (WebGPU Shading Language) — a statically typed, Rust-flavored syntax designed to map cleanly onto SPIR-V, MSL, and HLSL under the hood. Vertex and fragment stages can live in the same module, distinguished by attributes.

struct VertexOut {
  @builtin(position) position : vec4<f32>,
  @location(0) color : vec3<f32>,
};

@vertex
fn vs_main(
  @location(0) pos : vec2<f32>,
  @location(1) color : vec3<f32>
) -> VertexOut {
  var out : VertexOut;
  out.position = vec4<f32>(pos, 0.0, 1.0);
  out.color = color;
  return out;
}

@fragment
fn fs_main(in : VertexOut) -> @location(0) vec4<f32> {
  return vec4<f32>(in.color, 1.0);
}

Key syntax notes for readers coming from GLSL: vector types name their component type (vec2<f32>, or the shorthand vec2f, rather than a bare vec2), @location(n) replaces layout(location = n), and @builtin(position) marks the clip-space output, the equivalent of GLSL's built-in gl_Position.
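
On the JavaScript side, WGSL is passed to the device as a plain string. Syntax mistakes are easy to make when porting from GLSL, so it can help to surface the compiler's messages explicitly. The helper below is a sketch under that assumption: compileShader and its label parameter are hypothetical, while createShaderModule() and getCompilationInfo() (with its messages, type, lineNum, linePos, and message fields) are part of the WebGPU API.

```javascript
// Hypothetical helper: create a shader module and report WGSL compile errors.
async function compileShader(device, code, label = "shader") {
  const shaderModule = device.createShaderModule({ label, code });

  // getCompilationInfo() resolves with the compiler's messages for this module.
  const info = await shaderModule.getCompilationInfo();
  const errors = info.messages.filter((msg) => msg.type === "error");

  if (errors.length > 0) {
    const details = errors
      .map((msg) => `${label}:${msg.lineNum}:${msg.linePos} ${msg.message}`)
      .join("\n");
    throw new Error(`WGSL compilation failed:\n${details}`);
  }
  return shaderModule;
}
```

Calling it with the shader above held in a template string would either return a usable module or throw with line and column numbers pointing into the WGSL source.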

5. Buffers and the Vertex Layout

GPU buffers are created with a fixed size and a set of allowed usage flags. To upload vertex data you create a buffer flagged VERTEX | COPY_DST, then write into it, then describe the byte layout so the pipeline knows how to interpret each vertex's attributes.

// Interleaved: x, y, r, g, b per vertex (5 floats = 20 bytes)
const vertices = new Float32Array([
  0.0,  0.5,  1, 0, 0,
  -0.5, -0.5,  0, 1, 0,
  0.5,  -0.5,  0, 0, 1,
]);

const vertexBuffer = device.createBuffer({
  size: vertices.byteLength,
  usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST,
});
device.queue.writeBuffer(vertexBuffer, 0, vertices);

const vertexLayout = {
  arrayStride: 5 * 4, // 5 floats × 4 bytes
  attributes: [
    { shaderLocation: 0, offset: 0,     format: "float32x2" }, // position
    { shaderLocation: 1, offset: 2 * 4, format: "float32x3" }, // color
  ],
};
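
Hand-computed offsets are a common source of mistakes, so the layout above can also be derived from the list of attribute formats. The following is a minimal sketch that assumes tightly packed, interleaved attributes in shaderLocation order; makeVertexLayout and VERTEX_FORMAT_BYTES are hypothetical names, and the table covers only a few of the formats WebGPU defines.

```javascript
// Byte sizes for a few common GPUVertexFormat values (not the full list).
const VERTEX_FORMAT_BYTES = {
  float32: 4,
  float32x2: 8,
  float32x3: 12,
  float32x4: 16,
  uint32: 4,
  unorm8x4: 4,
};

// Hypothetical helper: derive offsets and arrayStride for tightly packed,
// interleaved attributes listed in shaderLocation order.
function makeVertexLayout(formats) {
  let offset = 0;
  const attributes = formats.map((format, shaderLocation) => {
    const size = VERTEX_FORMAT_BYTES[format];
    if (size === undefined) throw new Error(`Unknown vertex format: ${format}`);
    const attribute = { shaderLocation, offset, format };
    offset += size; // the next attribute starts right after this one
    return attribute;
  });
  return { arrayStride: offset, attributes };
}
```

For the triangle data, makeVertexLayout(["float32x2", "float32x3"]) yields an arrayStride of 20 with the color attribute at offset 8, matching the hand-written layout.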

No client-side arrays: WebGPU buffers are opaque GPU memory with a fixed size and usage, so there is no gl.bufferData-style call that reallocates and fills a buffer in one step. You write into them with queue.writeBuffer(), or map them for direct CPU access with mapAsync() when the buffer was created with MAP_WRITE or MAP_READ usage.
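
Besides writeBuffer() and mapAsync(), a buffer can be created already mapped and filled before its first use. The sketch below shows that route; createBufferWithData is a hypothetical helper name, while mappedAtCreation, getMappedRange(), and unmap() are part of the WebGPU API. It works with any usage flags, including a plain VERTEX buffer.

```javascript
// Hypothetical helper: create a buffer that already contains `data` (a TypedArray).
function createBufferWithData(device, data, usage) {
  const buffer = device.createBuffer({
    // A buffer mapped at creation needs a size that is a multiple of 4 bytes.
    size: Math.ceil(data.byteLength / 4) * 4,
    usage,
    mappedAtCreation: true, // start out mapped so the CPU can fill it
  });

  // getMappedRange() returns an ArrayBuffer backed by the buffer's memory.
  const bytes = new Uint8Array(buffer.getMappedRange());
  bytes.set(new Uint8Array(data.buffer, data.byteOffset, data.byteLength));

  // unmap() hands the memory back to the GPU and detaches the ArrayBuffer.
  buffer.unmap();
  return buffer;
}
```

With this helper the triangle's vertex buffer could be created as createBufferWithData(device, vertices, GPUBufferUsage.VERTEX), with no COPY_DST flag and no separate upload call.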

6. Building a Render Pipeline

A GPURenderPipeline bundles the shader module, vertex layout, primitive topology, and color target format into a single validated, immutable object. Compiling it is relatively expensive, so you create it once and reuse it across frames — never inside your render loop.

// wgslSource: the WGSL from section 4, held in a JavaScript string
const shaderModule = device.createShaderModule({ code: wgslSource });

const pipeline = device.createRenderPipeline({
  layout: "auto",
  vertex: {
    module: shaderModule,
    entryPoint: "vs_main",
    buffers: [vertexLayout],
  },
  fragment: {
    module: shaderModule,
    entryPoint: "fs_main",
    targets: [{ format }],
  },
  primitive: {
    topology: "triangle-list",
    cullMode: "back",
  },
});
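
Once an application has more than one pipeline, the "create once, reuse every frame" rule is often enforced with a small cache. The sketch below is one possible shape for it; pipelineCache, getPipeline, key, and makeDescriptor are all hypothetical names, and only createRenderPipeline() comes from the WebGPU API.

```javascript
// Hypothetical cache: build each pipeline once and reuse it on later frames.
const pipelineCache = new Map();

function getPipeline(device, key, makeDescriptor) {
  let pipeline = pipelineCache.get(key);
  if (!pipeline) {
    // Only reached the first time a key is requested.
    pipeline = device.createRenderPipeline(makeDescriptor());
    pipelineCache.set(key, pipeline);
  }
  return pipeline;
}
```

The descriptor is passed as a function so that it is only built on a cache miss. For pipelines created while the app is already rendering, device.createRenderPipelineAsync() takes the same descriptor and returns a Promise, which avoids stalling a frame on compilation.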

7. Encoding and Submitting Commands

Draw calls in WebGPU aren't issued directly. They are recorded through a GPURenderPassEncoder, which you open on a GPUCommandEncoder with a description of which textures to clear and write to. Finishing the command encoder produces a GPUCommandBuffer, which is then submitted to the device's queue.

function frame() {
  const encoder = device.createCommandEncoder();
  const textureView = context.getCurrentTexture().createView();

  const pass = encoder.beginRenderPass({
    colorAttachments: [{
      view: textureView,
      clearValue: { r: 0.05, g: 0.05, b: 0.08, a: 1 },
      loadOp: "clear",
      storeOp: "store",
    }],
  });

  pass.setPipeline(pipeline);
  pass.setVertexBuffer(0, vertexBuffer);
  pass.draw(3); // 3 vertices, 1 instance
  pass.end();

  device.queue.submit([encoder.finish()]);
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);

Batching commands: because encoders are cheap objects, you can record multiple render passes, or interleave compute and render passes, in a single command buffer before submitting — the driver sees the whole frame's work at once.
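
Batching might be organized as in the sketch below, where each step records one pass into a shared encoder. The names submitFrame, steps, and recordPass are hypothetical; createCommandEncoder(), finish(), and queue.submit() are the WebGPU calls already used above.

```javascript
// Hypothetical helper: record several passes into one command buffer.
// Each entry in `steps` receives the shared encoder and records one pass.
function submitFrame(device, steps) {
  const encoder = device.createCommandEncoder();
  for (const recordPass of steps) {
    recordPass(encoder); // e.g. a compute pass first, then a render pass
  }
  // One finish() and one submit() for the whole frame.
  device.queue.submit([encoder.finish()]);
}
```

A frame that simulates particles and then draws them would pass two functions: one that calls encoder.beginComputePass() and dispatches the simulation, and one that calls encoder.beginRenderPass() and issues the draw. Passes execute in the order they were recorded.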

8. Uniforms and Bind Groups

WebGPU has no notion of a "uniform location" you set by name at runtime. Instead, shader resources (uniform buffers, textures, samplers) are grouped into a GPUBindGroup, whose layout must match a @group/@binding pair declared in WGSL.

Example: a model-view-projection uniform of 64 bytes (a 4×4 f32 matrix), mvp = P · V · M.

WGSL declaration:
@group(0) @binding(0) var<uniform> mvp : mat4x4<f32>;

JS side: create a 64-byte buffer with UNIFORM | COPY_DST usage, then upload the matrix with device.queue.writeBuffer(uniformBuffer, 0, mvpMatrixData).

const uniformBuffer = device.createBuffer({
  size: 64, // 4x4 matrix of f32 = 16 * 4 bytes
  usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
});

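// With layout: "auto", the pipeline's group 0 layout only contains bindings the
// shader actually uses, so the vertex shader must declare and read `mvp`.
const bindGroup = device.createBindGroup({
  layout: pipeline.getBindGroupLayout(0),
  entries: [
    { binding: 0, resource: { buffer: uniformBuffer } },
  ],
});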

// Inside the render pass, before draw():
pass.setBindGroup(0, bindGroup);
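
Filling the 64-byte uniform each frame could look like the sketch below. It uses a plain rotation about the Z axis in place of a full P · V · M product, purely for illustration; rotationZ and updateUniforms are hypothetical helper names. The sketch assumes the vertex shader has been extended to multiply the position by mvp, and it relies on WGSL storing mat4x4<f32> in column-major order.

```javascript
// Build a 4x4 rotation about the Z axis as 16 floats in column-major order,
// which is the memory layout WGSL uses for mat4x4<f32>.
function rotationZ(angleRadians) {
  const c = Math.cos(angleRadians);
  const s = Math.sin(angleRadians);
  return new Float32Array([
     c, s, 0, 0, // column 0
    -s, c, 0, 0, // column 1
     0, 0, 1, 0, // column 2
     0, 0, 0, 1, // column 3
  ]);
}

// Hypothetical per-frame update: upload the matrix before encoding the pass.
function updateUniforms(device, uniformBuffer, timeMs) {
  const matrix = rotationZ(timeMs / 1000); // one radian per second
  device.queue.writeBuffer(uniformBuffer, 0, matrix);
  return matrix;
}
```

requestAnimationFrame passes a timestamp in milliseconds to its callback, so frame(time) can call updateUniforms(device, uniformBuffer, time) before beginning the render pass.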

9. Compute Shaders and Compute Passes

Compute shaders run arbitrary parallel work with no rasterization involved — ideal for particle simulation, physics integration, or image processing entirely on the GPU. A compute shader declares a workgroup_size and is dispatched over a 3D grid of workgroups.

// WGSL: double every element of a storage buffer
@group(0) @binding(0) var<storage, read_write> data : array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id : vec3<u32>) {
  let i = id.x;
  if (i >= arrayLength(&data)) { return; }
  data[i] = data[i] * 2.0;
}

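// computeShaderModule: created with device.createShaderModule() from the WGSL above
const computePipeline = device.createComputePipeline({
  layout: "auto",
  compute: { module: computeShaderModule, entryPoint: "main" },
});

// computeBindGroup: a bind group that binds a STORAGE buffer at binding 0
// elementCount: the number of f32 values in that buffer
const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass();
pass.setPipeline(computePipeline);
pass.setBindGroup(0, computeBindGroup);
pass.dispatchWorkgroups(Math.ceil(elementCount / 64)); // 64 invocations per workgroup
pass.end();
device.queue.submit([encoder.finish()]);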
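
The dispatch size is a ceiling division: with 64 invocations per workgroup, 1000 elements need 16 workgroups, and the last group's extra invocations are the reason for the bounds check in the shader. The helper below sketches that arithmetic with a guard for the per-dimension dispatch limit; workgroupCount and its parameter names are hypothetical.

```javascript
// Hypothetical helper: number of workgroups needed to cover `elementCount` items.
function workgroupCount(elementCount, workgroupSize = 64, maxPerDimension = 65535) {
  const count = Math.ceil(elementCount / workgroupSize);
  // 65535 is the spec's default maxComputeWorkgroupsPerDimension; pass
  // device.limits.maxComputeWorkgroupsPerDimension to use the device's limit.
  if (count > maxPerDimension) {
    throw new RangeError(
      `${count} workgroups exceeds the per-dimension limit of ${maxPerDimension}`
    );
  }
  return count;
}
```

When the count exceeds the limit, the usual options are a larger workgroup_size, spreading the dispatch over the y dimension, or processing several elements per invocation.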

Reading results back: storage buffers live in GPU memory. To inspect the result on the CPU, copy it into a buffer created with MAP_READ | COPY_DST via encoder.copyBufferToBuffer(), then await buffer.mapAsync(GPUMapMode.READ).
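
The readback steps above can be sketched as a single helper. readBufferAsFloat32 is a hypothetical name, and the sketch assumes the source buffer was created with COPY_SRC usage and that byteLength is a multiple of 4; the calls themselves (copyBufferToBuffer, mapAsync, getMappedRange, unmap, destroy) are part of the WebGPU API.

```javascript
// Hypothetical helper: copy `byteLength` bytes of a GPU buffer back to the CPU.
// `source` is assumed to have COPY_SRC usage; byteLength must be a multiple of 4.
async function readBufferAsFloat32(device, source, byteLength) {
  const staging = device.createBuffer({
    size: byteLength,
    usage: GPUBufferUsage.MAP_READ | GPUBufferUsage.COPY_DST,
  });

  const encoder = device.createCommandEncoder();
  encoder.copyBufferToBuffer(source, 0, staging, 0, byteLength);
  device.queue.submit([encoder.finish()]);

  // Resolves once the GPU has finished the copy and the buffer is mapped.
  await staging.mapAsync(GPUMapMode.READ);
  // slice(0) copies the bytes out, because the mapped range is detached on unmap().
  const result = new Float32Array(staging.getMappedRange().slice(0));
  staging.unmap();
  staging.destroy();
  return result;
}
```

Creating and destroying a staging buffer per call keeps the sketch short; code that reads back every frame would normally keep one staging buffer and reuse it.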

10. Common Pitfalls and Debugging

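Wrong storeOp: if you set it to "discard" on the canvas attachment, the rendered frame is thrown away after the pass, so nothing appears on screen even though no error is reported. (storeOp is a required field, so it cannot simply be left out.)
Buffer size mismatches: writeBuffer() does not throw if the data runs past the end of the destination buffer. The write is rejected with a validation error that only shows up through uncapturederror or an error scope. Size buffers to at least the byte length you will write, and keep write sizes and offsets multiples of 4 bytes.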
Wrong arrayStride: a stride that doesn't match your interleaved data layout produces garbled or invisible geometry, not a crash — check byte offsets carefully.
Creating pipelines every frame: pipeline creation is a compile step. Doing it inside frame() instead of once at startup will tank performance and may cause visible stutter.
Ignoring uncapturederror: WebGPU validation errors are reported asynchronously rather than thrown at the call site, so listen for the device's uncapturederror event, or wrap calls in device.pushErrorScope() / popErrorScope() during development.
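
The error-scope approach from the last item can be wrapped so that a validation error turns into an ordinary rejected Promise. This is a minimal sketch: withValidationScope, label, and fn are hypothetical names, while pushErrorScope("validation") and popErrorScope() are part of the WebGPU API.

```javascript
// Hypothetical helper: run `fn` and report any validation error it generates.
async function withValidationScope(device, label, fn) {
  device.pushErrorScope("validation");
  const result = fn();
  // popErrorScope() resolves to a GPUError, or null if nothing went wrong.
  const error = await device.popErrorScope();
  if (error) {
    throw new Error(`[${label}] ${error.message}`);
  }
  return result;
}
```

Wrapping a suspicious call, such as pipeline creation or a writeBuffer(), in its own scope narrows a validation error down to that call instead of a generic uncapturederror message.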

Next steps: once the triangle renders, try replacing the static vertex buffer with a uniform-driven rotation matrix, add a depth texture for 3D geometry, and experiment with storage buffers to feed thousands of instances into a single draw call via @builtin(instance_index).