Introduction to WebGPU: First Steps
WebGPU is the successor to WebGL — a modern graphics and compute API exposed directly to the browser, modeled after native APIs like Vulkan, Metal, and Direct3D 12. It replaces WebGL's implicit state machine with explicit objects: adapters, devices, pipelines, and bind groups. This tutorial walks through every concept you need to render your first triangle and run your first compute shader.
1. Why WebGPU? From WebGL to a Modern API
WebGL is a JavaScript binding to OpenGL ES, an API designed in the era of implicit global state — bind a texture, bind a buffer, issue a draw call, and hope nothing else in your codebase changed the bound state in between. WebGPU throws that model out. It is built around explicit, immutable objects: you describe a pipeline once (shaders, vertex layout, blend state, depth test) and reuse it, rather than re-issuing dozens of state-setting calls every frame.
Three practical advantages fall out of this design:
- Lower CPU overhead: most state validation happens once, at pipeline creation, so each draw call has far less to check.
- First-class compute: WebGPU exposes general-purpose compute shaders as a core feature, not an extension — useful for particle systems, physics, and GPGPU workloads directly in the browser.
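- Designed for multi-threaded command recording: work is recorded into command buffers and submitted later as a unit, a model that suits building commands off the main thread (WebGPU is available in Web Workers) and that WebGL's single implicit context could never safely support.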
Browser support still varies, so feature-detect with if (!navigator.gpu) and fall back to WebGL (for example Three.js's WebGL renderer) on older browsers.
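A minimal sketch of that feature check. The helper name chooseBackend is hypothetical (it is not part of any library), and it takes the navigator object as a parameter so it can also be exercised outside a browser:

```javascript
// Hypothetical helper: pick a rendering backend from a navigator-like object.
// In a page you would call chooseBackend(navigator).
function chooseBackend(nav) {
  // navigator.gpu is only defined in browsers that expose WebGPU.
  if (nav && nav.gpu) {
    return "webgpu";
  }
  return "webgl"; // e.g. hand off to a WebGL renderer instead
}
```

The presence of navigator.gpu does not guarantee a usable GPU: requestAdapter() can still resolve to null, so it is worth keeping the fallback path reachable after that call as well.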
2. Adapters and Devices
WebGPU splits "the GPU" into two objects. A GPUAdapter
represents a physical or software GPU available to the browser — you
request one and inspect its limits and
features. A GPUDevice is the logical
connection you actually create resources and submit work through.
async function initWebGPU() {
  if (!navigator.gpu) {
    throw new Error("WebGPU is not supported in this browser.");
  }

  // Request a physical adapter; you can hint high-performance vs low-power
  const adapter = await navigator.gpu.requestAdapter({
    powerPreference: "high-performance",
  });
  if (!adapter) throw new Error("No suitable GPUAdapter found.");

  // Request the logical device, the object you actually use
  const device = await adapter.requestDevice();

  // Surface uncaptured errors instead of failing silently
  device.addEventListener("uncapturederror", (event) => {
    console.error("WebGPU error:", event.error.message);
  });

  return { adapter, device };
}
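Inspecting limits and features matters once you want more than the defaults. The sketch below uses a hypothetical helper, buildDeviceDescriptor, that builds a requestDevice() descriptor from only what the adapter reports. It assumes "maximum"-style limits, where a larger value is better; alignment-style limits would need the opposite comparison.

```javascript
// Hypothetical helper: build a requestDevice() descriptor that only asks for
// optional features and limits the adapter actually reports.
function buildDeviceDescriptor(adapter, wantedFeatures = [], wantedLimits = {}) {
  // adapter.features is set-like: it supports has(name).
  const requiredFeatures = wantedFeatures.filter((name) => adapter.features.has(name));

  const requiredLimits = {};
  for (const [name, wanted] of Object.entries(wantedLimits)) {
    const supported = adapter.limits[name];
    if (typeof supported === "number") {
      // Asking for more than the adapter supports makes requestDevice() reject.
      requiredLimits[name] = Math.min(wanted, supported);
    }
  }
  return { requiredFeatures, requiredLimits };
}

// Possible use inside initWebGPU(), after the adapter check:
//   const device = await adapter.requestDevice(
//     buildDeviceDescriptor(adapter, ["timestamp-query"], { maxBufferSize: 2 ** 30 })
//   );
```

Features that get filtered out are simply not enabled on the device, so code that depends on them should check device.features.has(name) before using them.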
Devices can also be lost, for example after a driver crash or GPU reset. In production code, watch device.lost (a Promise) and be ready to reinitialize your pipeline on recovery.
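One way to wire up device-loss handling, as a sketch. The names shouldReinitialize and watchDeviceLoss and the reinit callback are assumptions for illustration; device.lost and the reason and message fields on its result come from the WebGPU API.

```javascript
// Decide whether a lost device is worth recreating.
// info.reason is "destroyed" when the app called device.destroy() itself.
function shouldReinitialize(info) {
  return info.reason !== "destroyed";
}

// Hypothetical wiring: `reinit` is an app-supplied callback that requests a
// new adapter and device and rebuilds pipelines and buffers.
function watchDeviceLoss(device, reinit) {
  return device.lost.then((info) => {
    console.warn("WebGPU device lost:", info.message);
    if (shouldReinitialize(info)) {
      reinit();
    }
  });
}
```

Every resource created from the lost device is invalid afterwards, so reinit has to recreate buffers, textures, and pipelines, not just the device.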
3. Configuring the Canvas Context
To draw into a <canvas>, request a
"webgpu" context and configure it with the device and a
texture format. Unlike WebGL, you must explicitly tell WebGPU which
pixel format to use — usually whatever the browser's preferred
swap-chain format is, for best performance.
const canvas = document.querySelector("canvas");
const context = canvas.getContext("webgpu");
const format = navigator.gpu.getPreferredCanvasFormat();

context.configure({
  device,
  format,
  alphaMode: "opaque",
});
Every frame, you'll call
context.getCurrentTexture() to grab the texture the
browser wants you to render into, and wrap it in a
GPUTextureView for the render pass.
4. WGSL: The WebGPU Shading Language
WebGPU does not use GLSL. It defines its own shading language, WGSL (WebGPU Shading Language) — a statically typed, Rust-flavored syntax designed to map cleanly onto SPIR-V, MSL, and HLSL under the hood. Vertex and fragment stages can live in the same module, distinguished by attributes.
struct VertexOut {
  @builtin(position) position : vec4<f32>,
  @location(0) color : vec3<f32>,
};

@vertex
fn vs_main(
  @location(0) pos : vec2<f32>,
  @location(1) color : vec3<f32>
) -> VertexOut {
  var out : VertexOut;
  out.position = vec4<f32>(pos, 0.0, 1.0);
  out.color = color;
  return out;
}

@fragment
fn fs_main(in : VertexOut) -> @location(0) vec4<f32> {
  return vec4<f32>(in.color, 1.0);
}
Key syntax notes coming from GLSL: types are explicit
(vec2<f32>, not just vec2),
@location(n) replaces layout(location = n),
and @builtin(position) marks the clip-space output —
equivalent to GLSL's implicit gl_Position.
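Syntax slips are common when porting GLSL habits to WGSL, so it can help to surface compiler messages early. The sketch below relies on getCompilationInfo(), which shader modules provide; the helper names formatCompilationMessages and compileChecked are hypothetical.

```javascript
// Turn a GPUCompilationInfo-shaped object into readable log lines.
function formatCompilationMessages(info) {
  return info.messages.map(
    (m) => `${m.type} at ${m.lineNum}:${m.linePos}: ${m.message}`
  );
}

// Hypothetical wrapper: compile WGSL and throw if any message is an error.
async function compileChecked(device, code) {
  const shaderModule = device.createShaderModule({ code });
  const info = await shaderModule.getCompilationInfo();
  if (info.messages.some((m) => m.type === "error")) {
    throw new Error(formatCompilationMessages(info).join("\n"));
  }
  return shaderModule;
}
```

Line and column numbers refer to positions in the WGSL string, which makes them easy to map back to the source when the shader lives in its own file.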
5. Buffers and the Vertex Layout
GPU buffers are created with a fixed size and a set of allowed
usage flags. To upload vertex data you create a buffer
flagged VERTEX | COPY_DST, then write into it, then
describe the byte layout so the pipeline knows how to interpret each
vertex's attributes.
// Interleaved: x, y, r, g, b per vertex (5 floats = 20 bytes)
const vertices = new Float32Array([
   0.0,  0.5,  1, 0, 0,
  -0.5, -0.5,  0, 1, 0,
   0.5, -0.5,  0, 0, 1,
]);

const vertexBuffer = device.createBuffer({
  size: vertices.byteLength,
  usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST,
});
device.queue.writeBuffer(vertexBuffer, 0, vertices);

const vertexLayout = {
  arrayStride: 5 * 4, // 5 floats × 4 bytes
  attributes: [
    { shaderLocation: 0, offset: 0, format: "float32x2" }, // position
    { shaderLocation: 1, offset: 2 * 4, format: "float32x3" }, // color
  ],
};
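Hand-computed offsets are easy to get wrong as attributes are added. As a sketch, a small hypothetical helper (buildVertexLayout, with a size table covering only the float formats used here) can derive the same layout for tightly packed, interleaved attributes:

```javascript
// Byte sizes for a few float vertex formats (a subset of what WebGPU defines).
const FORMAT_BYTES = {
  float32: 4,
  float32x2: 8,
  float32x3: 12,
  float32x4: 16,
};

// Hypothetical helper: derive offsets and arrayStride for tightly packed,
// interleaved attributes. The array index doubles as the shaderLocation.
function buildVertexLayout(formats) {
  let offset = 0;
  const attributes = formats.map((format, shaderLocation) => {
    const size = FORMAT_BYTES[format];
    if (size === undefined) throw new Error(`Unknown vertex format: ${format}`);
    const attribute = { shaderLocation, offset, format };
    offset += size;
    return attribute;
  });
  return { arrayStride: offset, attributes };
}

const derivedLayout = buildVertexLayout(["float32x2", "float32x3"]);
// derivedLayout.arrayStride is 20; the color attribute starts at byte offset 8
```

This reproduces the 20-byte stride and the 0 and 8 byte offsets of the hand-written layout for position and color.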
Unlike WebGL's gl.bufferData, WebGPU buffers are opaque GPU memory. You
write into them with queue.writeBuffer(), or map them
for direct CPU access with mapAsync() when the buffer
was created with MAP_WRITE/MAP_READ usage.
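A third option for initial data is to create the buffer already mapped. The sketch below uses the mappedAtCreation flag together with getMappedRange() and unmap(); the helper name createBufferWithData is hypothetical, and the padding to a multiple of 4 bytes reflects the size requirement for buffers that are mapped at creation.

```javascript
// Hypothetical helper: create a buffer that already contains `data`
// (any typed array), padding the size up to a multiple of 4 bytes.
function createBufferWithData(device, data, usage) {
  const size = Math.ceil(data.byteLength / 4) * 4;
  const buffer = device.createBuffer({ size, usage, mappedAtCreation: true });

  // While mapped, the buffer's memory is exposed as a plain ArrayBuffer.
  const bytes = new Uint8Array(data.buffer, data.byteOffset, data.byteLength);
  new Uint8Array(buffer.getMappedRange()).set(bytes);

  buffer.unmap(); // hand the memory back to the GPU before using the buffer
  return buffer;
}
```

A buffer created this way does not need MAP_WRITE in its usage flags, so it can stay a plain VERTEX or STORAGE buffer.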
6. Building a Render Pipeline
A GPURenderPipeline bundles the shader module, vertex
layout, primitive topology, and color target format into a single
validated, immutable object. Compiling it is relatively expensive, so
you create it once and reuse it across frames — never inside your
render loop.
// wgslSource is assumed to hold the WGSL from section 4 as a string
const shaderModule = device.createShaderModule({ code: wgslSource });

const pipeline = device.createRenderPipeline({
  layout: "auto",
  vertex: {
    module: shaderModule,
    entryPoint: "vs_main",
    buffers: [vertexLayout],
  },
  fragment: {
    module: shaderModule,
    entryPoint: "fs_main",
    targets: [{ format }],
  },
  primitive: {
    topology: "triangle-list",
    cullMode: "back",
  },
});
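To keep pipeline creation out of the render loop as a project grows, one option is a small cache keyed by a name you choose. This is only a sketch: createPipelineCache and its key scheme are hypothetical, not part of WebGPU.

```javascript
// Hypothetical cache: build each pipeline once and reuse it by key.
function createPipelineCache(device) {
  const pipelines = new Map();
  return function getPipeline(key, makeDescriptor) {
    let pipeline = pipelines.get(key);
    if (!pipeline) {
      // The expensive compile step only runs on a cache miss.
      pipeline = device.createRenderPipeline(makeDescriptor());
      pipelines.set(key, pipeline);
    }
    return pipeline;
  };
}

// Possible use: const getPipeline = createPipelineCache(device);
//               const pipeline = getPipeline("triangle", () => triangleDescriptor);
```

Passing a function rather than a descriptor means the descriptor is only built when it is actually needed. WebGPU also offers device.createRenderPipelineAsync(), which returns a Promise and can avoid stalling while a pipeline compiles.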
7. Encoding and Submitting Commands
Draw calls in WebGPU aren't issued directly. They're recorded through a
GPURenderPassEncoder, opened on a GPUCommandEncoder with a descriptor
that says which textures to clear and write to. The finished command
buffer is then submitted to the device's queue.
function frame() {
  const encoder = device.createCommandEncoder();
  const textureView = context.getCurrentTexture().createView();

  const pass = encoder.beginRenderPass({
    colorAttachments: [{
      view: textureView,
      clearValue: { r: 0.05, g: 0.05, b: 0.08, a: 1 },
      loadOp: "clear",
      storeOp: "store",
    }],
  });
  pass.setPipeline(pipeline);
  pass.setVertexBuffer(0, vertexBuffer);
  pass.draw(3); // 3 vertices, 1 instance
  pass.end();

  device.queue.submit([encoder.finish()]);
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);
8. Uniforms and Bind Groups
WebGPU has no notion of a "uniform location" you set by name at
runtime. Instead, shader resources (uniform buffers, textures,
samplers) are grouped into a GPUBindGroup, whose layout
must match a @group/@binding pair declared
in WGSL.
const uniformBuffer = device.createBuffer({
  size: 64, // 4x4 matrix of f32 = 16 * 4 bytes
  usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
});

const bindGroup = device.createBindGroup({
  layout: pipeline.getBindGroupLayout(0),
  entries: [
    { binding: 0, resource: { buffer: uniformBuffer } },
  ],
});

// Inside the render pass, before draw():
pass.setBindGroup(0, bindGroup);
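For completeness, a sketch of both sides of that binding: the WGSL declaration the bind group feeds (held in a string here) and the JavaScript that fills the 64-byte buffer. The uniform name transform and the helpers makeScaleMatrix and uploadTransform are assumptions for illustration.

```javascript
// WGSL side, shown as a string: the declaration the bind group feeds.
// The name `transform` is an illustrative assumption.
const uniformDeclaration = /* wgsl */ `
@group(0) @binding(0) var<uniform> transform : mat4x4<f32>;
`;

// Build a column-major 4x4 scale matrix: 16 floats = 64 bytes.
function makeScaleMatrix(sx, sy) {
  return new Float32Array([
    sx, 0, 0, 0,
    0, sy, 0, 0,
    0, 0, 1, 0,
    0, 0, 0, 1,
  ]);
}

// Copy the matrix into the uniform buffer; call this whenever the value changes.
function uploadTransform(device, uniformBuffer, matrix) {
  device.queue.writeBuffer(uniformBuffer, 0, matrix);
}
```

With layout: "auto", the derived bind group layout only contains bindings the shader actually uses, so the vertex stage has to reference transform (for example transform * vec4<f32>(pos, 0.0, 1.0)) for binding 0 to exist.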
9. Compute Shaders and Compute Passes
Compute shaders run arbitrary parallel work with no rasterization
involved — ideal for particle simulation, physics integration, or
image processing entirely on the GPU. A compute shader declares a
workgroup_size and is dispatched over a 3D grid of
workgroups.
// WGSL: double every element of a storage buffer
@group(0) @binding(0) var<storage, read_write> data : array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id : vec3<u32>) {
  let i = id.x;
  if (i >= arrayLength(&data)) { return; }
  data[i] = data[i] * 2.0;
}

// JavaScript: build the pipeline, then record and submit one compute pass.
// computeShaderModule, computeBindGroup, and elementCount are assumed to be
// created elsewhere (the module from the WGSL above).
const computePipeline = device.createComputePipeline({
  layout: "auto",
  compute: { module: computeShaderModule, entryPoint: "main" },
});

const encoder = device.createCommandEncoder();
const pass = encoder.beginComputePass();
pass.setPipeline(computePipeline);
pass.setBindGroup(0, computeBindGroup);
pass.dispatchWorkgroups(Math.ceil(elementCount / 64)); // grid of workgroups
pass.end();
device.queue.submit([encoder.finish()]);
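The dispatch arithmetic is worth spelling out, because the grid rarely divides the data evenly. A small sketch with hypothetical helper names; the default of 65535 workgroups per dimension is WebGPU's default for maxComputeWorkgroupsPerDimension, and a real device may report a different value in device.limits.

```javascript
// Number of workgroups needed to cover `elementCount` items in one dimension.
function workgroupCount(elementCount, workgroupSize = 64, maxPerDimension = 65535) {
  const count = Math.ceil(elementCount / workgroupSize);
  if (count > maxPerDimension) {
    throw new Error(`Dispatch of ${count} workgroups exceeds ${maxPerDimension}`);
  }
  return count;
}

// Invocations launched past the end of the data; in the shader these are the
// ones that take the early return on the bounds check.
function idleInvocations(elementCount, workgroupSize = 64) {
  return workgroupCount(elementCount, workgroupSize) * workgroupSize - elementCount;
}
```

For 1000 elements and a workgroup size of 64, this gives 16 workgroups and 24 idle invocations, which is why the bounds check against arrayLength matters.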
To read results back on the CPU, copy the storage buffer into a second
buffer created with MAP_READ | COPY_DST via
encoder.copyBufferToBuffer(), then
await buffer.mapAsync(GPUMapMode.READ).
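A sketch of that readback path. The helper name readBackFloats is hypothetical, and the numeric fallbacks exist only so the snippet can load where the WebGPU globals are missing; in a browser the real GPUBufferUsage and GPUMapMode objects are used.

```javascript
// Fall back to the numeric flag values when the WebGPU globals are absent
// (for example when this file is loaded outside a browser).
const Usage = globalThis.GPUBufferUsage ?? { MAP_READ: 0x0001, COPY_DST: 0x0008 };
const MapMode = globalThis.GPUMapMode ?? { READ: 0x0001 };

// Hypothetical helper: copy `byteLength` bytes out of a GPU buffer and return
// them as a Float32Array. `source` must have been created with COPY_SRC usage.
async function readBackFloats(device, source, byteLength) {
  const staging = device.createBuffer({
    size: byteLength,
    usage: Usage.MAP_READ | Usage.COPY_DST,
  });

  const encoder = device.createCommandEncoder();
  encoder.copyBufferToBuffer(source, 0, staging, 0, byteLength);
  device.queue.submit([encoder.finish()]);

  await staging.mapAsync(MapMode.READ);
  // slice(0) copies the bytes out, because the mapped range is detached on unmap().
  const result = new Float32Array(staging.getMappedRange().slice(0));
  staging.unmap();
  staging.destroy();
  return result;
}
```

The storage buffer being read needs COPY_SRC in its usage flags in addition to STORAGE, and byteLength should be a multiple of 4 for the copy to validate.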
10. Common Pitfalls and Debugging
- Forgetting storeOp: "store": if you set it to "discard", the rendered frame is thrown away after the pass, so nothing appears on screen even though no error is raised.
- Buffer size mismatches: writeBuffer() fails validation if the source data does not fit in the destination buffer. The error is reported asynchronously rather than thrown, so the write appears to fail silently. Size buffers to hold everything you'll write, and keep write sizes a multiple of 4 bytes.
- Wrong arrayStride: a stride that doesn't match your interleaved data layout produces garbled or invisible geometry, not a crash, so check byte offsets carefully.
- Creating pipelines every frame: pipeline creation is a compile step. Doing it inside frame() instead of once at startup will tank performance and may cause visible stutter.
- Ignoring uncapturederror: WebGPU validation errors don't throw synchronously by default. Listen for the device's error event, or wrap calls in device.pushErrorScope()/popErrorScope() during development.
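Error scopes are the more targeted of the two debugging tools, since they tie an error to a specific block of calls. A minimal sketch, assuming a hypothetical helper named captureValidationError:

```javascript
// Hypothetical helper: run `fn` inside a validation error scope and resolve
// to the captured GPUError, or null if everything validated.
async function captureValidationError(device, fn) {
  device.pushErrorScope("validation");
  let pending;
  try {
    fn();
  } finally {
    // Always pop, so a thrown exception does not leave the scope open.
    pending = device.popErrorScope();
  }
  return pending;
}

// Possible use during development:
//   const error = await captureValidationError(device, () => {
//     device.queue.writeBuffer(vertexBuffer, 0, vertices);
//   });
//   if (error) console.error("Validation failed:", error.message);
```

Errors caught by a scope are not delivered to the uncapturederror listener, so the two mechanisms complement each other.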
A natural next step is instanced rendering: use storage buffers to feed thousands of instances into a
single draw call via @builtin(instance_index).
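A sketch of what the CPU side of instancing could look like. The Instance struct, its fields, and the packInstances helper are assumptions for illustration; the 16-byte stride follows from WGSL's layout rules for a struct holding a vec2<f32> and an f32.

```javascript
// WGSL side, shown as a string. Struct and variable names are illustrative.
const instanceShaderSnippet = /* wgsl */ `
struct Instance {
  offset : vec2<f32>,
  scale : f32,
};
@group(0) @binding(0) var<storage, read> instances : array<Instance>;
// In the vertex entry point: let inst = instances[instanceIndex];
// where instanceIndex is a parameter marked @builtin(instance_index).
`;

const FLOATS_PER_INSTANCE = 4; // vec2 + f32 + 4 bytes of padding = 16-byte stride

// Pack { x, y, scale } records into the layout the storage buffer expects.
function packInstances(list) {
  const data = new Float32Array(list.length * FLOATS_PER_INSTANCE);
  list.forEach((inst, i) => {
    const base = i * FLOATS_PER_INSTANCE;
    data[base + 0] = inst.x;
    data[base + 1] = inst.y;
    data[base + 2] = inst.scale;
    // data[base + 3] stays 0: padding required by the struct's array stride
  });
  return data;
}
```

The packed array can be uploaded with queue.writeBuffer() into a buffer created with STORAGE | COPY_DST usage, and the draw call becomes pass.draw(3, list.length) so that each instance reads its own record.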