⚡ Tutorial · Three.js · Performance
📅 March 2026 ⏱ 18 min 🎓 Intermediate

InstancedMesh: 1 000 000 Objects at 60 FPS

Drawing 10 000 individual meshes? Expect ~10 000 draw calls and a frame rate well short of 60 FPS (about 24 FPS in the benchmark in section 8). THREE.InstancedMesh renders all of them in one draw call, with per-instance position, rotation, scale, and colour. This guide covers setup, per-instance colour, GPU picking, frustum culling, animation, and pushing to 1 million instances.

1. Why Instancing Matters

Each THREE.Mesh triggers a separate draw call to the GPU. Draw calls are expensive: the CPU must set shader uniforms, bind vertex buffers, and issue the GPU command. At ~5 μs per draw call on a modern desktop, 10 000 meshes = 50 ms of CPU overhead alone, three times the 16.7 ms frame budget at 60 FPS.
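To make the arithmetic concrete, here is a minimal sketch of the estimate. The 5 μs figure is the ballpark quoted above and will vary by driver and scene; the helper name `drawCallOverheadMs` is hypothetical and not part of Three.js.

```javascript
// Rough CPU cost of issuing draw calls, using the ~5 μs per call ballpark.
// `drawCallOverheadMs` is a hypothetical helper, not a Three.js API.
function drawCallOverheadMs(drawCalls, microsecondsPerCall = 5) {
  return (drawCalls * microsecondsPerCall) / 1000;
}

const FRAME_BUDGET_MS = 1000 / 60; // about 16.7 ms per frame at 60 FPS

const individual = drawCallOverheadMs(10_000); // one call per mesh
const instanced = drawCallOverheadMs(1);       // one call for all instances

console.log('individual meshes over budget:', individual > FRAME_BUDGET_MS);
console.log('instanced mesh over budget:', instanced > FRAME_BUDGET_MS);
```

The estimate only covers call submission; GPU time for the vertices themselves comes on top of it.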

Instanced rendering uploads one geometry and one material, plus an array of per-instance data (4×4 transform matrices), and tells the GPU: "draw this N times, each with a different matrix." Result: 1 draw call regardless of N.

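Approach          | Draw calls | CPU cost         | GPU cost
Individual meshes | N          | O(N), bottleneck | O(N × verts) vertex work
Merged geometry   | 1          | O(1)             | O(N × verts) vertex work, O(N × verts) memory (huge VBO)
InstancedMesh     | 1          | O(1)             | O(N × verts) vertex work, O(verts + N) memory
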
When to use InstancedMesh: All instances share the same geometry and material. If you need different geometries, use BatchedMesh (Three.js r160+) or merge groups manually.
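The memory difference between merging and instancing in the table above can be sketched with a back-of-the-envelope calculation. The byte sizes are assumptions for illustration (position + normal + uv per vertex, one 4×4 float matrix per instance, index buffers ignored), and both function names are hypothetical.

```javascript
// Assumed layout: position (3) + normal (3) + uv (2) = 8 floats = 32 bytes per vertex,
// and one 4×4 float matrix = 64 bytes per instance. Index buffers are ignored.
const BYTES_PER_VERTEX = 32;
const BYTES_PER_MATRIX = 64;

// Merged geometry stores every copy of every vertex.
function mergedBytes(instances, vertsPerInstance) {
  return instances * vertsPerInstance * BYTES_PER_VERTEX;
}

// Instancing stores the geometry once, plus one matrix per instance.
function instancedBytes(instances, vertsPerInstance) {
  return vertsPerInstance * BYTES_PER_VERTEX + instances * BYTES_PER_MATRIX;
}

const instances = 100_000;
const verts = 24; // a box with 24 unique vertices

console.log('merged MiB:', (mergedBytes(instances, verts) / 2 ** 20).toFixed(1));
console.log('instanced MiB:', (instancedBytes(instances, verts) / 2 ** 20).toFixed(1));
```

Under these assumptions the gap widens as the geometry gets denser, because the per-instance cost of instancing stays fixed at one matrix.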

2. Basic InstancedMesh Setup

    import * as THREE from 'three';

    // Assumes an existing THREE.Scene named `scene`.
    const COUNT = 100_000;
    const geometry = new THREE.BoxGeometry(1, 1, 1);
    const material = new THREE.MeshStandardMaterial({ color: 0xffffff });

    // Create InstancedMesh with max instance count
    const mesh = new THREE.InstancedMesh(geometry, material, COUNT);

    // Set initial matrices. Older releases zero-fill the buffer (nothing is drawn);
    // newer ones default to identity, which stacks every instance at the origin.
    const matrix = new THREE.Matrix4();
    for (let i = 0; i < COUNT; i++) {
      matrix.setPosition(
        (Math.random() - 0.5) * 500,
        (Math.random() - 0.5) * 500,
        (Math.random() - 0.5) * 500
      );
      mesh.setMatrixAt(i, matrix);
    }

    // CRITICAL: mark the instance matrix buffer as needing upload
    mesh.instanceMatrix.needsUpdate = true;
    scene.add(mesh);
Common pitfall: Forgetting mesh.instanceMatrix.needsUpdate = true after modifying matrices. The first render uploads the buffer as it stands, but any later change stays on the CPU until the flag is set, so instances keep their previous transforms. Set it once after batch updates, or every frame if animating.

3. Per-Instance Transforms (setMatrixAt)

Each instance has a full 4×4 transformation matrix encoding position, rotation, and scale. Use THREE.Matrix4 to compose transforms, then write with setMatrixAt(index, matrix).

    // `positions` and `rotations` hold 3 floats per instance, `scales` holds 1;
    // all three are supplied by the caller.
    const dummy = new THREE.Object3D();
    for (let i = 0; i < COUNT; i++) {
      // Position
      dummy.position.set(
        positions[i * 3],
        positions[i * 3 + 1],
        positions[i * 3 + 2]
      );
      // Rotation (Euler angles here; dummy.quaternion works too)
      dummy.rotation.set(
        rotations[i * 3],
        rotations[i * 3 + 1],
        rotations[i * 3 + 2]
      );
      // Scale (uniform here; non-uniform also works)
      dummy.scale.set(scales[i], scales[i], scales[i]);
      // Compose the TRS matrix
      dummy.updateMatrix();
      mesh.setMatrixAt(i, dummy.matrix);
    }
    mesh.instanceMatrix.needsUpdate = true;
Performance tip: Object3D.updateMatrix() composes position/rotation/scale into a Matrix4. This is convenient, but it calls Matrix4.compose() internally and setMatrixAt() then copies the result into the buffer. For maximum throughput, write directly to mesh.instanceMatrix.array (a Float32Array of 16 floats per instance), which skips the intermediate matrix and the extra copy.
    // Direct buffer write (fastest path)
    const arr = mesh.instanceMatrix.array;
    for (let i = 0; i < COUNT; i++) {
      const off = i * 16;
      // Translation for this instance, read from the caller's `positions` array
      const x = positions[i * 3];
      const y = positions[i * 3 + 1];
      const z = positions[i * 3 + 2];
      // Identity rotation and scale, translation only.
      // Column-major 4x4: each row below is one column of the matrix.
      arr[off + 0] = 1;  arr[off + 1] = 0;  arr[off + 2] = 0;  arr[off + 3] = 0;
      arr[off + 4] = 0;  arr[off + 5] = 1;  arr[off + 6] = 0;  arr[off + 7] = 0;
      arr[off + 8] = 0;  arr[off + 9] = 0;  arr[off + 10] = 1; arr[off + 11] = 0;
      arr[off + 12] = x; arr[off + 13] = y; arr[off + 14] = z; arr[off + 15] = 1;
    }
    mesh.instanceMatrix.needsUpdate = true;
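If instances also need rotation and scale, the same direct-write approach extends to a full TRS matrix. The sketch below is a plain-JavaScript helper (the name `composeInto` is hypothetical) that writes a column-major matrix from a position, a unit quaternion, and a scale, using the standard quaternion-to-matrix formula. It is meant as an illustration of the layout, not as a drop-in replacement for any Three.js method.

```javascript
// Write a column-major TRS matrix into `arr` at offset `off`.
// (px,py,pz) = position, (qx,qy,qz,qw) = unit quaternion, (sx,sy,sz) = scale.
// `composeInto` is a hypothetical helper for illustration.
function composeInto(arr, off, px, py, pz, qx, qy, qz, qw, sx, sy, sz) {
  const x2 = qx + qx, y2 = qy + qy, z2 = qz + qz;
  const xx = qx * x2, xy = qx * y2, xz = qx * z2;
  const yy = qy * y2, yz = qy * z2, zz = qz * z2;
  const wx = qw * x2, wy = qw * y2, wz = qw * z2;

  // Column 0: rotated X axis, scaled by sx
  arr[off + 0] = (1 - (yy + zz)) * sx;
  arr[off + 1] = (xy + wz) * sx;
  arr[off + 2] = (xz - wy) * sx;
  arr[off + 3] = 0;
  // Column 1: rotated Y axis, scaled by sy
  arr[off + 4] = (xy - wz) * sy;
  arr[off + 5] = (1 - (xx + zz)) * sy;
  arr[off + 6] = (yz + wx) * sy;
  arr[off + 7] = 0;
  // Column 2: rotated Z axis, scaled by sz
  arr[off + 8] = (xz + wy) * sz;
  arr[off + 9] = (yz - wx) * sz;
  arr[off + 10] = (1 - (xx + yy)) * sz;
  arr[off + 11] = 0;
  // Column 3: translation
  arr[off + 12] = px;
  arr[off + 13] = py;
  arr[off + 14] = pz;
  arr[off + 15] = 1;
}

// Usage: two instances in a stand-in buffer (16 floats each).
const buffer = new Float32Array(2 * 16);
composeInto(buffer, 0, 1, 2, 3, 0, 0, 0, 1, 2, 2, 2); // no rotation, scale 2
composeInto(buffer, 16, 0, 0, 0, 0, 0, Math.SQRT1_2, Math.SQRT1_2, 1, 1, 1); // 90° about Z
```

In a real scene the target would be mesh.instanceMatrix.array, followed by setting needsUpdate as in the loop above.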

4. Per-Instance Colour

InstancedMesh supports per-instance colour out of the box via setColorAt() (Three.js r138+). Under the hood, this creates an InstancedBufferAttribute on the instanceColor property.

    const color = new THREE.Color();
    for (let i = 0; i < COUNT; i++) {
      color.setHSL(i / COUNT, 0.8, 0.5);
      mesh.setColorAt(i, color);
    }
    // Mark colour buffer for upload
    mesh.instanceColor.needsUpdate = true;

Custom Per-Instance Attributes

Need more data per instance (opacity, size, temperature, etc.)? Add custom InstancedBufferAttributes to the geometry:

    // Add per-instance "temperature" float attribute
    const temps = new Float32Array(COUNT);
    for (let i = 0; i < COUNT; i++) temps[i] = Math.random();
    geometry.setAttribute(
      'aTemperature',
      new THREE.InstancedBufferAttribute(temps, 1)
    );

    // In custom ShaderMaterial vertex shader:
    //   attribute float aTemperature;
    //   varying float vTemp;
    //   void main() { vTemp = aTemperature; ... }

5. GPU Picking for Instanced Objects

Raycasting against an InstancedMesh works with THREE.Raycaster (since r126), but is CPU-bound for large counts. For massive scenes, use GPU picking: render each instance with a unique ID encoded as RGB colour, then read the pixel under the mouse.

    // 1. Give every instance an ID attribute, then create the pick target and material
    const ids = new Float32Array(COUNT);
    for (let i = 0; i < COUNT; i++) ids[i] = i;
    geometry.setAttribute('aInstanceId', new THREE.InstancedBufferAttribute(ids, 1));

    const pickTarget = new THREE.WebGLRenderTarget(1, 1);
    const pickMaterial = new THREE.ShaderMaterial({
      vertexShader: `
        attribute float aInstanceId;
        varying float vInstanceId;
        void main() {
          vInstanceId = aInstanceId;
          gl_Position = projectionMatrix * modelViewMatrix * instanceMatrix * vec4(position, 1.0);
        }
      `,
      fragmentShader: `
        varying float vInstanceId;
        void main() {
          // Encode (instance ID + 1) as RGB24 so that 0, the cleared background,
          // means "no hit". Supports up to 16 777 215 instances.
          float id = floor(vInstanceId + 0.5) + 1.0;
          gl_FragColor = vec4(
            mod(id, 256.0) / 255.0,
            mod(floor(id / 256.0), 256.0) / 255.0,
            floor(id / 65536.0) / 255.0,
            1.0
          );
        }
      `
    });

    // 2. Render 1×1 pixel at mouse position.
    // Assumes the pick pass clears to black and the instanced mesh is the only
    // object drawn; hide other objects or use a dedicated pick scene otherwise.
    const pixel = new Uint8Array(4);
    function gpuPick(mouseNDC, camera, renderer, scene) {
      const w = renderer.domElement.width;
      const h = renderer.domElement.height;
      // Set camera to render only the pixel under the mouse (NDC → pixel coordinates)
      camera.setViewOffset(
        w, h,
        Math.floor((mouseNDC.x * 0.5 + 0.5) * w),
        Math.floor((-mouseNDC.y * 0.5 + 0.5) * h),
        1, 1
      );
      // Swap material, render to pick target
      mesh.material = pickMaterial;
      renderer.setRenderTarget(pickTarget);
      renderer.render(scene, camera);
      // Read pixel
      renderer.readRenderTargetPixels(pickTarget, 0, 0, 1, 1, pixel);
      // Restore
      mesh.material = material;
      renderer.setRenderTarget(null);
      camera.clearViewOffset();
      // Decode ID; returns -1 when only the background was hit
      return pixel[0] + pixel[1] * 256 + pixel[2] * 65536 - 1;
    }
WebGL2 alternative: Use gl_InstanceID directly in the vertex shader (GLSL 300 es) with a flat varying to pass it to the fragment shader — avoids the need for a custom instance attribute. Requires THREE.WebGLRenderer with WebGL2 context.
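The RGB packing and the mouse-to-pixel mapping are easy to get subtly wrong, so it can help to check them on the CPU first. The sketch below mirrors the shader arithmetic in plain JavaScript. All three function names are hypothetical, and storing index + 1 so that 0 can mean "nothing hit" is one possible convention rather than a requirement.

```javascript
// Pack an instance index into three 8-bit channels, low byte in red.
// Storing index + 1 keeps 0 free to mean "background" (a convention of this sketch).
function encodeId(index) {
  const id = index + 1;
  return [id & 0xff, (id >> 8) & 0xff, (id >> 16) & 0xff];
}

// Inverse of encodeId; returns -1 for a pure black (background) pixel.
function decodeId(pixel) {
  return pixel[0] + pixel[1] * 256 + pixel[2] * 65536 - 1;
}

// Convert normalised device coordinates (-1..1, y up) to pixel coordinates
// (origin top-left, y down), clamped to the last valid pixel.
function ndcToPixel(ndcX, ndcY, width, height) {
  return [
    Math.min(width - 1, Math.floor((ndcX * 0.5 + 0.5) * width)),
    Math.min(height - 1, Math.floor((-ndcY * 0.5 + 0.5) * height)),
  ];
}

// Usage: round-trip a few indices and map the screen centre.
for (const index of [0, 255, 70_000, 16_777_214]) {
  console.log(index, encodeId(index), decodeId(encodeId(index)));
}
console.log(ndcToPixel(0, 0, 800, 600));
```

With this convention the largest index that fits in 24 bits is 16 777 214.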

6. Manual Frustum Culling

Three.js frustum-culls the entire InstancedMesh against a single bounding sphere, so if any part of that volume is in view, all instances are submitted. For large worlds, this wastes GPU vertex processing on off-screen instances.

Solution: implement per-instance frustum culling by dynamically adjusting the visible instance count and reordering the matrix array to put visible instances at the front.

    const frustum = new THREE.Frustum();
    const projScreenMatrix = new THREE.Matrix4();
    const sphere = new THREE.Sphere();

    // allMatrices: Float32Array(totalCount * 16) holding the untouched source transforms.
    // instanceRadius: bounding-sphere radius of one instance, after scaling.
    // The mesh must have been created with a capacity of totalCount.
    function cullInstances(camera) {
      projScreenMatrix.multiplyMatrices(
        camera.projectionMatrix,
        camera.matrixWorldInverse
      );
      frustum.setFromProjectionMatrix(projScreenMatrix);

      const dst = mesh.instanceMatrix.array;
      let visible = 0;
      for (let i = 0; i < totalCount; i++) {
        const src = i * 16;
        // Extract position from the stored matrix (elements 12..14, column-major)
        sphere.center.set(allMatrices[src + 12], allMatrices[src + 13], allMatrices[src + 14]);
        sphere.radius = instanceRadius;
        if (frustum.intersectsSphere(sphere)) {
          // Copy this instance's full matrix to the next visible slot
          const out = visible * 16;
          for (let k = 0; k < 16; k++) dst[out + k] = allMatrices[src + k];
          visible++;
        }
      }
      mesh.count = visible; // only draw visible instances!
      mesh.instanceMatrix.needsUpdate = true;
    }
Performance trade-off: Per-instance frustum culling is CPU-bound — iterating 1M instances per frame in JS takes ~3 ms. For dense particle scenes where most instances are always visible, the overhead isn't worth it. For sparse open worlds (trees in a forest), it can halve the GPU workload.
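The core of the technique, a sphere-versus-plane test followed by compaction, does not depend on Three.js. The following is a minimal standalone sketch: the frustum is represented as six planes `[nx, ny, nz, d]` with normals pointing inward, which is an assumption of this example, and the function names are hypothetical.

```javascript
// A sphere is outside the frustum if it lies fully behind any one plane.
// Each plane is [nx, ny, nz, d] with the normal pointing into the frustum.
function sphereInFrustum(planes, cx, cy, cz, radius) {
  for (const [nx, ny, nz, d] of planes) {
    if (nx * cx + ny * cy + nz * cz + d < -radius) return false;
  }
  return true;
}

// Copy the matrices of visible instances to the front of `dst`.
// `src` and `dst` hold 16 floats per instance, column-major.
// Returns the number of visible instances.
function compactVisible(src, dst, count, radius, planes) {
  let visible = 0;
  for (let i = 0; i < count; i++) {
    const s = i * 16;
    if (sphereInFrustum(planes, src[s + 12], src[s + 13], src[s + 14], radius)) {
      const out = visible * 16;
      for (let k = 0; k < 16; k++) dst[out + k] = src[s + k];
      visible++;
    }
  }
  return visible;
}

// Usage: a box-shaped "frustum" spanning -10..10 on every axis,
// and three instances placed along the X axis.
const boxPlanes = [
  [1, 0, 0, 10], [-1, 0, 0, 10],
  [0, 1, 0, 10], [0, -1, 0, 10],
  [0, 0, 1, 10], [0, 0, -1, 10],
];
const xs = [0, 50, 10.5];
const source = new Float32Array(xs.length * 16);
xs.forEach((x, i) => {
  const o = i * 16;
  source[o] = source[o + 5] = source[o + 10] = source[o + 15] = 1; // identity
  source[o + 12] = x; // translation along X
});
const target = new Float32Array(source.length);
const visibleCount = compactVisible(source, target, xs.length, 1, boxPlanes);
console.log('visible instances:', visibleCount);
```

Keeping a separate source buffer means the original order is never lost, at the cost of doubling the matrix memory. Any per-instance colour or ID buffers would need the same compaction to stay aligned.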

7. Animating Instances

To animate per-instance transforms each frame, update the matrix array and set needsUpdate = true. For maximum performance, write directly to the Float32Array:

    const arr = mesh.instanceMatrix.array;
    const STRIDE = 16;

    function animate(timeMs) {
      // requestAnimationFrame passes a timestamp in milliseconds; work in seconds
      const t = timeMs * 0.001;
      for (let i = 0; i < COUNT; i++) {
        const off = i * STRIDE;
        // Simple orbit animation: x = R·cos(ωt + φ_i), z = R·sin(ωt + φ_i)
        const phase = i * 0.001;
        const R = 50 + i * 0.005;
        const omega = 0.5 / (R * 0.02);
        // Translation column (indices 12, 13, 14 in column-major);
        // rotation and scale in the other elements are left untouched
        arr[off + 12] = R * Math.cos(omega * t + phase);
        arr[off + 13] = Math.sin(t * 0.3 + phase) * 20;
        arr[off + 14] = R * Math.sin(omega * t + phase);
      }
      mesh.instanceMatrix.needsUpdate = true;
      renderer.render(scene, camera);
      requestAnimationFrame(animate);
    }
    requestAnimationFrame(animate);

GPU-Side Animation (Vertex Shader)

For the best performance, move animation to the vertex shader using custom attributes (phase, speed, radius) and a time uniform. This avoids any CPU matrix updates:

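    // Vertex shader for THREE.ShaderMaterial (which maps `attribute` to `in` under WebGL2).
    // Placement comes entirely from aPhase and aRadius; instanceMatrix is not applied here.
    uniform float uTime;      // seconds
    attribute float aPhase;   // per-instance starting angle
    attribute float aRadius;  // per-instance orbit radius

    void main() {
      float angle = uTime * 0.5 + aPhase;
      vec3 offset = vec3(aRadius * cos(angle), 0.0, aRadius * sin(angle));
      vec4 worldPos = modelMatrix * vec4(position + offset, 1.0);
      gl_Position = projectionMatrix * viewMatrix * worldPos;
    }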
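The shader needs its per-instance inputs filled once on the CPU. Here is a minimal sketch of building those arrays, together with a CPU reference for the shader's offset that can be used to sanity-check results. The function names are hypothetical, and the phase and radius formulas are simply borrowed from the CPU animation loop earlier in this section.

```javascript
// Build the per-instance inputs for the orbit shader.
// Formulas match the CPU animation loop: phase = i * 0.001, radius = 50 + i * 0.005.
function makeOrbitAttributes(count) {
  const phases = new Float32Array(count);
  const radii = new Float32Array(count);
  for (let i = 0; i < count; i++) {
    phases[i] = i * 0.001;
    radii[i] = 50 + i * 0.005;
  }
  return { phases, radii };
}

// CPU reference for the offset the vertex shader computes.
function orbitOffset(timeSeconds, phase, radius) {
  const angle = timeSeconds * 0.5 + phase;
  return [radius * Math.cos(angle), 0, radius * Math.sin(angle)];
}

// Usage: build attributes for 1 000 instances and inspect one offset.
const { phases, radii } = makeOrbitAttributes(1000);
console.log(orbitOffset(2.0, phases[10], radii[10]));
```

Each array would then be wrapped in a THREE.InstancedBufferAttribute with item size 1 and attached as aPhase and aRadius, in the same way as the aTemperature attribute in section 4, with uTime updated once per frame.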

8. Benchmark: 10K → 100K → 1M

Tested on a mid-range desktop (RTX 3060, Ryzen 5600X) with a simple sphere geometry (32 segments) and MeshStandardMaterial, at 1080p:

Instance count                       | Draw calls | Frame time (ms) | FPS
10 000 (individual Mesh)             | 10 000     | 42              | ~24
10 000 (InstancedMesh)               | 1          | 2.1             | >400
100 000 (InstancedMesh)              | 1          | 6.8             | ~147
500 000 (InstancedMesh)              | 1          | 12.4            | ~80
1 000 000 (InstancedMesh, low-poly)  | 1          | 15.2            | ~66
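The FPS column follows directly from the frame times, and the same conversion shows how much headroom each configuration leaves against a 60 FPS target. A small sketch, with hypothetical helper names:

```javascript
// Convert a frame time in milliseconds to frames per second.
function fpsFromFrameTime(ms) {
  return 1000 / ms;
}

// Remaining time in a 60 FPS frame (negative means the target is missed).
function headroomMs(frameTimeMs, targetFps = 60) {
  return 1000 / targetFps - frameTimeMs;
}

// Frame times taken from the benchmark table.
for (const ms of [42, 2.1, 6.8, 12.4, 15.2]) {
  console.log(`${ms} ms: ${fpsFromFrameTime(ms).toFixed(1)} FPS, headroom ${headroomMs(ms).toFixed(1)} ms`);
}
```

The 1M-instance row fits the 60 FPS budget with little to spare, so anything else the frame has to do (culling, picking, post-processing) comes out of a margin of roughly 1.5 ms.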

Optimisation Checklist

GPU vertex throughput rule of thumb:

Total vertices = instance_count × geometry_vertices (3 per triangle, counted unindexed)
Safe budget: < 20M processed vertices per frame @ 60 FPS on a mid-range GPU

1M instances × 12 tris (box) = 36M vertices → needs LOD or simpler geometry
1M instances × 2 tris (billboard quad) = 6M vertices → fine @ 60 FPS
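The rule of thumb can be turned into a quick pre-flight check. This is a sketch only: the 20M budget is the ballpark quoted above for a mid-range GPU, vertices are counted as three per triangle, and `vertexLoad` is a hypothetical helper.

```javascript
// Ballpark per-frame vertex budget for 60 FPS on a mid-range GPU (from the rule of thumb).
const VERTEX_BUDGET = 20_000_000;

// Estimate the vertex load of an instanced draw, counting 3 vertices per triangle.
function vertexLoad(instances, trianglesPerInstance) {
  const vertices = instances * trianglesPerInstance * 3;
  return { vertices, withinBudget: vertices < VERTEX_BUDGET };
}

console.log(vertexLoad(1_000_000, 12)); // 12-triangle mesh
console.log(vertexLoad(1_000_000, 2));  // billboard quad
```

Indexed geometry with a warm vertex cache may process fewer vertices than this count suggests, so the estimate is best read as an upper bound.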