Simulation Performance Optimization
Browser simulations often start fast—then slow down as objects
multiply. This tutorial covers practical techniques to sustain 60 fps:
InstancedMesh for thousands of objects, TypedArrays to
reduce GC pressure, spatial hashing for broad-phase collision, Web
Workers to move physics off the main thread, and GPU-side tricks.
Know Your Bottleneck First
Before optimising, measure. The Chrome DevTools Performance panel and the three.js Stats helper help you tell whether you are CPU-bound or GPU-bound:
import Stats from 'https://cdn.jsdelivr.net/npm/three@0.168/examples/jsm/libs/stats.module.js';
const stats = new Stats();
stats.showPanel(0); // 0 = fps, 1 = ms, 2 = mb
document.body.appendChild(stats.dom);
// In animate():
stats.begin();
renderer.render(scene, camera);
stats.end();
| Symptom | Likely bottleneck | Fix direction |
|---|---|---|
| JS frame time > 10 ms | CPU — JS physics / updates | TypedArrays, WASM, Web Workers |
| Many draw calls (>500) | CPU — render thread batching | InstancedMesh, merge geometry |
| High GPU usage, low CPU | GPU — fragment complexity | Reduce shader cost, lower resolution |
| GC spikes (ms chart jitters) | JS heap allocations | Object pooling, TypedArrays |
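To see which row of the table applies, it can help to time the JavaScript side per section. The sketch below is one possible approach and not part of three.js: createSectionTimer and its method names are made up for illustration, and the clock is injectable so it can be replaced in tests. Timing renderer.render() this way only captures the CPU-side submission cost; GPU time still has to be read from the DevTools Performance panel.

```javascript
// Hypothetical helper (not a three.js API): accumulates wall-clock time per named section.
function createSectionTimer(now = () => performance.now()) {
  const totals = new Map(); // section name -> total milliseconds
  let frames = 0;
  return {
    // Runs fn() and adds its duration to the named section.
    measure(name, fn) {
      const start = now();
      fn();
      totals.set(name, (totals.get(name) ?? 0) + (now() - start));
    },
    // Call once at the end of every frame.
    endFrame() {
      frames++;
    },
    // Average milliseconds per frame for each section seen so far.
    averages() {
      const out = {};
      for (const [name, total] of totals) out[name] = total / Math.max(frames, 1);
      return out;
    },
  };
}
```

Inside animate(), the physics update and the render call would each be wrapped in timer.measure(), followed by timer.endFrame(). If the physics average alone approaches 10 ms, the CPU rows of the table are the place to start.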
InstancedMesh — One Draw Call per Object Type
Rendering 10,000 separate Mesh objects = 10,000 draw
calls. InstancedMesh renders all of them in
one draw call:
const COUNT = 10_000;
const geo = new THREE.SphereGeometry(0.1, 8, 8);
const mat = new THREE.MeshStandardMaterial({ color: 0x2299ff });
const iMesh = new THREE.InstancedMesh(geo, mat, COUNT);
scene.add(iMesh);
const dummy = new THREE.Object3D();
const positions = new Float32Array(COUNT * 3); // x,y,z per instance
// Initialize positions
for (let i = 0; i < COUNT; i++) {
  positions[i*3+0] = (Math.random() - 0.5) * 20;
  positions[i*3+1] = (Math.random() - 0.5) * 20;
  positions[i*3+2] = (Math.random() - 0.5) * 20;
}
function updateInstances() {
  for (let i = 0; i < COUNT; i++) {
    dummy.position.set(positions[i*3], positions[i*3+1], positions[i*3+2]);
    dummy.updateMatrix();
    iMesh.setMatrixAt(i, dummy.matrix);
  }
  iMesh.instanceMatrix.needsUpdate = true; // ← required!
}
Only set needsUpdate = true when the data actually
changed. Unnecessary updates cause a CPU→GPU buffer upload every
frame.
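One way to enforce this is a small dirty flag around the instance writes. The following is a minimal sketch under assumptions: createInstanceWriter is a hypothetical helper, and iMesh only needs to expose setMatrixAt() and instanceMatrix, which is the part of the InstancedMesh interface already used above.

```javascript
// Hypothetical wrapper: tracks whether any instance matrix was written this frame.
function createInstanceWriter(iMesh) {
  let dirty = false;
  return {
    // Write one instance matrix and remember that the buffer changed.
    set(index, matrix) {
      iMesh.setMatrixAt(index, matrix);
      dirty = true;
    },
    // Call once per frame, just before renderer.render().
    // Returns true if an upload was requested.
    flush() {
      if (!dirty) return false;
      iMesh.instanceMatrix.needsUpdate = true;
      dirty = false;
      return true;
    },
  };
}
```

On frames where nothing moved, flush() returns false and no buffer upload is requested.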
TypedArrays — Eliminate GC Pauses
Regular JavaScript objects ({ x, y, z }) per particle =
many heap allocations = GC pauses at the worst moment. Use
Float32Array (or Float64Array) instead: the memory is contiguous,
allocated once up front, and the update loop creates no garbage for the collector to trace:
// ❌ Object array: GC pressure
const particles = Array.from({ length: 10000 }, () => ({
  x: Math.random(), y: Math.random(), z: Math.random(),
  vx: 0, vy: 0, vz: 0,
}));
// ✅ TypedArray SoA (Structure of Arrays): no per-particle allocations, cache-friendly
const N = 10_000;
const px = new Float32Array(N), py = new Float32Array(N), pz = new Float32Array(N);
const vx = new Float32Array(N), vy = new Float32Array(N), vz = new Float32Array(N);
const dt = 1 / 60; // seconds per step (placeholder value; use your own timestep)
// Physics update: no object allocation
for (let i = 0; i < N; i++) {
  vy[i] -= 9.8 * dt; // gravity; add other forces to vx/vy/vz here
  px[i] += vx[i] * dt;
  py[i] += vy[i] * dt;
  pz[i] += vz[i] * dt;
}
SoA (Structure of Arrays) is more cache-friendly than AoS (Array of Structures) because the loop processes one property of all particles at a time, which matches how CPU cache lines work.
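The table earlier also lists object pooling as a fix for GC spikes. With SoA data, a pool can simply hand out slot indices into the existing arrays, so spawning and removing particles allocates nothing. This is a sketch under assumptions: SlotPool is a hypothetical name, and it does not guard against releasing the same slot twice.

```javascript
// Hypothetical index pool: slots are indices into px/py/pz-style typed arrays.
class SlotPool {
  constructor(capacity) {
    this.capacity = capacity;
    // Stack of free slot indices, filled so that slot 0 is handed out first.
    this.free = new Uint32Array(capacity);
    for (let i = 0; i < capacity; i++) this.free[i] = capacity - 1 - i;
    this.freeCount = capacity;
  }
  // Returns a free slot index, or -1 when the pool is exhausted.
  acquire() {
    return this.freeCount > 0 ? this.free[--this.freeCount] : -1;
  }
  // Returns a slot to the pool. The caller must not release a slot twice.
  release(slot) {
    this.free[this.freeCount++] = slot;
  }
}
```

A spawned particle would call acquire(), write its state at that index, and call release() with the same index when it dies.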
Spatial Hashing for O(1) Neighbour Lookup
Naïve collision detection is O(N²) — every particle checks every other. Spatial hashing reduces it to ~O(N) for uniform particle distributions:
class SpatialHash {
  constructor(cellSize) {
    this.cellSize = cellSize;
    this.table = new Map();
  }
  _key(x, y, z) {
    const cx = Math.floor(x / this.cellSize);
    const cy = Math.floor(y / this.cellSize);
    const cz = Math.floor(z / this.cellSize);
    return `${cx},${cy},${cz}`;
  }
  clear() { this.table.clear(); }
  insert(i, x, y, z) {
    const k = this._key(x, y, z);
    if (!this.table.has(k)) this.table.set(k, []);
    this.table.get(k).push(i);
  }
  query(x, y, z) {
    // Returns indices of particles in same and adjacent cells
    const result = [];
    const cx = Math.floor(x / this.cellSize);
    const cy = Math.floor(y / this.cellSize);
    const cz = Math.floor(z / this.cellSize);
    for (let dx = -1; dx <= 1; dx++)
      for (let dy = -1; dy <= 1; dy++)
        for (let dz = -1; dz <= 1; dz++) {
          const k = `${cx+dx},${cy+dy},${cz+dz}`;
          const cell = this.table.get(k);
          if (cell) result.push(...cell);
        }
    return result;
  }
}
// Usage: cell size = 2× particle radius
const hash = new SpatialHash(0.2);
// Each frame: 1) clear 2) insert all 3) query neighbours
hash.clear();
for (let i = 0; i < N; i++) hash.insert(i, px[i], py[i], pz[i]);
for (let i = 0; i < N; i++) {
  const neighbours = hash.query(px[i], py[i], pz[i]);
  // check collision only with neighbours (small set)
}
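The narrow-phase check left as a comment in the loop above could look like the following. It is a minimal sketch assuming equal-radius spheres; countOverlaps is a hypothetical helper, and a real simulation would resolve the contact rather than just count it.

```javascript
// Counts overlapping pairs that involve particle i, visiting each pair once (only j > i).
// `neighbours` is the index list returned by the broad phase.
function countOverlaps(i, neighbours, px, py, pz, radius) {
  const minDist = 2 * radius;
  const minDistSq = minDist * minDist; // compare squared distances, no sqrt needed
  let hits = 0;
  for (let n = 0; n < neighbours.length; n++) {
    const j = neighbours[n];
    if (j <= i) continue; // skip self and pairs already handled from j's side
    const dx = px[j] - px[i];
    const dy = py[j] - py[i];
    const dz = pz[j] - pz[i];
    if (dx * dx + dy * dy + dz * dz < minDistSq) hits++;
  }
  return hits;
}
```

Skipping j <= i matters because the query result includes particle i itself and every pair would otherwise be reported twice.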
Fixed Timestep + Web Worker Physics
Physics should run at a fixed step (e.g. 1/120 s) independent of rendering frame rate. Offload to a Web Worker so physics doesn't block the render:
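// main.js
// Worker for physics
const worker = new Worker('./physics-worker.js');
// Shared memory: both threads see the same N * 3 floats (4 bytes each)
const posBuffer = new SharedArrayBuffer(N * 3 * 4);
const positions = new Float32Array(posBuffer);
worker.postMessage({ type: 'init', buffer: posBuffer, count: N });
// Same loop as updateInstances() earlier, reading from the shared array
function updateInstancesFromBuffer(pos) {
  for (let i = 0; i < N; i++) {
    dummy.position.set(pos[i*3], pos[i*3+1], pos[i*3+2]);
    dummy.updateMatrix();
    iMesh.setMatrixAt(i, dummy.matrix);
  }
  iMesh.instanceMatrix.needsUpdate = true;
}
// Render loop: just reads the shared buffer
function animate() {
  requestAnimationFrame(animate);
  // Only reads; no locking needed for loose sync
  updateInstancesFromBuffer(positions);
  renderer.render(scene, camera);
}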
// physics-worker.js
self.onmessage = ({ data }) => {
  if (data.type !== 'init') return;
  const pos = new Float32Array(data.buffer);
  const vel = new Float32Array(data.count * 3);
  const dt = 1 / 120;
  setInterval(() => {
    for (let i = 0; i < data.count; i++) {
      vel[i*3+1] -= 9.8 * dt;
      pos[i*3+0] += vel[i*3+0] * dt;
      pos[i*3+1] += vel[i*3+1] * dt;
      pos[i*3+2] += vel[i*3+2] * dt;
      if (pos[i*3+1] < 0) { pos[i*3+1] = 0; vel[i*3+1] *= -0.7; }
    }
  }, dt * 1000);
};
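The worker above advances one step per timer tick, so the simulation speed follows the timer's accuracy. A common alternative is an accumulator that converts measured elapsed time into whole fixed steps. The sketch below is one possible version under assumptions: createFixedStepper and MAX_SUBSTEPS are hypothetical names, and dropping the backlog when the cap is hit is a design choice, not the only option.

```javascript
const PHYSICS_STEP = 1 / 120; // seconds, the example step from the text
const MAX_SUBSTEPS = 8;       // assumption: a cap in the 5-10 range

// Returns advance(frameDt): runs stepFn(step) zero or more times and reports how many.
function createFixedStepper(stepFn, step = PHYSICS_STEP, maxSubsteps = MAX_SUBSTEPS) {
  let accumulator = 0;
  return function advance(frameDt) {
    accumulator += frameDt;
    let steps = 0;
    while (accumulator >= step && steps < maxSubsteps) {
      stepFn(step);
      accumulator -= step;
      steps++;
    }
    // If the cap was hit with time still owed, drop the backlog so a slow
    // frame cannot snowball into ever more sub-steps.
    if (accumulator >= step) accumulator = 0;
    return steps;
  };
}
```

The caller passes the real time elapsed since the previous call, in seconds, for example from performance.now(). Leftover time below one step stays in the accumulator and carries into the next call.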
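SharedArrayBuffer is only available on cross-origin isolated pages,
which means serving the document with
Cross-Origin-Opener-Policy: same-origin and
Cross-Origin-Embedder-Policy: require-corp headers. For
simpler cases use postMessage with a regular
ArrayBuffer passed as a transferable: ownership moves to the receiver without a copy, and the sender's view becomes detached.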
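The transferable route can be sketched as follows. The helper names sendPositions and receivePositions and the 'positions' message type are assumptions for illustration; target can be a Worker, a MessagePort, or anything else with the standard postMessage(message, transferList) signature.

```javascript
// Hypothetical helper: hands a Float32Array's memory to another thread without copying.
function sendPositions(target, positions) {
  const buffer = positions.buffer;
  // The second argument lists the buffers whose ownership moves to the receiver.
  target.postMessage({ type: 'positions', buffer }, [buffer]);
  // From here on `positions` is detached (length 0) on the sending side.
}

// Hypothetical helper for the receiving side: wrap the transferred memory in a new view.
function receivePositions(message) {
  return new Float32Array(message.buffer);
}
```

Because the sender loses access after the transfer, the two threads usually pass the buffer back and forth: the worker fills it and posts it to the main thread, which renders from it and returns it for the next step.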
Frustum Culling and LOD
Three.js does frustum culling automatically for individual
Mesh objects. For instanced meshes, culling is
per-draw-call (either all or nothing). Manual per-instance culling:
const frustum = new THREE.Frustum();
const projScreen = new THREE.Matrix4();
const sphere = new THREE.Sphere(); // scratch object, reused across calls
function cullInstances(iMesh, positions, count) {
  projScreen.multiplyMatrices(camera.projectionMatrix, camera.matrixWorldInverse);
  frustum.setFromProjectionMatrix(projScreen);
  sphere.radius = 0.1; // bounding radius
  let visibleCount = 0;
  for (let i = 0; i < count; i++) {
    sphere.center.set(positions[i*3], positions[i*3+1], positions[i*3+2]);
    if (frustum.intersectsSphere(sphere)) {
      // Rebuild the matrix from positions instead of copying it from slot i:
      // compacting stored matrices in place would scramble them for the next frame
      dummy.position.copy(sphere.center);
      dummy.updateMatrix();
      iMesh.setMatrixAt(visibleCount++, dummy.matrix);
    }
  }
  iMesh.count = visibleCount; // only render visible instances
  iMesh.instanceMatrix.needsUpdate = true;
}
For complex scenes, Three.js has built-in LOD (Level of
Detail) — swap to simpler geometry when far from the camera:
const lod = new THREE.LOD();
lod.addLevel(new THREE.Mesh(highPoly, mat), 0); // <10 units away
lod.addLevel(new THREE.Mesh(midPoly, mat), 10); // 10–50 units
lod.addLevel(new THREE.Mesh(lowPoly, mat), 50); // >50 units
scene.add(lod);
Things to Avoid
- GPU readback: renderer.readRenderTargetPixels() stalls the GPU pipeline. Avoid it in the render loop.
- Allocating inside the loop: new THREE.Vector3(), new Array(), and spread operators ([...arr]) all allocate. Pre-allocate and reuse.
- Calling geometry.computeBoundingBox() every frame: it recomputes from all vertices. Cache the result or set the box manually.
- Unbounded physics sub-steps: if the frame takes 200 ms, a 1/120 s step means 24 sub-steps, which makes the next frame even slower. Cap sub-steps at 5–10.
- Dynamic shadow maps with many casters: shadow map rendering traverses ALL shadow-casting objects each frame. Use baked lightmaps for static geometry.
- Too many unique materials: each unique material can mean another shader program, and switching programs is expensive. Batch: same material across similar objects.
- Setting needsUpdate = true on static geometry: each upload re-sends the GPU buffer. Only set it when data actually changed.