There is a surprising fact buried in the Web Audio API specification: the AudioContext
maintains its own monotonic clock — audioCtx.currentTime — that advances independently
of performance.now() and is not affected by garbage-collection pauses or
requestAnimationFrame jank. This makes it far more reliable than the JavaScript event
loop for anything that needs precise timing. A drum machine, a Doppler-shifted collision sound, a
resonance tone that tracks a pendulum's frequency — all become trivially accurate once you schedule
audio events against currentTime rather than firing them from inside your render loop.
The API has been available in all major browsers since around 2014, yet it remains underused in
interactive science tools. This post covers the four techniques that matter most when adding audio
to a physics simulation: creating and resuming an AudioContext, scheduling
OscillatorNode events at frame-precise times, reading frequency data from an
AnalyserNode to draw a live spectrogram, and offloading custom DSP to an
AudioWorkletProcessor so the main thread stays free for rendering.
1. AudioContext, Oscillators, and Precise Scheduling
The first thing to understand is that browsers block audio until a user gesture occurs.
A context created with new AudioContext() before a click or keydown starts in the
"suspended" state and produces no sound. The fix is to call audioCtx.resume() from inside a
user-interaction handler and only then begin scheduling nodes.
// Create once, outside the render loop
const audioCtx = new AudioContext({ latencyHint: 'interactive' });
document.addEventListener('pointerdown', () => {
  if (audioCtx.state === 'suspended') audioCtx.resume();
}, { once: true });
Once the context is running, you can schedule an OscillatorNode to play at a precise
future time using the audio clock rather than a wall-clock timeout. This is the key
insight: instead of calling setTimeout(() => playTone(), delay) you say
"start this oscillator at audio-clock time T". The difference is that the audio engine
queues the event in its own real-time thread, bypassing JavaScript scheduling latency entirely.
function scheduleTone(frequency, startTime, duration) {
  const osc = audioCtx.createOscillator();
  const gain = audioCtx.createGain();
  osc.type = 'sine';
  osc.frequency.setValueAtTime(frequency, startTime);
  // Amplitude envelope: quick attack, exponential decay
  // (exponential ramps cannot start from or reach 0, hence 0.001)
  gain.gain.setValueAtTime(0.001, startTime);
  gain.gain.exponentialRampToValueAtTime(0.6, startTime + 0.01);
  gain.gain.exponentialRampToValueAtTime(0.001, startTime + duration);
  osc.connect(gain);
  gain.connect(audioCtx.destination);
  osc.start(startTime);
  osc.stop(startTime + duration + 0.05); // auto-cleanup
}
// Inside your simulation step — call this when a collision occurs:
// "play the tone 30ms from now on the audio clock"
scheduleTone(440, audioCtx.currentTime + 0.03, 0.3);
For a pendulum simulation you might map the instantaneous angular velocity
ω to pitch using the relationship
f = f₀ · (1 + k·|ω|), where f₀ is an audible base pitch of your choosing and k
is a perceptual scaling constant. The pendulum's own natural frequency, (1/2π)·√(g/L), is
only about 0.5 Hz for a 1 m pendulum, far below the roughly 20 Hz lower limit of hearing, so
it cannot serve as f₀ directly. The mapping gives a tone that rises as the pendulum swings
fastest through the bottom of its arc and falls back toward f₀ as it slows near the top.
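The mapping can be sketched as two small pure functions. The names pendulumNaturalFrequency and pitchFromAngularVelocity, the 220 Hz base pitch, and k = 0.5 are illustrative assumptions, not values from any particular simulation.

```javascript
// Natural frequency of a simple pendulum in Hz (small-angle approximation).
function pendulumNaturalFrequency(length, g = 9.81) {
  return Math.sqrt(g / length) / (2 * Math.PI);
}

// Map angular velocity (rad/s) to pitch: f = f0 * (1 + k * |omega|).
// basePitch and k are assumed defaults; tune them by ear.
function pitchFromAngularVelocity(omega, basePitch = 220, k = 0.5) {
  return basePitch * (1 + k * Math.abs(omega));
}

// Possible use inside a simulation step (browser only):
//   osc.frequency.setValueAtTime(pitchFromAngularVelocity(omega), audioCtx.currentTime);
```

Because the mapping uses |ω|, the pitch is the same on the left and right swings; drop the absolute value if you want the direction to be audible.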
Rule: Never trigger sounds by calling start() with no time argument from inside
requestAnimationFrame. The rAF callback can be delayed by 5–20 ms on a busy
main thread, and that jitter goes straight into the audio. Instead, look ahead by a fixed
amount, typically 100–200 ms, and schedule all events that fall within that window. This is
the lookahead-scheduler pattern widely used by Web Audio sequencers.
Lookahead Scheduler Pattern
const LOOKAHEAD = 0.15;        // schedule 150 ms ahead
const SCHEDULE_INTERVAL = 25;  // check every 25 ms
let noteInterval = 0.5;        // seconds between notes; placeholder, update from the physics state
let nextNoteTime = audioCtx.currentTime;
function scheduler() {
  while (nextNoteTime < audioCtx.currentTime + LOOKAHEAD) {
    scheduleTone(440, nextNoteTime, 0.1); // scheduleTone() as defined earlier
    nextNoteTime += noteInterval;         // advance by physics-derived interval
  }
  setTimeout(scheduler, SCHEDULE_INTERVAL);
}
scheduler();
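Note that the scheduler uses setTimeout deliberately, not requestAnimationFrame.
The audio scheduler does not need to be tied to display refresh; it just needs to wake up often enough
to keep the lookahead buffer full. rAF callbacks stop entirely while the tab is hidden, which would
leave gaps in the audio. Be aware that browsers also throttle setTimeout in background tabs (commonly
to about once per second), so uninterrupted background playback needs a longer lookahead or a timer
running in a Web Worker, which is typically throttled less.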
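The window logic is easier to reason about, and to unit-test, when it is pulled out of the timer callback. The following is a minimal sketch; dueNoteTimes is a hypothetical helper name, and the clock value is passed in so it runs without an AudioContext.

```javascript
// Return every note time that falls inside [nextNoteTime, now + lookahead),
// plus the first time that falls outside it (the new nextNoteTime).
function dueNoteTimes(nextNoteTime, now, lookahead, interval) {
  if (!(interval > 0)) throw new RangeError('interval must be positive');
  const times = [];
  let t = nextNoteTime;
  while (t < now + lookahead) {
    times.push(t);
    t += interval;
  }
  return { times, nextNoteTime: t };
}

// Possible use inside the scheduler (browser only):
//   const due = dueNoteTimes(nextNoteTime, audioCtx.currentTime, LOOKAHEAD, noteInterval);
//   due.times.forEach((t) => scheduleTone(440, t, 0.1));
//   nextNoteTime = due.nextNoteTime;
```

The guard on interval matters in a physics-driven setting: if the simulation ever produces a zero or negative interval, the loop would otherwise never terminate.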
2. AnalyserNode for Real-Time Spectrogram Visualisation
An AnalyserNode sits in the audio graph and exposes frequency-domain data computed via
a short-time Fourier transform (STFT). You choose the FFT size — a power of two from 32 to 32768 —
which determines the trade-off between frequency resolution and time resolution. A larger FFT gives
finer frequency bins but averages over a longer time window.
The frequency resolution per bin is Δf = sampleRate / fftSize. At a sample rate of
44100 Hz and an FFT size of 2048, each bin is approximately
44100 / 2048 ≈ 21.5 Hz wide. (By default the context runs at the output device's preferred
rate, commonly 44100 or 48000 Hz, so read audioCtx.sampleRate rather than assuming a value.)
The frequencyBinCount property equals
fftSize / 2, giving you 1024 usable bins from 0 Hz up to the Nyquist frequency.
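The bin arithmetic is worth wrapping in helpers so the sample rate is never hard-coded. This is a sketch; binWidthHz, binStartHz, and nearestBin are hypothetical names, and in a page you would pass audioCtx.sampleRate and analyser.fftSize.

```javascript
// Width of one FFT bin in Hz.
function binWidthHz(sampleRate, fftSize) {
  return sampleRate / fftSize;
}

// Frequency at which bin `index` starts.
function binStartHz(index, sampleRate, fftSize) {
  return index * binWidthHz(sampleRate, fftSize);
}

// Index of the bin closest to a given frequency, clamped to the usable range
// 0 .. fftSize / 2 - 1 (that is, 0 .. frequencyBinCount - 1).
function nearestBin(frequencyHz, sampleRate, fftSize) {
  const index = Math.round(frequencyHz / binWidthHz(sampleRate, fftSize));
  return Math.min(Math.max(index, 0), fftSize / 2 - 1);
}
```

For example, a 440 Hz tone lands in bin 20 at 44100 Hz but in bin 19 at 48000 Hz, which is why reading the rate from the context matters when you label a frequency axis.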
const analyser = audioCtx.createAnalyser();
analyser.fftSize = 2048;
analyser.smoothingTimeConstant = 0.8; // 0 = no smoothing, 1 = maximum
// Wire: source → analyser → destination
sourceNode.connect(analyser);
analyser.connect(audioCtx.destination);
// Allocate the buffer once, outside the rAF loop:
const dataArray = new Uint8Array(analyser.frequencyBinCount);
function drawSpectrogram(canvas) {
  const ctx = canvas.getContext('2d');
  analyser.getByteFrequencyData(dataArray); // fills dataArray in-place
  const barWidth = canvas.width / dataArray.length;
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  dataArray.forEach((value, i) => {
    const barHeight = (value / 255) * canvas.height;
    const hue = (i / dataArray.length) * 280; // red (low bins) → violet (high bins)
    ctx.fillStyle = `hsl(${hue}, 80%, 50%)`;
    ctx.fillRect(i * barWidth, canvas.height - barHeight, barWidth, barHeight);
  });
}
function renderLoop() {
  drawSpectrogram(spectrogramCanvas);
  requestAnimationFrame(renderLoop);
}
renderLoop();
For a rolling waterfall spectrogram (the classic display used in seismology and radio monitoring), keep an off-screen canvas as the history buffer. On each frame, redraw that canvas onto itself shifted one pixel to the left, then paint the newest spectrum as a single-pixel-wide column at the right edge. The canvas itself accumulates the time history, so you never need to store the full array of past frames.
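A minimal sketch of one waterfall step follows. It assumes ctx is the 2D context of the off-screen history canvas and bins is the Uint8Array filled by getByteFrequencyData; pushWaterfallColumn is a hypothetical name, and the greyscale colour map is an arbitrary choice.

```javascript
// Scroll the history one pixel left, then paint the newest spectrum
// as a one-pixel-wide column at the right edge.
function pushWaterfallColumn(ctx, bins) {
  const { width, height } = ctx.canvas;
  // Redraw the canvas onto itself, shifted left by one pixel.
  ctx.drawImage(ctx.canvas, -1, 0);
  const x = width - 1;
  for (let y = 0; y < height; y++) {
    // Row 0 is the top of the canvas, so it maps to the highest bins.
    const bin = Math.floor(((height - 1 - y) / height) * bins.length);
    const lightness = Math.round((bins[bin] / 255) * 100);
    ctx.fillStyle = `hsl(0, 0%, ${lightness}%)`; // greyscale: louder is brighter
    ctx.fillRect(x, y, 1, 1);
  }
}

// Possible use in the render loop (browser only):
//   analyser.getByteFrequencyData(dataArray);
//   pushWaterfallColumn(historyCanvas.getContext('2d'), dataArray);
//   visibleCtx.drawImage(historyCanvas, 0, 0);
```

When the canvas has fewer rows than there are bins, this samples one bin per row rather than averaging; averaging neighbouring bins would give a smoother picture at slightly higher cost.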
3. AudioWorkletProcessor for Custom DSP
The ScriptProcessorNode API, which used to be the only way to write custom DSP in
the browser, is deprecated because it ran on the main thread and caused audio dropouts whenever
the page was doing rendering work. Its replacement, AudioWorkletProcessor, runs on the
dedicated audio rendering thread and communicates with the main thread via a
MessagePort (the node's port property) or, on cross-origin-isolated pages, a SharedArrayBuffer.
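// physics-processor.js (loaded as a Worklet module)
class PhysicsProcessor extends AudioWorkletProcessor {
  constructor(options) {
    super();
    const opts = options.processorOptions ?? {};
    this._mass = opts.mass ?? 1.0;
    this._stiffness = opts.stiffness ?? 100.0;
    this._damping = opts.damping ?? 0.5;
    // Start displaced: with x = 0 and v = 0 the unforced oscillator stays silent.
    // initialDisplacement is an illustrative option name, not part of the Web Audio API.
    this._x = opts.initialDisplacement ?? 0.5; // displacement
    this._v = 0; // velocity
    // Accept run-time changes, e.g. node.port.postMessage({ damping: 5 })
    this.port.onmessage = (event) => {
      if (typeof event.data.damping === 'number') this._damping = event.data.damping;
    };
  }
  process(inputs, outputs) {
    const output = outputs[0][0];
    const dt = 1 / sampleRate; // per-sample timestep; sampleRate is a worklet-scope global
    for (let i = 0; i < output.length; i++) {
      // Damped harmonic oscillator: mẍ + cẋ + kx = 0 (semi-implicit Euler step)
      const accel = (-this._stiffness * this._x - this._damping * this._v) / this._mass;
      this._v += accel * dt;
      this._x += this._v * dt;
      output[i] = this._x; // displacement → audio sample
    }
    return true; // keep processor alive
  }
}
registerProcessor('physics-processor', PhysicsProcessor);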
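// Main thread: register and instantiate
await audioCtx.audioWorklet.addModule('/audio/physics-processor.js');
const physicsNode = new AudioWorkletNode(audioCtx, 'physics-processor', {
  processorOptions: { mass: 1.0, stiffness: (2 * Math.PI * 440) ** 2, damping: 2.0 }
});
physicsNode.connect(audioCtx.destination);
Once it is given a nonzero initial displacement or velocity, the damped harmonic oscillator above
produces a decaying sinusoid at the natural frequency
ω₀ = √(k/m) radians per second, which corresponds to f₀ = ω₀ / 2π Hz.
Hitting a target pitch f₀ therefore requires k = m·(2π·f₀)².
With k = (2π·440)² ≈ 7.64 × 10⁶ and m = 1, the tone is very close to concert A (440 Hz); light
damping and the numerical integration shift it only slightly. A stiffness of 440² = 193,600 would
instead give ω₀ = 440 rad/s, which is about 70 Hz.
Adjust damping at run time with physicsNode.port.postMessage, which the processor must handle in a
port.onmessage callback, to simulate different
materials: a tight spring, a loose pendulum, a bouncing rubber ball.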
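The relationship between stiffness and pitch can be checked offline, without any audio hardware, by running the same per-sample update in plain JavaScript and counting zero crossings. This is a sketch under stated assumptions: stiffnessForPitch and estimatePitch are hypothetical helper names, and the initial displacement of 0.5 is an arbitrary nonzero starting condition.

```javascript
// Stiffness needed for a target pitch: k = m * (2 * pi * f)^2.
function stiffnessForPitch(frequencyHz, mass = 1) {
  return mass * (2 * Math.PI * frequencyHz) ** 2;
}

// Integrate the damped oscillator with a semi-implicit Euler step and
// estimate its pitch by counting upward zero crossings per second.
function estimatePitch({ mass = 1, stiffness, damping = 0, sampleRate = 44100, seconds = 1 }) {
  const dt = 1 / sampleRate;
  const steps = Math.round(seconds * sampleRate);
  let x = 0.5; // assumed initial displacement; zero would stay silent
  let v = 0;
  let crossings = 0;
  for (let i = 0; i < steps; i++) {
    const accel = (-stiffness * x - damping * v) / mass;
    v += accel * dt;
    const next = x + v * dt;
    if (x < 0 && next >= 0) crossings++;
    x = next;
  }
  return crossings / seconds;
}

// Example comparison (values are estimates, accurate to about 1 Hz over one second):
//   estimatePitch({ stiffness: stiffnessForPitch(440), damping: 2 })  // close to 440
//   estimatePitch({ stiffness: 440 ** 2, damping: 2 })                // close to 70
```

Counting crossings over one second limits the resolution to roughly 1 Hz; a longer run or an FFT would give a finer estimate, though heavy damping limits how long the signal stays measurable.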
Gotcha: audioCtx.audioWorklet.addModule() requires the file to be
served over HTTPS or localhost. It also cannot import ES modules via bare specifiers —
use full relative URLs. If you see DOMException: The worklet cannot be added, check
your Content Security Policy: you need script-src to allow 'self'.
4. Syncing Animation to the Audio Clock
The hardest part of combining audio and visuals is keeping them in sync. The audio clock and the
display clock drift apart because they run in separate hardware domains. The canonical solution is
to treat the audio clock as the source of truth and compute the visual state as a function of
audioCtx.currentTime — not as an accumulated sum of rAF deltas.
// Instead of accumulating dt:
// t += dt; // drifts over time
// Derive simulation time directly from the audio clock:
function renderLoop() {
  const audioTime = audioCtx.currentTime;
  // Pendulum angle as a function of audio clock time
  const omega0 = Math.sqrt(9.81 / pendulumLength);
  const angle = initialAngle * Math.cos(omega0 * audioTime) * Math.exp(-damping * audioTime);
  drawPendulum(angle);
  requestAnimationFrame(renderLoop);
}
For simulations where the audio is synthesised from the physics state (rather than the physics
being driven by audio time), you can instead log a pair
(audioCtx.currentTime, simulationState) at each physics step and use linear
interpolation in the render loop to find the visual state that corresponds to the current
audio clock value. This tolerates a variable-rate physics loop without accumulating error.
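One way to sketch the logging-and-interpolation approach is a small bounded log. StateLog is a hypothetical class, and it assumes the state is a single number (for example an angle); a vector state would interpolate each component the same way.

```javascript
// Bounded log of (audioTime, state) pairs written by the physics loop.
class StateLog {
  constructor(capacity = 256) {
    this.capacity = capacity;
    this.samples = []; // stays sorted because pushes arrive in time order
  }

  push(audioTime, state) {
    this.samples.push({ audioTime, state });
    if (this.samples.length > this.capacity) this.samples.shift();
  }

  // Linearly interpolate the state at `audioTime`, clamping at both ends.
  sample(audioTime) {
    const s = this.samples;
    if (s.length === 0) return undefined;
    if (audioTime <= s[0].audioTime) return s[0].state;
    for (let i = 1; i < s.length; i++) {
      if (audioTime <= s[i].audioTime) {
        const a = s[i - 1];
        const b = s[i];
        const span = b.audioTime - a.audioTime;
        const w = span > 0 ? (audioTime - a.audioTime) / span : 1;
        return a.state + w * (b.state - a.state);
      }
    }
    return s[s.length - 1].state;
  }
}

// Possible use (browser only):
//   physics step:  log.push(audioCtx.currentTime, angle);
//   render loop:   drawPendulum(log.sample(audioCtx.currentTime));
```

The linear scan is fine for a few hundred entries; a binary search would be the natural replacement if the log grows large.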
One practical detail: if you are using a GainNode to fade audio in and out, always
use the setValueAtTime / linearRampToValueAtTime family of methods
rather than setting gain.gain.value directly. Direct value assignment causes a
discontinuity (a click) because it bypasses the parameter timeline. The scheduled ramp is
rendered sample-by-sample and produces a smooth, click-free transition.
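A small fade helper along these lines keeps every gain change on the parameter timeline. fadeTo is a hypothetical name; it uses only the standard AudioParam methods cancelScheduledValues, setValueAtTime, and linearRampToValueAtTime, and the 50 ms default is an assumption to tune by ear.

```javascript
// Ramp an AudioParam (e.g. gainNode.gain) to `target` over `seconds`,
// starting at audio-clock time `now`.
function fadeTo(param, target, now, seconds = 0.05) {
  param.cancelScheduledValues(now);           // drop automation scheduled at or after now
  param.setValueAtTime(param.value, now);     // anchor the ramp at the current value
  param.linearRampToValueAtTime(target, now + seconds);
}

// Possible use (browser only):
//   fadeTo(gainNode.gain, 0, audioCtx.currentTime, 0.1); // fade out over 100 ms
```

The explicit setValueAtTime anchor matters because a ramp starts from the previous event on the timeline; without it the ramp may begin from a stale time and value.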
Try It Yourself
These simulations on mysimulator.uk are good starting points for experimenting with audio coupling. Each one exposes oscillatory or wave-like behaviour that maps naturally onto the techniques above:
Closing Thought
The Web Audio API is, at its core, a directed graph of signal-processing nodes, not entirely unlike the scene graphs you already use in Three.js or Babylon.js. Once you start thinking of audio as a stream of floating-point samples flowing through a graph, the connections to physics become obvious: displacement is a sample value, frequency is the number of oscillations per second, damping is the real part of a pole in the complex frequency plane. The mathematics is identical; only the output medium differs.
The most elegant simulations are the ones where the audio is not a sound effect added on top but an honest projection of the same underlying equations onto the perceptual channel of hearing. Get the physics right and the sound follows for free.