Linear Algebra for 3D Graphics: Vectors, Matrices, Quaternions

3D Simulations

📐 Mathematics · Computer Graphics

📅 April 2026⏱ 16 min🟡 Beginner–Intermediate · Last updated: 22 June 2026

Every triangle rendered on screen has passed through at least four matrix multiplications before it reaches your eyes. The Model matrix places it in the world, the View matrix moves the world in front of the camera, the Projection matrix applies perspective, and the Viewport matrix maps to pixel coordinates. Understanding the linear algebra behind these transforms is the foundation of all 3D programming — from shader code to physics engines to machine learning with 3D data.

Written by MySimulator Team · Reviewed by MySimulator Editorial Review

1. Vectors: Points, Directions, and Operations

A vector in 3D graphics is a tuple (x, y, z) that represents either a point in space or a direction with magnitude. The distinction matters: translating a direction does not change the direction, but translating a point changes its position.

Essential Operations

// Dot product: scalar result a \cdot b = aₓbₓ + a_y b_y + a_z b_z = |a||b|cosθ Uses: - Angle between vectors: θ = arccos(a\cdotb / |a||b|) - Project a onto b: proj = (a\cdotb̂)\cdotb̂ where b̂ = b/|b| - Diffuse lighting: N\cdotL = cos(angle to light) for Lambert shading - Backface culling: if (N\cdotV < 0) skip triangle // Cross product: vector result (perpendicular to both inputs) a \times b = (a_y b_z - a_z b_y, a_z bₓ - aₓ b_z, aₓ b_y - a_y bₓ) |a \times b| = |a||b|sinθ Uses: - Triangle normal: N = (B-A) \times (C-A), then normalize - Orthonormal basis construction (e.g. camera frame) - Torque: τ = r \times F in physics simulation - Handedness check: (a\timesb)\cdotc determines coordinate system orientation // Normalization: make unit length â = a / |a| where |a| = sqrt(aₓ² + a_y² + a_z²) Note: normalizing a zero vector is undefined (NaN guard needed)

Convention clash: OpenGL and WebGL use column vectors with right-multiplication (M·v). DirectX and HLSL use row vectors with left-multiplication (v·M). The same geometric transform requires the transpose of the matrix between the two conventions. GLM (C++) uses OpenGL convention; glMatrix (JavaScript) also uses column vectors with column-major storage (same array layout as OpenGL).

2. Matrices and Transform Types

A 3×3 matrix encodes any linear transform: rotation, scale, shear. A 4×4 matrix encodes any affine transform in 3D: linear transforms plus translation (via homogeneous coordinates). GPU shaders and all 3D APIs work with 4×4 matrices.

Rotation Matrices (3×3)

// Rotation about X axis by angle θ Rₓ(θ) = | 1 0 0 | | 0 cosθ -sinθ | | 0 sinθ cosθ | // Rotation about Y axis by angle θ R_y(θ) = | cosθ 0 sinθ | | 0 1 0 | | -sinθ 0 cosθ | // Rotation about Z axis by angle θ R_z(θ) = | cosθ -sinθ 0 | | sinθ cosθ 0 | | 0 0 1 | // Rotation about arbitrary unit axis â by angle θ (Rodrigues) R = cosθ\cdotI + (1-cosθ)\cdotâaᵀ + sinθ\cdot[â]\times where [â]\times is the skew-symmetric cross product matrix of â

Scale & Translation

// Non-uniform scale (4\times4 homogeneous) S(sₓ, s_y, s_z) = | sₓ 0 0 0 | | 0 s_y 0 0 | | 0 0 s_z 0 | | 0 0 0 1 | // Translation (4\times4 homogeneous) T(tₓ, t_y, t_z) = | 1 0 0 tₓ | | 0 1 0 t_y | | 0 0 1 t_z | | 0 0 0 1 |

Combining Transforms: Order Matters

Matrix multiplication is associative but not commutative: Rotate then Translate ≠ Translate then Rotate. The convention is that the rightmost matrix is applied first to the column vector:

// Apply: first scale, then rotate, then translate M = T \cdot R \cdot S v_world = M \cdot v_local = T \cdot (R \cdot (S \cdot v_local)) ↑ last applied ↑ second ↑ first applied Mistake: M = S \cdot R \cdot T would translate first, then rotate, placing the object in a completely different location.

3. Homogeneous Coordinates

3D translation cannot be represented as a 3×3 matrix multiplication — translation is not a linear transform (it does not preserve the origin). The elegant solution: lift 3D coordinates to 4D by adding a fourth component w.

// 3D point \to homogeneous: (x, y, z) \to (x, y, z, 1) // 3D vector/direction: (x, y, z) \to (x, y, z, 0) The w component distinguishes points (w=1) from directions (w=0): T \cdot (x, y, z, 1)ᵀ = (x+tₓ, y+t_y, z+t_z, 1) ✓ translated T \cdot (x, y, z, 0)ᵀ = (x, y, z, 0) ✓ direction unchanged! // Converting back to 3D (perspective divide): (X, Y, Z, W) \to (X/W, Y/W, Z/W) for W \neq 0 After the perspective projection matrix, W \neq 1 in general. The GPU performs the perspective divide automatically at the end of the vertex shader stage (clip space \to NDC).

Homogeneous point at infinity: (x, y, z, 0) represents the direction (x, y, z) as a "point at infinity" — the limit of (tx, ty, tz, 1) as t→∞. Parallel lines in projective geometry meet at a point at infinity, which is why railroad tracks converge at the vanishing point in a perspective rendering.

4. The MVP Transform Pipeline

A vertex travels through three coordinate spaces before becoming a pixel. Each space is connected by a matrix multiply:

// Full vertex transform chain v_clip = P \cdot V \cdot M \cdot v_local M = Model matrix (local space \to world space) V = View matrix (world space \to camera/eye space) P = Projection matrix (eye space \to clip space) // Model matrix: position + orient + scale the object M = T_world \cdot R_world \cdot S_local // View matrix: inverse of the camera's world transform // Camera is at pos, looking at target, up vector is up: V = lookAt(pos, target, up) = inverse(T_pos \cdot R_camera) // Perspective projection matrix (OpenGL convention, NDC: [-1,1]) P = | 2n/(r-l) 0 (r+l)/(r-l) 0 | | 0 2n/(t-b) (t+b)/(t-b) 0 | | 0 0 -(f+n)/(f-n) -2fn/(f-n)| | 0 0 -1 0 | Simplified for symmetric frustum (l=-r, b=-t): fovY = vertical FOV in radians aspect = width/height P[0][0] = 1/(aspect\cdottan(fovY/2)) P[1][1] = 1/tan(fovY/2) P[2][2] = -(f+n)/(f-n) P[2][3] = -2fn/(f-n) P[3][2] = -1

After P·V·M·v, the result is in clip space. The GPU clips triangles to the clip cube, then performs the perspective divide (÷w) to reach NDC (Normalized Device Coordinates, x∈[−1,1], y∈[−1,1], z∈[−1,1] for OpenGL). Finally, the viewport transform maps NDC to pixel coordinates:

// Viewport transform (NDC \to window pixels) x_px = (x_ndc + 1) / 2 \cdot viewport_width + x_origin y_px = (y_ndc + 1) / 2 \cdot viewport_height + y_origin // Note: y may be flipped depending on API (D3D flips, OpenGL doesn't)

5. Quaternions for Rotation

Euler angles (yaw, pitch, roll) are intuitive but suffer from gimbal lock: when two rotation axes become aligned, a degree of freedom is lost and smooth interpolation breaks. Rotation matrices work but use 9 floats for only 3 degrees of freedom and accumulate floating-point drift over many composing operations. Quaternions solve both problems using just 4 floats.

A quaternion q = (w, x, y, z) = w + xi + yj + zk, where i, j, k are imaginary units satisfying i² = j² = k² = ijk = −1. A unit quaternion (|q|=1) represents a 3D rotation:

// Quaternion encoding rotation by angle θ about unit axis â=(ax,ay,az): q = (cos(θ/2), ax\cdotsin(θ/2), ay\cdotsin(θ/2), az\cdotsin(θ/2)) = (w, x, y, z) // Applying rotation to vector v (as pure quaternion v=(0,v)): v_rotated = q \cdot v \cdot q* where q* = (w, -x, -y, -z) (conjugate = inverse for unit q) // Equivalent matrix form (row-major used here for display): R = | 1-2(y²+z²) 2(xy-wz) 2(xz+wy) | | 2(xy+wz) 1-2(x²+z²) 2(yz-wx) | | 2(xz-wy) 2(yz+wx) 1-2(x²+y²)| // Composing two rotations: quaternion multiplication q_total = q₂ \cdot q₁ (apply q₁ first, then q₂) w = w₂w₁ - x₂x₁ - y₂y₁ - z₂z₁ x = w₂x₁ + x₂w₁ + y₂z₁ - z₂y₁ y = w₂y₁ - x₂z₁ + y₂w₁ + z₂x₁ z = w₂z₁ + x₂y₁ - y₂x₁ + z₂w₁

6. Interpolation and Smooth Rotation

For animation, we need smooth transitions between orientations. Linearly interpolating matrix entries or Euler angles gives non-constant angular velocity and poor results. Quaternion interpolation is the standard solution.

LERP (Linear Interpolation)

// Naïve quaternion lerp — fast but not constant speed q(t) = normalize((1-t)\cdotq₁ + t\cdotq₂) for t \in [0, 1] Use when: speed difference is imperceptible (t range small), or performance is critical (LERP is one add + normalize)

SLERP (Spherical Linear Interpolation)

// Slerp — constant angular velocity along great circle on unit 4-sphere Ω = arccos(q₁ \cdot q₂) (dot product of quaternion components) q(t) = [sin((1-t)Ω) / sinΩ]\cdotq₁ + [sin(tΩ) / sinΩ]\cdotq₂ Edge cases: - If q₁\cdotq₂ < 0: negate q₂ before slerp (short path vs long path) - If Ω \approx 0: fall back to LERP (quaternions nearly identical) Slerp is the gold standard for animation blending — it preserves constant angular velocity and shortest path.

Gimbal lock explained: With Euler angles (e.g. Z·Y·X order), if the Y rotation is ±90°, the X and Z axes become parallel — they both rotate around the same world axis. One degree of freedom is lost. No combination of X and Z rotation can then produce rotation around the lost axis. Quaternions have no such degeneracy because they operate directly on the 4D unit sphere.

Squad and Higher-Order Splines

For smooth paths through multiple keyframe orientations, SLERP alone creates C¹ discontinuities at keyframes (velocity is continuous but angular acceleration is not). Squad (Spherical Cubic Interpolation) extends Bézier curves to quaternion space, producing C² smooth rotation paths used in game camera systems and cinematic animation.

7. Normal Transforms and the Normal Matrix

Surface normals are direction vectors and transform differently from points under non-uniform scaling. If you transform a normal by the same matrix M as the vertices, non-uniform scale will distort the normals and break lighting calculations.

// Wrong: N_world = M \cdot N_local (breaks for non-uniform scale) // Correct: N_world = transpose(inverse(M)) \cdot N_local The "normal matrix" = transpose(inverse(M)) Often precomputed on the CPU and passed as a uniform: uniform mat3 normalMatrix; // in GLSL For orthogonal matrices (rotation only, no scale): inverse(M) = transpose(M) \to normal matrix = M itself (so pure rotation transforms normals correctly with the same matrix) Proof sketch: If t is a tangent (Mv is a transformed point), normal N must stay perpendicular: N\cdott = 0. After transform: (AN)\cdot(Mt) = 0 \to (AN)ᵀ(Mt) = 0 \to Nᵀ(AᵀM)t = 0 \to AᵀM = I \to A = (M⁻¹)ᵀ = transpose(inverse(M)) ✓

8. Linear Algebra on the GPU: GLSL & WGSL

GPU shader languages have first-class linear algebra types. The GPU executes vector and matrix operations in single instructions across all lanes of a SIMD processor — what takes a loop in CPU code is a single op on the GPU.

GLSL (WebGL / OpenGL)

// GLSL built-in types vec2, vec3, vec4 // float vectors mat2, mat3, mat4 // float matrices (column-major) ivec3, uvec3, bvec3 // integer, unsigned, bool vectors // Common operations (all built-in, hardware-accelerated) float d = dot(a, b); // dot product vec3 n = cross(a, b); // cross product vec3 u = normalize(v); // normalise (unit length) float l = length(v); // Euclidean length vec3 r = reflect(d, n); // reflection vector vec3 rf = refract(d, n, eta); // refraction vector float f = mix(a, b, t); // linear interpolation (LERP) vec3 c = clamp(v, 0.0, 1.0); // component-wise clamp // Matrix-vector multiplication (column vector post-multiplication) vec4 v_clip = projMatrix * viewMatrix * modelMatrix * v_local; // Accessing columns (mat4 is array of 4 column vec4s) vec4 col0 = myMat[0]; // first column float m23 = myMat[2][3];// row 3, col 2 (column-major: col 2, row 3)

WGSL (WebGPU)

// WGSL (WebGPU Shading Language) — similar but stricter var v: vec3f = vec3f(1.0, 0.0, 0.0); var m: mat4x4f = /* ... */; let d: f32 = dot(a, b); let n: vec3f = normalize(cross(b - a, c - a)); // Matrix multiply: same notation let clip_pos: vec4f = uniforms.mvp * vertex_pos; // Key difference from GLSL: explicit types everywhere, // no implicit conversions (1 instead of 1.0 is a compile error)

Performance note: On modern GPUs, a mat4×vec4 multiply is a single hardware instruction executing 16 multiply-accumulate operations simultaneously. Transforming a batch of one million vertices takes a few milliseconds on a mid-range GPU — the bottleneck is usually memory bandwidth (fetching the vertex data), not the arithmetic.

Practical Tips for 3D Programming

Always normalise after SLERP/quaternion operations — floating-point drift accumulates and eventually makes |q| ≠ 1, causing subtle scaling artifacts in rotation matrices.
Use glMatrix (JavaScript) or GLM (C++) for production work — hand-coded matrix math has subtle column/row major bugs that cost hours to debug.
Watch the multiplication order carefully — glMatrix functions like mat4.multiply(out, a, b) compute a·b, meaning b is applied first. Many bugs come from reversing this order.
Avoid inverting matrices on the GPU — single-precision float inverse of a nearly-singular matrix loses accuracy fast. Compute inverses on the CPU and upload as uniforms.
Depth precision — the perspective projection compresses depth non-linearly, putting most precision near the near plane. Use a logarithmic depth buffer or reversed-Z (near=1, far=0) for large-scale scenes.

Sources

Matrix (mathematics) – Wikipedia