★★☆ Medium

📊 PCA & SVD Visualiser

Generate a 2D point cloud, drag individual points, or adjust the covariance sliders. The simulation computes the covariance matrix, finds its eigenvectors in real time, and draws the principal component axes scaled by standard deviation. A bar chart shows explained variance per component.

Dataset

N points: 80

σ_x spread: 2.0

σ_y spread: 0.6

Rotation θ: 35°

Noise: 0.2

Display

Show k components: 2

Toggles: show projections, 2σ ellipse

Explained Variance (bar chart): PC1 80%, PC2 20%

Covariance Matrix (live readout): 2×2 matrix entries, λ₁ (PC1), λ₂ (PC2), condition number

C = XᵀX / (n−1)
C v = λ v (eigen)
EVR = λₖ / Σλⱼ
Click the canvas to add points

About PCA & SVD

Principal Component Analysis

PCA finds the orthogonal directions of maximum variance in a dataset. Given n data points in d dimensions, stored as the rows of an n×d matrix X with the column means subtracted (zero-mean), the covariance matrix C = XᵀX/(n−1) is real, symmetric and positive semi-definite, so it is orthogonally diagonalisable: C = Q Λ Qᵀ, where the columns of Q are orthonormal eigenvectors (the principal components) and Λ = diag(λ₁, …, λ_d) with λ₁ ≥ λ₂ ≥ … ≥ λ_d ≥ 0. Projecting the data onto the first k columns of Q reduces the dimensionality to k while retaining as much variance as possible.
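The recipe in this paragraph can be sketched in a few lines of NumPy. This is a minimal illustration, not the simulation's own code: the function name `pca_eig` and the synthetic 3D dataset are assumptions made for the example.

```python
import numpy as np

def pca_eig(X, k):
    """PCA via eigendecomposition of the sample covariance matrix.

    X is an (n, d) array with one data point per row; k is the number of
    components to keep. Returns (eigenvalues, components, projected data).
    """
    Xc = X - X.mean(axis=0)              # centre each column (zero mean)
    C = Xc.T @ Xc / (Xc.shape[0] - 1)    # C = XᵀX / (n − 1)
    lam, Q = np.linalg.eigh(C)           # eigh: symmetric input, ascending eigenvalues
    order = np.argsort(lam)[::-1]        # reorder so that λ₁ ≥ λ₂ ≥ …
    lam, Q = lam[order], Q[:, order]
    Z = Xc @ Q[:, :k]                    # coordinates along the first k components
    return lam, Q[:, :k], Z

# Illustrative data only: three independent axes with very different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.2])

lam, Qk, Z = pca_eig(X, k=2)
evr = lam / lam.sum()                    # explained variance ratio per component
```

The sample variance of each column of `Z` equals the corresponding eigenvalue, which is one way to confirm that the projection keeps exactly the variance that λ₁ and λ₂ account for.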

Singular Value Decomposition

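Any m×n matrix M can be factorised as M = U Σ Vᵀ, where U (m×m) and V (n×n) are orthogonal and Σ (m×n) is zero except for non-negative diagonal entries σ₁ ≥ σ₂ ≥ … (the singular values). PCA of X is equivalent to the SVD of the centred data matrix X: the principal components are the columns of V, and the singular values relate to the eigenvalues of C by σₖ = √((n−1) λₖ). Equivalently, the singular values of X/√(n−1) are √λₖ. Computing the SVD of X is numerically more stable than forming C and eigendecomposing it, because forming XᵀX squares the condition number.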
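As a numerical check of the link between the two decompositions, the following sketch computes the principal components both ways and compares them. The random correlated dataset is an assumption made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4)) @ rng.normal(size=(4, 4))  # illustrative correlated data
Xc = X - X.mean(axis=0)                   # centred data matrix
n = Xc.shape[0]

# Route 1: eigendecomposition of the covariance matrix.
C = Xc.T @ Xc / (n - 1)
lam, Q = np.linalg.eigh(C)
lam, Q = lam[::-1], Q[:, ::-1]            # descending order

# Route 2: thin SVD of the centred data matrix itself.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

lam_from_svd = s**2 / (n - 1)             # λₖ = σₖ² / (n − 1)

# Each direction is defined only up to sign, so compare |qₖ · vₖ| with 1.
alignment = np.abs(np.sum(Q * Vt.T, axis=0))
```

Here `lam` and `lam_from_svd` should agree to rounding error, and every entry of `alignment` should be close to 1. The sign ambiguity is worth remembering when comparing components produced by different libraries.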

Geometric Interpretation

In 2D, PCA finds the major axis of the data ellipse (PC1, pointing along the direction of maximum spread) and the minor axis (PC2, perpendicular to it). The semi-axes are proportional to the standard deviations √λ₁ and √λ₂ along the two components, so their ratio is √(λ₁/λ₂). The explained variance ratio EVR₁ = λ₁/(λ₁+λ₂) is the fraction of total variance captured by PC1. For an isotropic (circular) cloud with equal variance in every direction, EVR₁ = 50%; for a highly elongated ellipse, EVR₁ → 100%.
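In 2D the eigenvalues and the PC1 direction have a closed form, so the geometry can be worked out without a linear algebra library. The sketch below is an illustration under stated assumptions: it uses the default dataset settings listed earlier (σ_x = 2.0, σ_y = 0.6, θ = 35°), ignores the noise term, and works with the exact population covariance rather than a sampled point cloud. The helper name `pca_2x2` is hypothetical.

```python
import math

def pca_2x2(cxx, cxy, cyy):
    """Closed-form eigendecomposition of a symmetric 2x2 covariance matrix.

    Returns (lam1, lam2, angle) with lam1 >= lam2 and angle the direction of
    PC1 in radians, measured from the x-axis.
    """
    mean = 0.5 * (cxx + cyy)
    half_diff = 0.5 * (cxx - cyy)
    radius = math.hypot(half_diff, cxy)
    lam1, lam2 = mean + radius, mean - radius
    angle = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    return lam1, lam2, angle

# Assumed inputs: the default slider values, with the noise term ignored.
sx, sy, theta = 2.0, 0.6, math.radians(35.0)
c, s = math.cos(theta), math.sin(theta)

# Covariance of an axis-aligned cloud rotated by theta: C = R diag(sx², sy²) Rᵀ.
cxx = c * c * sx**2 + s * s * sy**2
cyy = s * s * sx**2 + c * c * sy**2
cxy = c * s * (sx**2 - sy**2)

lam1, lam2, angle = pca_2x2(cxx, cxy, cyy)
evr1 = lam1 / (lam1 + lam2)          # fraction of total variance along PC1
axis_ratio = math.sqrt(lam1 / lam2)  # ratio of the ellipse semi-axes
```

Under these assumptions the rotation leaves the eigenvalues at σ_x² = 4 and σ_y² = 0.36, the PC1 direction comes back as 35°, EVR₁ = 4/4.36 ≈ 91.7%, and the semi-axis ratio is 2.0/0.6 ≈ 3.33. A finite, noisy sample will scatter around these values.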

Applications

PCA and SVD are workhorses of data science and physics: image compression (truncated SVD), face recognition (eigenfaces), genomics (population structure), finance (factor models), spectroscopy (multivariate curve resolution), and dimensionality reduction before clustering. In quantum mechanics, the Schmidt decomposition of a bipartite pure state is the SVD of its coefficient matrix, and the squared Schmidt coefficients are the eigenvalues of each subsystem's reduced density matrix.
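To make the image compression case concrete, here is a minimal sketch of a truncated SVD. The helper name `truncated_svd` is hypothetical, and the "image" is a synthetic smooth pattern plus noise, chosen only so that the example is self-contained.

```python
import numpy as np

def truncated_svd(M, k):
    """Best rank-k approximation of M in the Frobenius norm (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :], s

# Synthetic stand-in for a greyscale image: a smooth pattern plus small noise.
rng = np.random.default_rng(2)
y, x = np.mgrid[0:64, 0:48]
image = np.sin(x / 7.0) * np.cos(y / 11.0) + 0.01 * rng.normal(size=(64, 48))

k = 5
approx, s = truncated_svd(image, k)

# Relative Frobenius error; it equals sqrt(sum of discarded σ²) / sqrt(sum of all σ²).
rel_error = np.linalg.norm(image - approx) / np.linalg.norm(image)

# Numbers that must be stored for rank k: k columns of U, k rows of Vᵀ, k singular values.
stored = k * (image.shape[0] + image.shape[1] + 1)
```

With k = 5 the approximation stores 565 numbers instead of the 3072 pixels of the original array. How small `rel_error` becomes depends entirely on how quickly the singular values of the particular image decay.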