A two-layer neural network trains live on XOR, circle or spiral classification boundaries. Watch weights change as edges, activations glow as nodes, and the decision boundary evolves in real time.
Forward pass computes activations through layers. Backpropagation computes gradients of the loss with respect to each weight. Gradient descent updates weights to reduce error.
Pick a dataset (XOR, circle, spiral). Watch training progress — the decision surface morphs to classify points correctly. Adjust learning rate and hidden neurons.
Backpropagation was popularised by Rumelhart, Hinton and Williams in 1986, but the core idea (reverse-mode automatic differentiation) dates to Seppo Linnainmaa's 1970 master's thesis.