Question 1

What does this simulation actually show?

Accepted Answer

It shows a tiny neural network learning to classify two-dimensional points in real time. On the left you see the network diagram with nodes that brighten with activation and edges coloured blue for positive weights and red for negative ones. On the right, the decision boundary morphs each epoch as training reshapes the weights.

Question 2

How does backpropagation work here?

Accepted Answer

After a forward pass produces an output, the algorithm computes the error at the output and propagates it backwards through the layers using the chain rule of calculus. This yields the gradient of the loss with respect to every weight and bias. Gradient descent then subtracts each gradient, multiplied by the learning rate, from the corresponding weight or bias.
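As a rough illustration of those three stages, here is a minimal sketch for a network with one sigmoid hidden layer and a single sigmoid output, trained on one example at a time. The function names (`forward`, `backprop`, `sgd_step`) and the factor of one half in the per-example loss are assumptions made for the example, not details taken from the simulation's source.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Return the hidden activations and the scalar output for one input."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y

def backprop(x, t, W1, b1, W2, b2):
    """Gradients of the assumed loss 0.5 * (y - t)**2 for one example."""
    h, y = forward(x, W1, b1, W2, b2)
    # Output error: (y - t) times the sigmoid derivative y * (1 - y).
    delta_out = (y - t) * y * (1.0 - y)
    grad_W2 = [delta_out * hj for hj in h]
    grad_b2 = delta_out
    # Chain rule: send the output error back through W2 to each hidden unit.
    delta_hidden = [delta_out * w * hj * (1.0 - hj) for w, hj in zip(W2, h)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    grad_b1 = delta_hidden
    return grad_W1, grad_b1, grad_W2, grad_b2

def sgd_step(x, t, W1, b1, W2, b2, lr):
    """One gradient-descent update: parameter -= lr * gradient."""
    gW1, gb1, gW2, gb2 = backprop(x, t, W1, b1, W2, b2)
    W1 = [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W1, gW1)]
    b1 = [b - lr * g for b, g in zip(b1, gb1)]
    W2 = [w - lr * g for w, g in zip(W2, gW2)]
    b2 = b2 - lr * gb2
    return W1, b1, W2, b2
```

Calling `sgd_step` repeatedly over the training points, one example per call, corresponds to the per-example updates described elsewhere in this FAQ.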

Question 3

Why is XOR a famous test case?

Accepted Answer

XOR is not linearly separable: no single straight line can put the inputs (0,1) and (1,0) on one side and (0,0) and (1,1) on the other, so a single-layer perceptron cannot solve it. It requires at least one hidden layer to bend the decision surface. Because the simulation includes a hidden layer with a non-linear sigmoid activation, it can learn XOR, demonstrating exactly why depth and non-linearity matter in neural networks.
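To make the point concrete, the sketch below wires up two sigmoid hidden units by hand so that together they compute XOR. The weights are hypothetical values picked for illustration (one unit behaves like OR, the other like NAND, and the output like AND); they are not weights the simulation is known to learn.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def xor_net(x1, x2):
    # Hand-picked, illustrative weights; a trained network may find others.
    h_or = sigmoid(20 * x1 + 20 * x2 - 10)       # close to 1 unless both inputs are 0
    h_nand = sigmoid(-20 * x1 - 20 * x2 + 30)    # close to 1 unless both inputs are 1
    return sigmoid(20 * h_or + 20 * h_nand - 30)  # close to 1 only if both hidden units fire

for a in (0, 1):
    for b in (0, 1):
        print(a, b, round(xor_net(a, b)))
```

Rounded to the nearest class, the four outputs are 0, 1, 1, 0, which is the XOR truth table. Without the hidden layer there is no choice of two weights and a bias that reproduces it.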

Question 4

What do the controls do?

Accepted Answer

Task chooses the dataset (XOR, circle or spiral). Hidden neurons sets how wide the hidden layer is, from 2 to 16 units. Learning rate scales the update step from 0.01 to 0.20. Speed sets how many epochs run per animation frame. Train, Step and Reset start continuous training, advance one epoch, or reinitialise the weights respectively.

Question 5

What loss function and activation does it use?

Accepted Answer

Every hidden and output neuron uses the sigmoid activation, which squashes values into the range 0 to 1, so the output can be read directly as a probability-like score. Training minimises the mean squared error between predictions and targets, reported in the panel as Loss (MSE).
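For reference, both pieces are only a few lines each. This is a minimal sketch, not the simulation's own code; the two-branch form of `sigmoid` is an assumption added here to avoid overflow for large negative inputs.

```python
import math

def sigmoid(z):
    """Logistic function, written so math.exp never receives a large positive argument."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def mse(predictions, targets):
    """Mean of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

print(sigmoid(0.0))                              # 0.5
print(mse([0.9, 0.2, 0.8, 0.1], [1, 0, 1, 0]))   # approximately 0.025
```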

Question 6

How are the weights initialised?

Accepted Answer

Weights are drawn at random and scaled by the square root of two over the layer fan-in, a He-style initialisation (a scheme originally derived for ReLU layers) that keeps the starting weights small enough that early activations do not saturate. Biases start at zero. Pressing Reset draws a fresh set of weights, which is why the same dataset can converge along a different path each time.
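A sketch of that scheme follows. The answer does not say which distribution the random values come from, so the uniform draw in [-1, 1] and the helper name `init_layer` are assumptions made for illustration.

```python
import math
import random

def init_layer(fan_in, fan_out, rng):
    """He-style initialisation: random weights scaled by sqrt(2 / fan_in), zero biases."""
    scale = math.sqrt(2.0 / fan_in)
    # Assumed distribution: uniform in [-1, 1] before scaling.
    weights = [[rng.uniform(-1.0, 1.0) * scale for _ in range(fan_in)]
               for _ in range(fan_out)]
    biases = [0.0] * fan_out
    return weights, biases

rng = random.Random()                # unseeded, so each "reset" differs
W1, b1 = init_layer(2, 4, rng)       # 2 inputs feeding 4 hidden neurons
W2, b2 = init_layer(4, 1, rng)       # 4 hidden neurons feeding 1 output
```

Calling `init_layer` again plays the role of Reset: the biases return to zero and the weights get new random values.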

Question 7

What does the decision boundary panel mean?

Accepted Answer

The right panel colours a grid of points by the network's output, blending towards one colour for class 0 and another for class 1. Training points are drawn on top as circles. As the network learns, the coloured regions sharpen and curve to separate the two classes. Clicking inside the panel feeds that point through the network and prints its prediction.
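The colouring step can be sketched as follows. The grid resolution, coordinate range, colour values and the stand-in predictor are all hypothetical; in the simulation the predictor would be the network's forward pass.

```python
def decision_grid(predict, steps=5, lo=-1.0, hi=1.0):
    """Evaluate predict(x, y) on a steps-by-steps grid covering [lo, hi] on both axes."""
    coords = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return [[predict(x, y) for x in coords] for y in coords]

def blend(score, colour0=(255, 140, 0), colour1=(30, 120, 255)):
    """Linearly blend two RGB colours: score 0 gives colour0, score 1 gives colour1."""
    return tuple(round(c0 + (c1 - c0) * score) for c0, c1 in zip(colour0, colour1))

def inside_circle(x, y):
    # Stand-in for the trained network: class 1 inside a circle of radius 0.5.
    return 1.0 if x * x + y * y < 0.25 else 0.0

grid = decision_grid(inside_circle)
pixels = [[blend(score) for score in row] for row in grid]
```

A score near 0.5 blends the two colours evenly, which is why an untrained network produces a washed-out panel that sharpens as the outputs move towards 0 and 1.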

Question 8

Is this an accurate model of real neural networks?

Accepted Answer

The mathematics is genuine: real forward propagation, real chain-rule backpropagation and real gradient descent, identical in principle to what large networks use. It is simplified in scale, using one small hidden layer, stochastic per-example updates and a fixed sigmoid activation, rather than the ReLU activations, mini-batches, momentum or Adam optimiser common in production systems.

Question 9

What happens if I set the learning rate too high or too low?

Accepted Answer

Too high and the updates overshoot the minimum, so the loss oscillates or diverges instead of settling. Too low and training crawls, needing many more epochs to reach low error. The slider spans 0.01 to 0.20, so it is instructive to compare how the loss curve behaves at the extremes versus a moderate value.
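The same three behaviours show up on a one-parameter toy problem. The sketch below runs gradient descent on loss(w) = 6 * w**2; the curvature of 6 is an assumption chosen so that the top of the slider range diverges, and the real network's loss surface will behave differently in detail.

```python
def descend(lr, steps=20, w=1.0, curvature=6.0):
    """Gradient descent on loss(w) = curvature * w**2; returns the loss before each step."""
    losses = []
    for _ in range(steps):
        losses.append(curvature * w * w)
        w -= lr * 2.0 * curvature * w   # gradient of the loss is 2 * curvature * w
    return losses

for lr in (0.01, 0.10, 0.20):
    history = descend(lr)
    print(lr, history[0], history[-1])
```

On this toy surface, 0.01 shrinks the loss slowly, 0.10 drives it to almost zero within a few steps, and 0.20 overshoots further on every step so the loss grows.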

Question 10

Where is backpropagation used in the real world?

Accepted Answer

Backpropagation with gradient descent is the training engine behind almost every modern deep-learning system, including image classifiers, speech recognition, recommendation engines and large language models. The principles you watch here, scaled up to billions of weights and richer architectures, are what allow those systems to learn patterns directly from data.

About the Neural Network Backpropagation Visualiser

Frequently Asked Questions