Question 1

What does a load balancer do?

Accepted Answer

A load balancer sits in front of a pool of backend servers and decides which server should handle each incoming request, spreading traffic so no single server is overwhelmed while others sit idle.

Question 2

How does round-robin balancing work?

Accepted Answer

Round-robin sends each successive request to the next server in a fixed cyclic order: server 1, server 2, server 3, then back to 1. It is simple and fair when servers are identical and requests cost roughly the same.
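As a minimal sketch of the cyclic order described above (the server names and the `round_robin` helper are hypothetical, not taken from the simulation), round-robin can be expressed with a repeating iterator:

```python
from itertools import cycle


def round_robin(servers):
    """Return an iterator that yields servers in a fixed cyclic order."""
    return cycle(servers)


# Hypothetical pool of three identical servers.
picker = round_robin(["server-1", "server-2", "server-3"])

# Each incoming request takes the next server from the cycle.
first_five = [next(picker) for _ in range(5)]
print(first_five)
# ['server-1', 'server-2', 'server-3', 'server-1', 'server-2']
```

The balancer keeps no information about the servers beyond its position in the cycle, which is why the policy is cheap but blind to load.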

Question 3

What is least-connections balancing?

Accepted Answer

Least-connections sends each new request to the server currently handling the fewest active requests. Because the choice depends on current load, it adapts to uneven request durations and unequal server speeds, which round-robin cannot do.
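One way to sketch this selection rule, assuming the balancer tracks a count of active requests per server (the `active` dictionary and its values are made up for illustration):

```python
def pick_least_connections(active):
    """Return the server with the fewest active requests.

    Ties go to whichever server appears first in the dictionary.
    """
    return min(active, key=active.get)


# Hypothetical snapshot of active requests per server.
active = {"a": 4, "b": 1, "c": 2}

chosen = pick_least_connections(active)
active[chosen] += 1  # the new request is now in flight on that server
print(chosen)  # b
```

A real balancer would also decrement the count when a request completes; that bookkeeping is omitted here.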

Question 4

What is weighted load balancing?

Accepted Answer

Weighted balancing assigns each server a weight proportional to its capacity, so faster servers receive proportionally more requests. A server with weight 3 gets roughly three times the share of one with weight 1.
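A simple way to illustrate the weight ratio is to repeat each server in the cycle as many times as its weight. This is only a sketch with hypothetical server names and integer weights; it sends requests to the heavier server in bursts, and practical balancers usually interleave the choices more smoothly.

```python
from collections import Counter
from itertools import cycle, islice


def weighted_cycle(weights):
    """Cycle through servers, repeating each one `weight` times per round."""
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return cycle(expanded)


# Hypothetical pool: "big" has three times the capacity of "small".
weights = {"big": 3, "small": 1}

# Count how 400 requests are shared out.
counts = Counter(islice(weighted_cycle(weights), 400))
print(counts["big"], counts["small"])  # 300 100
```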

Question 5

When is random load balancing acceptable?

Accepted Answer

Picking a server uniformly at random is stateless and trivial to implement, and over many requests each server's share approaches an even split. The 'power of two random choices' variant (pick two servers at random, send the request to the less loaded one) greatly reduces the worst-case imbalance at the cost of tracking each server's load.
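The difference between the two random policies can be sketched with a toy experiment in which requests are assigned but never complete, so each server's load is simply its request count. The function names, pool size, and seed are assumptions chosen for illustration.

```python
import random


def pick_two_choices(loads, rng):
    """Sample two distinct servers and return the index of the less loaded one."""
    i, j = rng.sample(range(len(loads)), 2)
    return i if loads[i] <= loads[j] else j


def max_load(n_servers, n_requests, two_choices, seed=0):
    """Assign requests and return the load of the busiest server."""
    rng = random.Random(seed)
    loads = [0] * n_servers
    for _ in range(n_requests):
        if two_choices:
            k = pick_two_choices(loads, rng)
        else:
            k = rng.randrange(n_servers)  # plain uniform random choice
        loads[k] += 1
    return max(loads)


# A perfectly even split would put 100 requests on each server.
print("uniform random:", max_load(100, 10_000, two_choices=False))
print("two choices:   ", max_load(100, 10_000, two_choices=True))
```

The busiest server under the two-choices rule should sit much closer to the ideal of 100 than under plain random selection.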

Question 6

What are Poisson arrivals?

Accepted Answer

Poisson arrivals model independent random request times where the gaps between arrivals follow an exponential distribution. This simulation generates each gap as −ln(U)/λ, where U is drawn uniformly from (0, 1) and λ is the mean arrival rate, which reproduces the irregular clusters and lulls of real traffic.
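The inter-arrival formula −ln(U)/λ can be sketched as follows. The function name, rate, and seed are illustrative assumptions; the simulation's own code is not shown in the source.

```python
import math
import random


def poisson_arrival_times(rate, count, rng):
    """Return `count` arrival times with exponential gaps of mean 1/rate."""
    t = 0.0
    times = []
    for _ in range(count):
        u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is always defined
        t += -math.log(u) / rate  # inverse-transform sample of the gap
        times.append(t)
    return times


rng = random.Random(42)
times = poisson_arrival_times(rate=5.0, count=10_000, rng=rng)

# The average gap should be close to 1/rate = 0.2 time units.
mean_gap = times[-1] / len(times)
print(f"mean gap: {mean_gap:.3f}")
```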

Question 7

Why does latency rise sharply near full utilisation?

Accepted Answer

Queueing theory shows waiting time grows like 1/(1 − ρ), where ρ is utilisation. As a server approaches 100% busy, even a small traffic increase causes queues and latency to explode.
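To make the 1/(1 − ρ) growth concrete, the sketch below tabulates that factor for a few utilisation levels. It assumes the simplest single-server model (M/M/1), where the mean time a request spends in the system is the mean service time multiplied by this factor; the source does not say which model the simulation uses.

```python
def latency_factor(rho):
    """Return 1 / (1 - rho), the slowdown relative to an idle server."""
    if not 0 <= rho < 1:
        raise ValueError("rho must satisfy 0 <= rho < 1 for a stable queue")
    return 1.0 / (1.0 - rho)


for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
    print(f"rho = {rho:.2f}  slowdown = {latency_factor(rho):.1f}x")
```

The factor is 2 at ρ = 0.5 but 100 at ρ = 0.99, so the last few percent of utilisation cost far more latency than the first fifty.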

Question 8

What is server utilisation?

Accepted Answer

Utilisation ρ is the fraction of time a server is busy. For a server that keeps up with its traffic, it equals that server's arrival rate divided by its service rate. If the ratio exceeds 1, the server is busy all the time, cannot keep up, and its queue grows without bound.
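As a small worked example with made-up rates, the ratio and the stability check can be computed directly:

```python
def utilisation(arrival_rate, service_rate):
    """Return rho: arrival rate divided by service rate (same time unit)."""
    return arrival_rate / service_rate


def backlog_growth(arrival_rate, service_rate):
    """Requests per second added to the queue once the server is overloaded."""
    return max(0.0, arrival_rate - service_rate)


# Hypothetical server that can complete 100 requests per second.
rho_ok = utilisation(arrival_rate=80.0, service_rate=100.0)
rho_over = utilisation(arrival_rate=120.0, service_rate=100.0)

print(rho_ok, rho_ok < 1)      # stable: the server idles some of the time
print(rho_over, rho_over < 1)  # overloaded: the queue keeps growing
print(backlog_growth(120.0, 100.0))  # 20.0 extra queued requests per second
```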

Question 9

Why can round-robin still cause imbalance?

Accepted Answer

Round-robin ignores how long each request takes and how fast each server is. If requests vary in cost or servers differ in speed, some servers build up queues while others drain, even though every server receives the same number of requests.
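The effect of unequal server speeds can be sketched with a deterministic toy model: requests arrive at a fixed interval and alternate between a fast and a slow server. The service times, arrival gap, and function name are assumptions chosen to make the arithmetic easy to follow.

```python
def round_robin_backlog(service_times, n_requests, gap):
    """Return each server's unfinished work (seconds) at the last arrival.

    service_times[k] is how long server k needs per request; requests
    arrive every `gap` seconds and are assigned in round-robin order.
    """
    free_at = [0.0] * len(service_times)  # when each server next becomes idle
    now = 0.0
    for i in range(n_requests):
        now = i * gap
        k = i % len(service_times)        # round-robin assignment
        start = max(now, free_at[k])      # wait if the server is still busy
        free_at[k] = start + service_times[k]
    return [max(0.0, t - now) for t in free_at]


# Fast server: 1 s per request. Slow server: 3 s per request.
backlog = round_robin_backlog(service_times=[1.0, 3.0], n_requests=100, gap=1.0)
print(backlog)  # [0.0, 52.0]
```

Both servers received 50 requests, yet the slow one ends with 52 seconds of queued work while the fast one has none.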

Question 10

How is average latency measured here?

Accepted Answer

The simulation records each completed request's total time — waiting in the queue plus being served — and reports the running average across all finished requests under the current policy.
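One way to keep such a running average without storing every request is an incremental mean. This is a sketch under the assumption that each request's arrival, service start, and finish times are known; the `LatencyTracker` class is hypothetical and not the simulation's actual code.

```python
class LatencyTracker:
    """Running average of total request time (queue wait plus service)."""

    def __init__(self):
        self.completed = 0
        self.mean = 0.0

    def record(self, arrival, start, finish):
        """Add one finished request and return its total latency."""
        wait = start - arrival      # time spent in the queue
        service = finish - start    # time spent being served
        latency = wait + service
        self.completed += 1
        # Incremental update: new_mean = old_mean + (x - old_mean) / n
        self.mean += (latency - self.mean) / self.completed
        return latency


tracker = LatencyTracker()
tracker.record(arrival=0.0, start=0.0, finish=2.0)  # no wait, 2 s service
tracker.record(arrival=1.0, start=2.0, finish=5.0)  # 1 s wait, 3 s service
print(tracker.mean)  # 3.0
```

Resetting the tracker when the policy changes would keep the average specific to the current policy.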
