Info & Theory
Floyd-Warshall finds the shortest path between
every pair of vertices in a weighted graph by filling a
V×V distance matrix, where dist[i][j]
is the cost of the best route from i to
j.
Initialisation
Set dist[i][i] = 0, set
dist[i][j] to the weight of the edge
i → j if it exists, and to
∞ otherwise.
The triple loop
For each pivot k (outer loop), then every source
i and target j, relax:
dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])- If improved, set
next[i][j] = next[i][k].
Why it works (dynamic programming)
After pivot k, dist[i][j] is the
shortest path that may use only intermediate vertices
1..k. Because k is the outermost loop,
dist[i][k] and dist[k][j] are already
optimal sub-paths, so the recurrence is always built on optimal
sub-solutions.
Complexity
Three nested loops over V vertices give
O(V³) time and O(V²) space. With tiny
constants it is simple to code and competitive on small or dense
graphs.
Negative edges & cycles
Unlike Dijkstra, Floyd-Warshall accepts negative edges.
A negative cycle makes some paths undefined; you can
detect one if any dist[i][i] turns negative. This
simulation uses only non-negative weights so every distance is
well defined.
Frequently asked questions
What does the Floyd-Warshall algorithm compute?
Floyd-Warshall computes the shortest path between every pair of vertices in a weighted graph at once. Instead of running a single-source algorithm from each node, it fills a V×V distance matrix where dist[i][j] is the length of the shortest path from i to j. It works on directed or undirected graphs and handles negative edge weights as long as there is no negative cycle.
How does the pivot relaxation work?
The algorithm uses a triple loop with an outer pivot k. For every pair (i, j) it asks whether going from i to k and then from k to j is shorter than the current best: dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j]). After the pivot has cycled through all vertices, every shortest path that uses only intermediate vertices from the full set has been found.
Why is Floyd-Warshall O(V³)?
There are three nested loops, each running over all V vertices: the pivot k, the source i, and the target j. That gives V × V × V = V³ relaxation steps, so the time complexity is O(V³). The space complexity is O(V²) for the distance matrix, plus another O(V²) if you keep a predecessor matrix for path reconstruction.
Why is the dynamic-programming order correct?
After iteration k, dist[i][j] holds the shortest path from i to j that is allowed to pass only through intermediate vertices numbered 1..k. Because the pivot k is the outermost loop, by the time we use dist[i][k] and dist[k][j] they already account for all earlier pivots, so the recurrence is always built on optimal sub-solutions. This is the classic dynamic-programming invariant that makes the algorithm correct.
How does Floyd-Warshall handle negative edges and negative cycles?
Unlike Dijkstra, Floyd-Warshall tolerates negative edge weights and still returns correct shortest paths. A negative cycle, however, means some shortest paths are undefined because you can loop forever to lower the cost. You can detect a negative cycle after the run: if any diagonal entry dist[i][i] becomes negative, vertex i lies on a negative cycle. This simulation uses only non-negative weights to keep results well defined.
How is the shortest path reconstructed?
During initialization a next-hop matrix next[i][j] is set to j whenever a direct edge exists. Whenever a relaxation through pivot k improves dist[i][j], we copy next[i][j] = next[i][k], recording that the best route now starts by heading toward k. To rebuild the path you follow next[i][j] from the source until you reach the target, collecting the vertices along the way.
How does Floyd-Warshall compare to Dijkstra?
Dijkstra solves the single-source shortest-path problem and is fast on sparse graphs, but it requires non-negative weights and gives paths from one source only. Running Dijkstra from every vertex costs about O(V·E·log V) with a heap, which beats Floyd-Warshall on large sparse graphs. Floyd-Warshall, with its flat O(V³) cost and tiny constant factors, is simpler to code and often wins on small or dense graphs and when you genuinely need all pairs.
How does it compare to Bellman-Ford?
Bellman-Ford is also single-source but, like Floyd-Warshall, accepts negative edges and can detect negative cycles. It runs in O(V·E) per source. If you need shortest paths from one origin in a graph with negative edges, Bellman-Ford is the natural choice; if you need every pair at once, Floyd-Warshall's three compact loops are usually easier and competitive on dense graphs.
What does ∞ (INF) mean in the distance matrix?
An entry of ∞ means there is currently no known path between those two vertices. The diagonal dist[i][i] starts at 0 because the distance from a vertex to itself is zero. As the pivot advances, ∞ entries can become finite when an intermediate vertex links two previously disconnected nodes, which is exactly the matrix 'tightening' you see in the animation.
Where is Floyd-Warshall used in practice?
It is used for routing tables in small networks, computing the transitive closure of a relation, finding graph diameter and centrality, and solving all-pairs distance queries in maps, games, and operations research. A boolean variant computes reachability, and a max-min variant solves widest-path or bottleneck problems by swapping the min/plus operations for max/min.