4. Correctness Proof
- State the correctness goal precisely
- Introduce the core invariant
- Prove BFS never underestimates distances
- Use FIFO to justify layered exploration
- Prove first discovery gives a shortest path
- Conclude BFS computes shortest-path distances
- Show the predecessor pointers form a shortest-path tree
- Revisit time complexity
- Extract the structural insight
This part formalizes what we claimed in Part 1: BFS computes shortest-path distances in unweighted graphs.
The proof is structural. The queue is not an implementation detail: it is the reason the argument works.
4.1 What Must Be Proven
Let G = (V, E) be a graph and s ∈ V the source. Define δ(s, v) as the true shortest-path distance (number of edges) from s to v.
We must prove that BFS computes:
for every vertex v reachable from s.
4.2 Key Invariant
The proof relies on one core invariant:
Intuitively: the first time we reach v, we reached it via the shortest layer.
We will justify this using the queue’s FIFO discipline (layered exploration).
4.3 Lemma 1 — Distances Never Underestimate
Claim: For every vertex v, BFS never underestimates the true distance:
Proof sketch
BFS only assigns d[v] via an edge (u, v) ∈ E:
Any path to v through u has length at least δ(s, u) + 1, so assigning d[u] + 1 cannot be smaller than the true optimum. Therefore BFS does not produce distances smaller than reality.
4.4 Lemma 2 — FIFO Ensures Layered Exploration
The queue guarantees that vertices are processed in non-decreasing distance order:
Why? When a vertex u is dequeued, every newly discovered neighbor v is assigned d[v] = d[u] + 1 and appended to the end of the queue. So the queue forms consecutive layers.
Layer L0: { s } d = 0
|
Layer L1: { neighbors of s } d = 1
|
Layer L2: { neighbors of L1 } d = 2
|
Layer L3: { neighbors of L2 } d = 3
...
4.5 Lemma 3 — First Discovery Is Optimal
Claim: When v is first discovered, BFS assigns the true shortest distance:
Proof (by contradiction)
Suppose BFS first discovers v with d[v] > δ(s, v). Consider a shortest path:
Then vk-1 satisfies δ(s, vk-1) = δ(s, v) − 1, so it lies in an earlier layer. By Lemma 2, it must be dequeued before any vertex in layer δ(s, v) is discovered.
When BFS processes vk-1, it examines edge (vk-1, v) and would assign:
Contradiction. Therefore first discovery assigns the optimal distance.
4.6 Theorem — BFS Computes Shortest Paths
From Lemma 3, every reachable vertex v receives d[v] = δ(s, v) at first discovery.
4.7 The BFS Tree Is a Shortest-Path Tree
Every vertex v ≠ s receives exactly one predecessor π[v]. Moreover, when π[v] is set, BFS enforces:
Therefore, following predecessor pointers from v back to s yields a path of length d[v], which equals the true shortest distance.
4.8 Time Complexity (Revisited)
The correctness proof does not change the runtime:
Each vertex is discovered once. Each edge is examined at most twice (undirected case).
4.9 Structural Insight
BFS works because it enforces:
- monotonic distance growth (d increases by exactly 1 per layer)
- no revisits (coloring)
- strict layer order (FIFO queue)
Next: Part 5 — Complexity
We have proved correctness. Now we analyze the algorithm from a computational perspective. In Part 5, we formally examine the time and space complexity of BFS, and understand why it runs in O(|V| + |E|).