Algorithm Design → Search → Breadth-First Search (BFS) · Part 4

4. Correctness Proof

Roadmap (what you will learn in this part):

State the correctness goal precisely
Introduce the core invariant
Prove BFS never underestimates distances
Use FIFO to justify layered exploration
Prove first discovery gives a shortest path
Conclude BFS computes shortest-path distances
Show the predecessor pointers form a shortest-path tree
Revisit time complexity
Extract the structural insight

This part formalizes what we claimed in Part 1: BFS computes shortest-path distances in unweighted graphs.

The proof is structural. The queue is not an implementation detail: it is the reason the argument works.

4.1 What Must Be Proven

Let G = (V, E) be a graph and s ∈ V the source. Define δ(s, v) as the true shortest-path distance (number of edges) from s to v.

We must prove that BFS computes:

d[v] = δ(s, v)

for every vertex v reachable from s.

Reminder: BFS sets distances only when a vertex is first discovered (WHITE → GRAY).

4.2 Key Invariant

The proof relies on one core invariant:

Invariant: When a vertex v is first discovered (colored GRAY), its distance satisfies d[v] = δ(s, v).

Intuitively: the first time we reach v, we reached it via the shortest layer.

We will justify this using the queue’s FIFO discipline (layered exploration).

4.3 Lemma 1 — Distances Never Underestimate

Claim: For every vertex v, BFS never underestimates the true distance:

d[v] \geq δ(s, v)

Proof sketch

BFS only assigns d[v] via an edge (u, v) ∈ E:

d[v] = d[u] + 1

Any path to v through u has length at least δ(s, u) + 1, so assigning d[u] + 1 cannot be smaller than the true optimum. Therefore BFS does not produce distances smaller than reality.

4.4 Lemma 2 — FIFO Ensures Layered Exploration

The queue guarantees that vertices are processed in non-decreasing distance order:

Layer property: All vertices with distance k are dequeued before any vertex with distance k + 1.

Why? When a vertex u is dequeued, every newly discovered neighbor v is assigned d[v] = d[u] + 1 and appended to the end of the queue. So the queue forms consecutive layers.

Formal diagram (layers):

Layer L0:  { s }                     d = 0
             |
Layer L1:  { neighbors of s }        d = 1
             |
Layer L2:  { neighbors of L1 }       d = 2
             |
Layer L3:  { neighbors of L2 }       d = 3
...

4.5 Lemma 3 — First Discovery Is Optimal

Claim: When v is first discovered, BFS assigns the true shortest distance:

d[v] = δ(s, v)

Proof (by contradiction)

Suppose BFS first discovers v with d[v] > δ(s, v). Consider a shortest path:

s = v 0, v 1, \dots, v k = v

Then v_k-1 satisfies δ(s, v_k-1) = δ(s, v) − 1, so it lies in an earlier layer. By Lemma 2, it must be dequeued before any vertex in layer δ(s, v) is discovered.

When BFS processes v_k-1, it examines edge (v_k-1, v) and would assign:

d[v] = d[v k-1] + 1 = δ(s, v)

Contradiction. Therefore first discovery assigns the optimal distance.

4.6 Theorem — BFS Computes Shortest Paths

From Lemma 3, every reachable vertex v receives d[v] = δ(s, v) at first discovery.

Conclusion: BFS computes correct shortest-path distances in unweighted graphs.

4.7 The BFS Tree Is a Shortest-Path Tree

Every vertex v ≠ s receives exactly one predecessor π[v]. Moreover, when π[v] is set, BFS enforces:

d[v] = d[π[v]] + 1

Therefore, following predecessor pointers from v back to s yields a path of length d[v], which equals the true shortest distance.

Result: The predecessor subgraph is a tree rooted at s, and every root-to-vertex path is a shortest path.

4.8 Time Complexity (Revisited)

The correctness proof does not change the runtime:

O(|V| + |E|)

Each vertex is discovered once. Each edge is examined at most twice (undirected case).

4.9 Structural Insight

BFS works because it enforces:

monotonic distance growth (d increases by exactly 1 per layer)
no revisits (coloring)
strict layer order (FIFO queue)

Takeaway: The queue is not an implementation detail. It is the structural reason the proof works.

Next: Part 5 — Complexity

We have proved correctness. Now we analyze the algorithm from a computational perspective. In Part 5, we formally examine the time and space complexity of BFS, and understand why it runs in O(|V| + |E|).

Continue: Part 5 — Complexity →

Discrete Mathsfor Engineers

Breadth-First Search (BFS)