Problem — How do we detect cycles efficiently?
In Part 2, we saw that Kruskal’s algorithm adds the lightest edge that does not create a cycle. But in large graphs, checking cycles manually is not practical.
To solve this efficiently, we use a data structure called Union-Find, also known as Disjoint Set Union (DSU).
Instead of searching for paths in the graph, Union-Find keeps track of which vertices are already connected.
What does Union-Find do?
The structure maintains vertices grouped into disjoint sets, representing connected components.
Initially, each vertex is in its own set. As Kruskal accepts edges, these sets are merged.
To check if an edge is safe, we simply ask: are both endpoints in the same set?
The two core operations
Union-Find relies on two fundamental operations:
Find(x): returns the representative of the set containing x.
Union(x, y): merges the sets containing x and y.
In Kruskal, the rule becomes:
if Find(u) ≠ Find(v), accept the edge and apply Union(u, v).
Why does this prevent cycles?
If two vertices are already in the same set, there is already a path connecting them.
Adding another edge between them would create a loop, which is exactly a cycle.
If they are in different sets, the edge simply connects two separate components.
Example with sets
Suppose at some point we have:
If we consider edge C–D, the vertices belong to different sets, so we accept the edge.
After that:
Now, adding an edge between B and C would be rejected, since they are already connected.
Why is this important?
Kruskal processes many edges. Without an efficient structure, cycle detection would be slow.
Union-Find makes these checks extremely fast, especially with optimizations like path compression and union by rank.
This is why Kruskal is both elegant and efficient in practice.
Conceptual pseudocode
1. Sort edges by weight
2. Initialize each vertex as its own set
3. For each edge (u, v):
• if Find(u) ≠ Find(v), accept edge
• Union(u, v)
4. Stop after n − 1 edges
Cycle detection is reduced to a simple comparison between representatives.
Key intuition for Part 3
Union-Find does not track the full structure of the graph. It only tracks what matters: which vertices are already connected.
This is enough for Kruskal to make correct and efficient decisions.
In the next part, we will compare Kruskal with Prim’s algorithm and discuss when each approach is preferable.