Discrete Mathsfor Engineers

Structures, proofs, and intuition

EN | PT

Graph Theory → Kruskal

Part 3 of 4 — Union-Find and Efficient Cycle Detection

Series: Kruskal Part 3 of 4
1 2 3 4

Problem — How do we detect cycles efficiently?

In Part 2, we saw that Kruskal’s algorithm adds the lightest edge that does not create a cycle. But in large graphs, checking cycles manually is not practical.

To solve this efficiently, we use a data structure called Union-Find, also known as Disjoint Set Union (DSU).

Instead of searching for paths in the graph, Union-Find keeps track of which vertices are already connected.

Key idea: track connectivity, not paths.

What does Union-Find do?

The structure maintains vertices grouped into disjoint sets, representing connected components.

Initially, each vertex is in its own set. As Kruskal accepts edges, these sets are merged.

To check if an edge is safe, we simply ask: are both endpoints in the same set?

Summary: Union-Find tracks which vertices are already connected.

The two core operations

Union-Find relies on two fundamental operations:

Find(x): returns the representative of the set containing x.

Union(x, y): merges the sets containing x and y.

In Kruskal, the rule becomes:

if Find(u) ≠ Find(v), accept the edge and apply Union(u, v).

Kruskal rule: accept edges only when representatives are different.

Why does this prevent cycles?

If two vertices are already in the same set, there is already a path connecting them.

Adding another edge between them would create a loop, which is exactly a cycle.

If they are in different sets, the edge simply connects two separate components.

Core idea: same set → cycle, different sets → safe edge.

Example with sets

Suppose at some point we have:

Current sets: {A, B, C} and {D, E, F}

If we consider edge C–D, the vertices belong to different sets, so we accept the edge.

After that:

After Union(C, D): {A, B, C, D, E, F}

Now, adding an edge between B and C would be rejected, since they are already connected.

Why is this important?

Kruskal processes many edges. Without an efficient structure, cycle detection would be slow.

Union-Find makes these checks extremely fast, especially with optimizations like path compression and union by rank.

This is why Kruskal is both elegant and efficient in practice.

In practice: sorting + Union-Find = efficient Kruskal.

Conceptual pseudocode

Kruskal:
1. Sort edges by weight
2. Initialize each vertex as its own set
3. For each edge (u, v):
  • if Find(u) ≠ Find(v), accept edge
  • Union(u, v)
4. Stop after n − 1 edges

Cycle detection is reduced to a simple comparison between representatives.

Key intuition for Part 3

Union-Find does not track the full structure of the graph. It only tracks what matters: which vertices are already connected.

This is enough for Kruskal to make correct and efficient decisions.

In the next part, we will compare Kruskal with Prim’s algorithm and discuss when each approach is preferable.

Next step: Kruskal vs Prim comparison.