Union-Find
The Intuition
Picture a school cafeteria on the first day. Everyone sits alone. Then Alice and Bob become friends, so they join the same table. Bob already knows Carol from summer camp, so Carol joins too. Now Alice, Bob, and Carol are all at one table, even though Alice and Carol have never spoken directly. If someone asks "Are Alice and Carol in the same friend group?" the answer is yes, because there's a chain connecting them.
Union-Find tracks exactly this. Each person points to a "group leader." When you want to know someone's group, you follow the chain of pointers until you find the person who points to themselves. That's the leader. When two people from different groups become friends, you just make one leader point to the other. Two groups merge into one in a single step.
The clever trick is path compression. Imagine Alice is in a long chain: Alice points to Bob, Bob points to Carol, Carol points to Dave, and Dave is the leader. Every time you walk that chain to find the leader, you rewire everyone to point directly at Dave. The next time anyone asks about Alice, Bob, or Carol, the answer is instant. One hop, done.
Union by rank is the other optimization. When merging two groups, attach the shorter tree under the taller one. If you always glue the small tree onto the big tree's root, the overall tree stays flat. Combined with path compression, every operation runs in nearly constant time. Not O(log n). Not O(sqrt(n)). Nearly O(1), amortized. That's what makes Union-Find so fast for connectivity problems.
Number of connected components, detecting cycles in undirected graphs, Kruskal's MST algorithm, accounts merge, dynamic connectivity queries, or grouping problems.
- Need to split groups (union-find only merges, never splits)
- Directed graph connectivity (use DFS/BFS)
- Need path information between nodes (union-find only tells you if connected, not how)
Variations:
- Union by rank: Attach shorter tree under taller tree's root.
- Union by size: Attach smaller tree under larger tree's root.
- Weighted Union-Find: Track distances or relationships between nodes and their roots.
- Single element with no connections
- Two elements that are already connected before union
- All elements are disconnected (each is its own component)
Key Points
- •Path compression flattens the tree on find operations
- •Union by rank/size keeps trees balanced
- •Near-constant amortized time per operation, written O(α(n))
- •Great for dynamic connectivity queries
- •Track number of connected components with a counter
Code Template
1 class UnionFind:
2 def __init__(self, n):
3 self.parent = list(range(n))
4 self.rank = [0] * n
5 self.components = n
6
7 def find(self, x):
8 """Find root with path compression."""
9 if self.parent[x] != x:
10 self.parent[x] = self.find(self.parent[x])
11 return self.parent[x]
12
13 def union(self, x, y):
14 """Union by rank. Returns True if merged."""
15 px, py = self.find(x), self.find(y)
16 if px == py:
17 return False
18 if self.rank[px] < self.rank[py]:
19 px, py = py, px
20 self.parent[py] = px
21 if self.rank[px] == self.rank[py]:
22 self.rank[px] += 1
23 self.components -= 1
24 return True
25
26 def connected(self, x, y):
27 return self.find(x) == self.find(y)Common Mistakes
- Forgetting path compression (degrades to O(n) per find)
- Not using union by rank/size (creates tall trees)
- Confusing find(x) with x. Always use find to get the root