Disjoint Set Union (DSU)

A data structure that tracks a set of elements partitioned into a number of disjoint (non-overlapping) subsets.

Key Notes

Disjoint Set Union (DSU), also known as Union-Find, is a data structure designed to efficiently manage a collection of disjoint sets. It provides two primary operations: `find`, which determines which set a particular element belongs to (by returning a representative or 'parent' of that set), and `union`, which merges two sets into a single set. A common application is in Kruskal's algorithm, where each vertex starts in its own set. When considering an edge `(u, v)`, we use `find(u)` and `find(v)` to check if they are already in the same set. If they are not, adding the edge won't create a cycle, so we perform a `union(u, v)` operation to merge their sets. A naive implementation of DSU can be inefficient, leading to a tree-like structure that resembles a linked list in the worst case. To combat this, two powerful optimizations are used: 'union by rank/size' and 'path compression'. Union by rank/size ensures that when merging two sets, the smaller tree is always attached to the root of the larger tree, which keeps the trees relatively shallow. Path compression flattens the structure of the tree during a `find` operation by making every node on the find path point directly to the root. With both optimizations, the amortized time complexity per operation becomes nearly constant.

Back to Minimum Spanning Trees

Disjoint Set Union (DSU)

A data structure that tracks a set of elements partitioned into a number of disjoint (non-overlapping) subsets.

Key Notes