Mini Lecture: Running Time of Merging
COSC 311 Algorithms, Fall 2022
Last Time
Kruskal’s Algorithm for MSTs:
- iterate over all edges in ascending order of weight
- if an edge connects two previously unconnected components, add it to the MST
Kruskal’s Algorithm
Kruskal(V, E, w):
  C <- collection of components
    initially, each vertex is its own component
  F <- empty collection
  # iterate in order of increasing weight
  for each edge e = (u, v) in E
    if u and v are in different components then
      add (u, v) to F
      merge components containing u and v
    endif
  endfor
  return F
Maintaining Components
Associate a leader with each component
- leader is a vertex in the component
- maintain array of leaders
- leader[i] = v means that v is the leader of i’s component
- for each leader v, maintain a (linked) list of elements in v’s component
- list also stores size of the component
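The bookkeeping above can be sketched in Python. This is a minimal sketch under some assumptions: vertices are numbered 0 to n-1, the hypothetical helper `make_components` is not from the lecture, and a Python list stands in for the linked list (its length is available in O(1), so it plays the role of the stored size).

```python
def make_components(n):
    """Put each of the vertices 0..n-1 in its own component."""
    # leader[i] = v means that v is the leader of i's component
    leader = list(range(n))
    # members[v] lists the vertices led by v; len(members[v]) is the size
    members = {v: [v] for v in range(n)}
    return leader, members


leader, members = make_components(4)
# Two vertices are in the same component exactly when their leaders agree,
# so the test takes O(1) time.
same_component = leader[0] == leader[1]
```

Initially every vertex leads itself, so `same_component` is false for any two distinct vertices.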
Illustration
Merging Components
To merge components with leaders $u$ and $v$
- Choose the larger component’s leader to be the new leader (say $u$)
- Iterate over each vertex $x$ in $v$’s list and
  - add $x$ to $u$’s list
  - update leader[x] <- u
Running time: $O(\text{size of smaller component})$
- time per element is $O(1)$
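The merge step might look as follows in Python. This is a sketch, not the lecture's implementation: the function name `merge`, the dictionary `members`, and the return value (the number of relabeled vertices) are assumptions made for illustration.

```python
def merge(leader, members, a, b):
    """Merge the components containing vertices a and b.

    Returns the number of vertices whose leader was updated.
    """
    u, v = leader[a], leader[b]
    if u == v:
        return 0  # already in the same component
    if len(members[u]) < len(members[v]):
        u, v = v, u  # make u the leader of the larger component
    moved = members.pop(v)  # v's list disappears; v is no longer a leader
    for x in moved:  # O(1) work per element of the smaller component
        leader[x] = u
        members[u].append(x)
    return len(moved)


# Four singleton components on vertices 0..3.
leader = [0, 1, 2, 3]
members = {0: [0], 1: [1], 2: [2], 3: [3]}
merge(leader, members, 0, 1)              # tie: 0 stays leader, 1 is relabeled
relabeled = merge(leader, members, 2, 0)  # {2} is smaller, so only 2 is relabeled
```

In the second call the component of vertex 0 has two elements, so only the single vertex 2 is touched, matching the $O(\text{size of smaller component})$ bound.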
Simplistic Analysis
Consider a graph with $n$ vertices and $m$ edges.
- the for loop in Kruskal performs $m$ iterations, one per edge
- checking whether u and v are in different components takes $O(1)$ time: compare leader[u] and leader[v]
- each merge takes $O(n)$ time, since the smaller component has at most $n/2$ vertices
- if every iteration performed a merge, all merges together would take $O(m n)$ time
Fewer Merges
- there are initially $n$ components, one per vertex
- each merge reduces the number of components by one, so at most $n - 1$ merges are performed
- each merge takes $O(n)$ time, so all merges together take $O(n^2)$ time
Amortized Cost of Merges
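Consider the number of times each element’s leader is updated (that is, the element is relabeled).
Claim. If $x$ is relabeled $k$ times, then $x$’s component has size at least $2^k$.
- $x$ is relabeled only when its component is no larger than the one it is merged into, so each relabeling at least doubles the size of $x$’s component
Consequence 1. If $x$’s component has size $\ell$, then $x$ was relabeled at most $\log_2 \ell$ times.
Consequence 2. The running time of all merge operations in Kruskal is $O(n \log n)$.
- each of the $n$ vertices is relabeled at most $\log_2 n$ times, and each relabeling takes $O(1)$ time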
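The claim can be checked empirically. The sketch below merges randomly chosen components until one remains and counts how often each vertex is relabeled. The function name `count_relabels`, the random merge order, and the choice $n = 64$ are assumptions for illustration; this is an experiment, not a proof.

```python
import math
import random


def count_relabels(n, seed=0):
    """Merge random pairs of components until one remains.

    Returns (relabels, claim_holds): relabels[x] is the number of times
    x's leader changed, and claim_holds records whether every relabeled
    vertex ended each merge in a component of size >= 2 ** relabels[x].
    """
    rng = random.Random(seed)
    leader = list(range(n))
    members = {v: [v] for v in range(n)}
    relabels = [0] * n
    claim_holds = True
    while len(members) > 1:
        u, v = rng.sample(sorted(members), 2)  # two distinct leaders
        if len(members[u]) < len(members[v]):
            u, v = v, u  # u leads the larger component
        moved = members.pop(v)
        for x in moved:
            leader[x] = u
            relabels[x] += 1
            members[u].append(x)
        for x in moved:
            if len(members[u]) < 2 ** relabels[x]:
                claim_holds = False
    return relabels, claim_holds


n = 64
relabels, claim_holds = count_relabels(n)
total_work = sum(relabels)  # total number of leader updates over all merges
```

With $n = 64$, no vertex can be relabeled more than $\log_2 64 = 6$ times, so `total_work` is at most $64 \cdot 6 = 384$.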
Conclusion
Theorem. Kruskal’s algorithm can be implemented to run in time $O(m \log n)$ in graphs with $n$ vertices and $m$ edges.
- running time is dominated by sorting the edges in ascending order of weight, which takes $O(m \log m) = O(m \log n)$ time (since $m \le n^2$)
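Putting the pieces together, one possible Python rendering of Kruskal's algorithm with leader arrays is sketched below. The edge format `(weight, u, v)`, the vertex numbering 0 to n-1, and the example graph are assumptions made for illustration.

```python
def kruskal(n, edges):
    """Return the edges chosen by Kruskal's algorithm as (u, v, weight) triples.

    n     -- number of vertices, numbered 0..n-1
    edges -- list of (weight, u, v) tuples
    """
    leader = list(range(n))               # each vertex is its own component
    members = {v: [v] for v in range(n)}
    forest = []
    # Sorting dominates: O(m log m), which is O(m log n) for simple graphs.
    for w, a, b in sorted(edges):
        u, v = leader[a], leader[b]
        if u == v:
            continue                      # a and b are already connected
        forest.append((a, b, w))
        if len(members[u]) < len(members[v]):
            u, v = v, u                   # relabel the smaller component
        for x in members.pop(v):
            leader[x] = u
            members[u].append(x)
    return forest


# A small example graph on 4 vertices (made up for illustration).
edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 2, 3), (5, 1, 3)]
mst = kruskal(4, edges)
total_weight = sum(w for _, _, w in mst)
```

On this example the edges of weight 1, 2, and 3 are accepted and the edges of weight 4 and 5 are rejected because their endpoints already share a leader.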
Remark. More efficient data structures for merging sets exist
- “Union-find” ADT, “disjoint-set forest” data structure
- time to perform merges is $O(n \alpha(n))$
- $\alpha(n)$ is the “inverse Ackermann function”
- $\alpha(n)$ grows so slowly, it is practically constant
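For comparison, a disjoint-set forest can be sketched as follows. This is one common variant (union by size with path compression); the class name `DisjointSet` and its method names are assumptions, and the lecture does not specify which variant it has in mind.

```python
class DisjointSet:
    """Disjoint-set forest with union by size and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.size = [1] * n           # size[r] is meaningful only for roots r

    def find(self, x):
        """Return the root (leader) of x's set."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra           # attach the smaller tree under the larger
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


ds = DisjointSet(5)
ds.union(0, 1)
ds.union(1, 2)
connected = ds.find(0) == ds.find(2)
```

Unlike the leader-array approach, a union here changes a single parent pointer instead of relabeling every element of the smaller set; the cost moves into `find`, which path compression keeps small.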