Lecture 20: Minimum Spanning Trees, Part 3

COSC 311 Algorithms, Fall 2022

$ \def\compare{ {\mathrm{compare}} } \def\swap{ {\mathrm{swap}} } \def\sort{ {\mathrm{sort}} } \def\insert{ {\mathrm{insert}} } \def\true{ {\mathrm{true}} } \def\false{ {\mathrm{false}} } \def\BubbleSort{ {\mathrm{BubbleSort}} } \def\SelectionSort{ {\mathrm{SelectionSort}} } \def\Merge{ {\mathrm{Merge}} } \def\MergeSort{ {\mathrm{MergeSort}} } \def\QuickSort{ {\mathrm{QuickSort}} } \def\Split{ {\mathrm{Split}} } \def\Multiply{ {\mathrm{Multiply}} } \def\Add{ {\mathrm{Add}} } \def\cur{ {\mathrm{cur}} } \def\gets{ {\leftarrow} } $

Announcements

Masks still required in class
No class on Monday 10/24
HW 03, Question 1:
- $n = 2^B$
- array contains $0, 1, \ldots, n - 1$ (not in order)
- values represented as $B$ bit numbers
HW 03 now due Sunday

Last Time

Prim’s algorithm for Minimum Spanning Trees:

Grow a tree from an arbitrary seed vertex
Each step, add minimum weight edge out of tree

Cut Claim:

if $T$ an MST, $U, V - U$ a cut, $e$ min weight cut edge
then $T$ contains $e$

Prim correctness follows from cut claim

MSTs, Another Way

Prim:

Grow tree greedily from a single seed vertex
Maintain a (connected) tree

Edge Centric View:

Maintain a collection of edges (not necessarily a tree)
Add edges to collection to eventually build an MST

Questions:

How to prioritize edges?
How to determine whether or not to include an edge?

Picture

Kruskal’s Algorithm

  Kruskal(V, E, w):
    C <- collection of components
      initially, each vertex is own component
    F <- empty collection
    # iterate in order of increasing weight
    for each edge e = (u, v) in E 
      if u and v are in different components then
        add (u, v) to F
        merge components containing u and v
      endif
    endfor
    return F

Kruskal Illustration

Kruskal Correctness I

Claim 1. Every edge added by Kruskal must be in every MST.

Why?

Suppose $e = (u, v)$ added by Kruskal
Consider the cut $U, V - U$ where $U$ is $u$’s component
$e$ is lightest edge across the cut (why?)
therefore $e$ must be in MST (why?)

Kruskal Correctness II

Claim 2. Kruskal produces a spanning tree.

Why?

edges added by Kruskal do not contain cycles (why?)
edges added by Kruskal connect graph (why?)

Conclusion

Theorem. Kruskal’s algorithm produces an MST.

Next Question. How could we implement Kruskal’s algorithm efficiently? What is its running time?

Kruskal’s Algorithm

  Kruskal(V, E, w):
    C <- collection of components
      initially, each vertex is own component
    F <- empty collection
    # iterate in order of increasing weight
    for each edge e = (u, v) in E 
      if u and v are in different components then
        add (u, v) to F
        merge components containing u and v
      endif
    endfor
    return F

Costly Operations

Get edges in order of increasing weight
Determine if $u$ and $v$ are in same component
Merge two components

Question. How to get edges in order of increasing weight?

Maintaining Components

Idea. For each component, designate a leader

leader is a vertex in its component
maintain an array that stores each vertex’s component’s leader
- leader[i] = v means that v is leader of i’s component

Question. How to check if vertices i and j are in the same component? Running time?

Merging Components

Question. How to merge two components?

Merging Components Efficiently?

For each leader, maintain list of vertices in its component.

To merge components with leaders $u$ and $v$:

choose $u$ or $v$ to be leader of merged component
- how?
if $u$ is new leader
- for each vertex $x$ on $v$’s list
  - add $x$ to $u$’s list
  - set $x$’s leader to $v$

Question. Running time?

Merging Strategy

When merging components with leaders $u$ and $v$, new leader is leader of larger component

Claim. If $x$ is relabeled $k$ times, then $x$’s component has size at least $2^k$.

Consequence

Claim. If $x$ is relabeled $k$ times, then $x$’s component has size at least $2^k$.

Consequence 1. If $x$’s component has size $\ell$, then $x$ was relabeled at most $\log \ell$ times.

Consequence 2. Running time of all merge operations in Kruskal is $O(n \log n)$

Conclusion

Theorem. Kruskal’s algorithm can be implemented to run in time $O(m \log n)$ in graphs with $n$ vertices and $m$ edges.

Remark. More efficient data structures for merging sets exist

“Union-find” ADT, “disjoint-set forest” data structure
time to perform merges is $O(n \alpha(n))$
- $\alpha(n)$ is “inverse Ackerman function”
- $\alpha(n)$ grows so slowly, it is practically constant

Next Time

Interval Scheduling (recorded lecture)