Lecture 16: Balance and Heaps

Announcement

No programming assignment this week!

Instead: short take-home quiz on binary search trees

  • released Wednesday, to be completed by Friday
  • start anytime, but there is a 90-minute time limit
    • should take much less than 90 minutes

Overview

  1. Cliffhanger Resolution
  2. AVL Trees, Empirically
  3. Binary Heaps

Last Time

  • AVL Trees
    • height balanced trees
    • $h(T) = O(\log n) \implies$ add/remove/find in $O(\log n)$ time
    • restore balance after add/remove operation in $O(\log n)$ time?
  • Restructure operation after add to restore balance
  • Restructure operation after remove?

Element/Node Removal

Case 1: Leaf

Case 2: Single-child

Case 3: Two-child

How to Restructure After Removal?

Cliffhanger Example

Cliffhanger Resolution

  • $z$ is the deepest unbalanced node
  • $y$ is $z$’s taller child (on the side away from the removed node)
  • $x$ is $y$’s taller child; if $y$’s children have the same height:
    • if $y$ is $z$’s left child, choose $x$ to be $y$’s left child
    • if $y$ is $z$’s right child, choose $x$ to be $y$’s right child
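
The selection rule above can be sketched in a few lines of Python. This is only an illustration, not the course's implementation: the `Node` class, the `height` helper, and the name `choose_y_x` are hypothetical, and the sketch assumes an empty subtree has height -1.

```python
def height(node):
    # Assumed convention: an empty subtree has height -1, a leaf has height 0.
    return -1 if node is None else node.height


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.height = 1 + max(height(left), height(right))


def choose_y_x(z):
    """Given the deepest unbalanced node z, return the pair (y, x)."""
    # y is z's taller child (the side away from the removed node).
    y = z.left if height(z.left) > height(z.right) else z.right
    if height(y.left) > height(y.right):
        x = y.left
    elif height(y.right) > height(y.left):
        x = y.right
    else:
        # Tie: pick x on the same side as y, so a single rotation suffices.
        x = y.left if y is z.left else y.right
    return y, x


# z = 10 lost a node on its left, so its right subtree is now too tall.
z = Node(10, None, Node(20, Node(15), Node(30)))
y, x = choose_y_x(z)
print(y.key, x.key)
```

Here $y$'s children have the same height and $y$ is $z$'s right child, so the tie rule selects $y$'s right child as $x$.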

Question

What is height of root after restructuring?

Issue

Restructuring may again cause imbalance! Must continue upwards.

Removal

After remove(x), iterate upwards over the removed node’s ancestors:

  1. re-compute height and check balance
  2. if unbalanced
    • perform restructure
    • update heights
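
A minimal sketch of this procedure, under stated assumptions, might look as follows. It is not the lecture's implementation: all names are hypothetical, an empty subtree is assumed to have height -1, the two-child case is assumed to be handled by copying the in-order successor, and the upward iteration over ancestors is realized by the recursion unwinding (each ancestor calls `rebalance` on the way back up).

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 0


def height(n):
    return -1 if n is None else n.height


def update(n):
    n.height = 1 + max(height(n.left), height(n.right))


def balance(n):
    return height(n.left) - height(n.right)


def rotate_right(z):
    y = z.left
    z.left, y.right = y.right, z
    update(z)
    update(y)
    return y


def rotate_left(z):
    y = z.right
    z.right, y.left = y.left, z
    update(z)
    update(y)
    return y


def rebalance(z):
    """Re-compute z's height; restructure if z is unbalanced."""
    update(z)
    if balance(z) > 1:                 # left side too tall: y = z.left
        if balance(z.left) < 0:        # x is y's right child: double rotation
            z.left = rotate_left(z.left)
        return rotate_right(z)         # a tie falls through to a single rotation
    if balance(z) < -1:                # right side too tall: y = z.right
        if balance(z.right) > 0:
            z.right = rotate_right(z.right)
        return rotate_left(z)
    return z


def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return rebalance(root)


def remove(root, key):
    if root is None:
        return None
    if key < root.key:
        root.left = remove(root.left, key)
    elif key > root.key:
        root.right = remove(root.right, key)
    elif root.left is None:            # leaf or single (right) child
        return root.right
    elif root.right is None:           # single (left) child
        return root.left
    else:                              # two children: copy in-order successor
        succ = root.right
        while succ.left is not None:
            succ = succ.left
        root.key = succ.key
        root.right = remove(root.right, succ.key)
    return rebalance(root)             # runs once per ancestor, bottom-up


def is_avl(n):
    """Check stored heights and the height-balance condition at every node."""
    if n is None:
        return True
    return (abs(balance(n)) <= 1
            and n.height == 1 + max(height(n.left), height(n.right))
            and is_avl(n.left) and is_avl(n.right))


root = None
for k in range(1, 32):
    root = insert(root, k)
for k in range(1, 16):
    root = remove(root, k)
print(is_avl(root), root.height)
```

Each call to `rebalance` does a constant amount of work, and it is called once per node on the path from the removed node to the root.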

What is running time?

Conclusion

We can modify add/remove such that

  1. operations restore AVL property if tree was initially AVL tree
  2. operations still run in $O(h) = O(\log n)$ time

So

  • All add/remove/find operations happen in $O(\log n)$ time!

Testing an Implementation

Compare performance of BST (as in Assignment 06) and AVL tree implementations

Tests:

  1. add randomly chosen elements
  2. add words of Shakespeare (~1M words)
  3. add words from dictionary (10k words, alphabetical)

Test 1: Adding Random Elements

Time to add elements (ms):
  BST: 770
  AVL: 601
Heights:
  BST: 45
  AVL: 22
Time to find elements (ms):
  BST: 120
  AVL: 88
Time to remove elements (ms):
  BST: 89
  AVL: 137

Test 2: Adding Shakespeare

Time to add Shakespeare's vocabulary (ms):
  BST: 287
  AVL: 246
Heights:
  BST: 36
  AVL: 17

Test 3: Adding Words from Dictionary

Time to add words from dictionary (ms):
  BST: 639
  AVL: 2
Heights:
  BST: 9999
  AVL: 13
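
The BST height in Test 3 can be reproduced at small scale. The sketch below is not the benchmark code used for the numbers above; it is a hypothetical stand-in that inserts keys into an unbalanced BST (nodes stored as small lists) and reports the resulting height, counting edges on the longest root-to-leaf path.

```python
import random


def bst_height_after_inserts(keys):
    """Insert keys into an unbalanced BST and return its height in edges."""
    root = None            # each node is a list: [key, left, right]
    tallest = -1           # height of the empty tree
    for key in keys:
        node = [key, None, None]
        if root is None:
            root = node
            tallest = 0
            continue
        cur, depth = root, 0
        while key != cur[0]:               # duplicates are ignored
            side = 1 if key < cur[0] else 2
            depth += 1
            if cur[side] is None:
                cur[side] = node
                tallest = max(tallest, depth)
                break
            cur = cur[side]
    return tallest


n = 1000
sorted_keys = list(range(n))
shuffled_keys = sorted_keys[:]
random.seed(0)
random.shuffle(shuffled_keys)

print("sorted order:  ", bst_height_after_inserts(sorted_keys))
print("shuffled order:", bst_height_after_inserts(shuffled_keys))
```

With keys added in sorted order every new node becomes the right child of the previous one, so the height is $n - 1$, consistent with the height of 9999 reported for the 10k-word dictionary.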

Interpretations of Running Times?

(Dis)advantages of BST vs AVL tree?

Conclusions

2 Ingredients:

  1. Binary search trees
    • restriction on values
  2. Balanced binary trees
    • restriction on tree structure

AVL trees give efficient worst-case performance

  • add, remove, find all in $O(\log n)$ time

Relative performance of BST and AVL trees depends on usage

  • often an unbalanced BST is sufficient
  • AVL is sometimes exponentially faster per operation (e.g., adding in sorted order: $O(\log n)$ vs. $O(n)$)

Binary Heaps

Another Tree Representation:

  • Binary Heaps

Goal. Implement a priority queue with $O(\log n)$-time operations

Exercise. How could this goal be achieved with an AVL tree?

Binary Heap Structure

  1. Binary tree stores comparable elements
    • comparison by priority
  2. Heap property (restriction on values)
    • children never store smaller values than their parent
  3. Tree structure: complete binary tree
    • all leaves have (almost) the same depth
    • very restrictive!

Goal. Use & maintain these properties for an efficient implementation of a priority queue:

  • add(x, p)
  • min()
  • removeMin()
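
To make the target interface concrete, here is a small sketch built on Python's standard `heapq` module, which maintains a binary heap inside a list. The class name `PriorityQueue`, the method name `remove_min`, and the insertion counter used to break ties are assumptions made for illustration; the lecture's own implementation is developed in the following slides.

```python
import heapq


class PriorityQueue:
    def __init__(self):
        self._heap = []
        self._count = 0    # tie-breaker: equal priorities come out in insertion order

    def add(self, x, p):
        """Add element x with priority p (smaller p means higher priority)."""
        heapq.heappush(self._heap, (p, self._count, x))
        self._count += 1

    def min(self):
        """Return the element with the smallest priority without removing it."""
        return self._heap[0][2]

    def remove_min(self):
        """Remove and return the element with the smallest priority."""
        return heapq.heappop(self._heap)[2]


pq = PriorityQueue()
pq.add("write report", 3)
pq.add("fix bug", 1)
pq.add("reply to email", 2)
print(pq.min())
print(pq.remove_min())
print(pq.remove_min())
```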

Heap Property

$T$ a binary tree

  • each node stores a comparable element, $v$

$T$ has the heap property if for every node storing value $v$ with children $u$ and $w$, we have $v \leq u$ and $v \leq w$.
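
The definition translates directly into a recursive check. The sketch below assumes a hypothetical linked `Node` class with `value`, `left`, and `right` fields; a missing child imposes no constraint.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def has_heap_property(node):
    """Return True if no child in the tree stores a smaller value than its parent."""
    if node is None:
        return True
    for child in (node.left, node.right):
        if child is not None and child.value < node.value:
            return False
    return has_heap_property(node.left) and has_heap_property(node.right)


good = Node(1, Node(3, Node(5), Node(4)), Node(2))
bad = Node(1, Node(3, Node(2), None), Node(6))   # 2 sits below 3
print(has_heap_property(good), has_heap_property(bad))
```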

Question

If $T$ has the heap property, what can we say about the value of the root?

Complete Binary Tree, In Pictures

Complete Binary Tree, Formally

$T$ is a complete binary tree of depth $D$ if:

  1. every node at depth $d \leq D - 2$ has 2 children
  2. if $v$ is at depth $D - 1$ and $v$ has a child, then every depth $D-1$ node to the left of $v$ has $2$ children
  3. if $v$ is at depth $D-1$ and $v$ has fewer than $2$ children, then every depth $D-1$ node to the right of $v$ has no children
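
One way to test these conditions in code is a level-order traversal: visiting positions top to bottom and left to right, a tree is complete exactly when no node appears after the first empty position. This is a standard check rather than anything specific to the lecture; the `Node` class and function name are hypothetical.

```python
from collections import deque


class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def is_complete(root):
    """Level-order scan: once an empty position is seen, no node may follow."""
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        else:
            if seen_gap:
                return False
            queue.append(node.left)
            queue.append(node.right)
    return True


complete = Node(1, Node(2, Node(4), Node(5)), Node(3))
gap_on_left = Node(1, Node(2), Node(3, Node(6)))        # violates condition 2
right_only = Node(1, Node(2, None, Node(5)), Node(3))   # right child without a left child
print(is_complete(complete), is_complete(gap_on_left), is_complete(right_only))
```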

Question 1

If $T$ is a complete binary tree, where can we add a node to maintain completeness?

Question 2

If $T$ is a complete binary tree, what nodes can we remove and maintain completeness?

Question 3

If $T$ is a complete binary tree with $n$ nodes, what is its depth?

Binary Heaps

A binary tree $T$ storing comparable elements is a binary heap if:

  1. $T$ satisfies the heap property, and
  2. $T$ is a complete binary tree

Heap Priority Queue: min

Given a binary heap $T$, how can we implement min()?

Heap Priority Queue: add

Given a binary heap $T$, how can we add an element?

“Bubble Up” Procedure

  1. Add element at unique location where a node can be added, w
  2. Repeat
    • if w < w.parent, swap w and w.parent
    • else break
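
A minimal sketch of bubble up, assuming the heap is stored in a Python list in level order with the root at index 0 (so the parent of index `w` is `(w - 1) // 2`). The storage layout and the function name `heap_add` are assumptions for illustration; the loop also stops when `w` reaches the root, which has no parent.

```python
def heap_add(heap, value):
    """Add value to a list-based binary min-heap using bubble up."""
    heap.append(value)            # step 1: the only position that keeps the tree complete
    w = len(heap) - 1
    while w > 0:                  # the root has no parent to compare against
        parent = (w - 1) // 2
        if heap[w] < heap[parent]:
            heap[w], heap[parent] = heap[parent], heap[w]
            w = parent
        else:
            break


heap = []
for value in [5, 3, 8, 1]:
    heap_add(heap, value)
print(heap)
```

After the last call, 1 is swapped first with 5 and then with 3, ending at the root.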

Why Does Bubble Up Work?

  1. Add element at unique location where a node can be added, w
  2. Repeat
    • if w < w.parent, swap w and w.parent
    • else break

What is Bubble Up Running Time?

  1. Add element at unique location where a node can be added, w
  2. Repeat
    • if w < w.parent, swap w and w.parent
    • else break

Heap Priority Queue: removeMin

Given a binary heap $T$, how can we implement removeMin()?

“Trickle Down” Procedure

  1. Copy value from unique removable leaf to root and remove leaf
  2. Set w to root
  3. Repeat:
    • if w > some child, v = smaller child
      • swap v and w values
    • else break
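
A minimal sketch of trickle down, again assuming a list-based heap in level order with the root at index 0 (children of index `w` at `2 * w + 1` and `2 * w + 2`). The function name `heap_remove_min` is hypothetical, and the sketch assumes the heap is non-empty.

```python
def heap_remove_min(heap):
    """Remove and return the minimum of a non-empty list-based binary min-heap."""
    smallest = heap[0]
    last = heap.pop()                 # remove the unique removable leaf
    if heap:
        heap[0] = last                # copy its value to the root
        w, n = 0, len(heap)
        while True:
            left, right = 2 * w + 1, 2 * w + 2
            v = w                     # index of the smallest among w and its children
            if left < n and heap[left] < heap[v]:
                v = left
            if right < n and heap[right] < heap[v]:
                v = right
            if v == w:                # w is no larger than its children: done
                break
            heap[w], heap[v] = heap[v], heap[w]
            w = v
    return smallest


heap = [1, 3, 8, 5]
first = heap_remove_min(heap)
print(first, heap)
```

Swapping with the smaller child matters: it guarantees the value moved up is no larger than its new sibling.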

Why does “Trickle Down” Work?

  1. Set w to root
  2. Repeat
    • if w > child, v = smaller child
      • swap v and w values
    • else break

“Trickle Down” Running Time?

  1. Set w to root
  2. Repeat
    • if w > child, v = smaller child
      • swap v and w values
    • else break

Representing Complete Binary Trees

Previously:

  • trees represented as linked nodes

Complete binary trees have much more predictable structure

  • can use an array to store complete binary trees efficiently!

Nodes vs Array Index

Question 1

For an index $i$, what is the index of $i$’s left child? Right child?

Question 2

For an index $i$, what is the index of $i$’s parent?
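
One common convention, assumed here, stores the nodes in level order with the root at index 0. Under that assumption the index arithmetic is as sketched below; a layout with the root at index 1 gives slightly different formulas. The function names are hypothetical.

```python
def left(i):
    return 2 * i + 1


def right(i):
    return 2 * i + 2


def parent(i):
    # Only meaningful for i > 0; the root (index 0) has no parent.
    return (i - 1) // 2


# Level-order storage of the tree with root 'a', children 'b' and 'c',
# and 'd', 'e' as the children of 'b'.
tree = ['a', 'b', 'c', 'd', 'e']
i = 1                                  # the node 'b'
print(tree[left(i)], tree[right(i)], tree[parent(i)])
```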

Question 3

Why didn’t we use arrays to represent (non-complete) binary trees?

Next Assignment

Implement a priority queue using a binary heap!