Lecture 15: AVL Trees and Heaps

Announcement

No programming assignment this week!

Instead: short take-home quiz on binary search trees

  • released Wednesday, to be completed by Friday
  • start anytime, but 90-minute time limit
    • should be much shorter than 90 minutes

Overview

  1. Recap of AVL Trees
  2. Rebalancing add/remove
  3. Binary Heaps

Back to AVL Trees

Goal

Implement a sorted set (SimpleSSet) with efficient operations:

  • find
  • add
  • remove

Previous best: sorted array with binary search

  • find in $O(\log n)$ time
  • add/remove in $O(n)$ time
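
For reference, the sorted-array baseline can be sketched as follows. This is a minimal illustration, not the course's SimpleSSet code: the class name `SortedArraySet` and the field `items` are assumptions, and Python's standard `bisect` module stands in for a hand-written binary search.

```python
from bisect import bisect_left


class SortedArraySet:
    """Sorted set backed by a list kept in increasing order (illustrative)."""

    def __init__(self):
        self.items = []

    def find(self, x):
        # Binary search: O(log n) comparisons.
        i = bisect_left(self.items, x)
        return i < len(self.items) and self.items[i] == x

    def add(self, x):
        i = bisect_left(self.items, x)
        if i < len(self.items) and self.items[i] == x:
            return False  # already present, sets hold no duplicates
        self.items.insert(i, x)  # may shift up to n elements: O(n)
        return True

    def remove(self, x):
        i = bisect_left(self.items, x)
        if i == len(self.items) or self.items[i] != x:
            return False  # not present
        self.items.pop(i)  # may shift up to n elements: O(n)
        return True
```

Locating the position is fast in all three operations; the $O(n)$ cost of add/remove comes entirely from shifting array elements.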

Last Time

Introduced AVL trees

  • $T$ has AVL property if for every node $v$ with children $u$ and $w$, we have

    $\vert h(u) - h(w) \vert \leq 1$
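
The definition can be checked directly in code. The sketch below assumes a missing child counts as an empty subtree of height $-1$ (a convention, not stated on the slide); `Node`, `height`, and `has_avl_property` are illustrative names.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def height(node):
    """Height of the subtree rooted at node; an empty subtree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def has_avl_property(node):
    """True if at every node the two child subtrees differ in height by at most 1."""
    if node is None:
        return True
    balanced_here = abs(height(node.left) - height(node.right)) <= 1
    return (balanced_here
            and has_avl_property(node.left)
            and has_avl_property(node.right))
```

This version recomputes heights from scratch at every node, so it is only meant to mirror the definition; an efficient implementation stores heights in the nodes.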

We showed:

  • if $T$ is an AVL tree with $n$ nodes, then $h(T) = O(\log n)$
    • $\implies$ add/remove/find take $O(\log n)$ time

But:

  • add/remove as previously implemented may destroy AVL property

Our Strategy:

  1. perform add/remove as before
  2. check if AVL property is maintained
  3. if not, fix it

Maintaining AVL Property

What happens if we add(11)?

Questions

If we add a new node as before, it is always a leaf.

  • Which nodes could become unbalanced?

    • only ancestors of the added node
    • there are $O(\log n)$ of these
  • How can we check for imbalance?

    • store height for each node
    • update height of new node’s ancestors
    • check each for imbalance

All this takes $O(\log n)$ time if $T$ is AVL tree (before add)
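
The bookkeeping described above might look like this. It is a sketch under assumptions: nodes carry a `parent` pointer and a stored `height` field, an empty subtree has height $-1$, and the names `add_leaf` and `unbalanced_ancestors` are made up for illustration.

```python
class Node:
    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent
        self.left = None
        self.right = None
        self.height = 0  # a leaf has height 0


def h(node):
    """Stored height, with -1 for an empty subtree."""
    return -1 if node is None else node.height


def add_leaf(root, value):
    """Plain BST insertion. Returns (root, new_leaf); new_leaf is None for duplicates."""
    if root is None:
        leaf = Node(value)
        return leaf, leaf
    cur = root
    while True:
        if value == cur.value:
            return root, None
        side = 'left' if value < cur.value else 'right'
        child = getattr(cur, side)
        if child is None:
            leaf = Node(value, parent=cur)
            setattr(cur, side, leaf)
            return root, leaf
        cur = child


def unbalanced_ancestors(leaf):
    """Walk from the new leaf up to the root, refreshing stored heights.

    Returns the unbalanced ancestors, deepest first.
    """
    found = []
    node = leaf.parent
    while node is not None:
        node.height = 1 + max(h(node.left), h(node.right))
        if abs(h(node.left) - h(node.right)) > 1:
            found.append(node)
        node = node.parent
    return found
```

Each step of the upward walk does constant work, so the total cost is proportional to the depth of the new leaf.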

Restoring Balance after add

Suppose $T$ becomes unbalanced after add

  • $w$ is new node added
  • $z$ is $w$’s deepest unbalanced ancestor
  • $y$ is $z$’s child towards $w$
  • $x$ is $y$’s child towards $w$

Note: 4 possible relative orders of $x, y, z$

Picture with Sub-trees

Idea

  • $z$ must be either the largest or smallest of the three values (why?)
  • $y$ must be $z$’s higher child (why?)
  • $x$ must be $y$’s higher child (why?)

So: restructure tree to move $x$ up

  • middle value of $x, y, z$ becomes root of sub-tree
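
One way to write the restructuring step is sketched below. The labels `a, b, c` (the three nodes in sorted order) and `t0..t3` (their four sub-trees from left to right) are introduced here for illustration and do not come from the slides; the function handles all 4 relative orders of $x, y, z$.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def restructure(x, y, z):
    """Rearrange z, its child y, and y's child x so the middle value is on top.

    Returns the new root of the sub-tree; the caller must attach it
    where z used to hang.
    """
    if y is z.left and x is y.left:        # x < y < z
        a, b, c = x, y, z
        t0, t1, t2, t3 = x.left, x.right, y.right, z.right
    elif y is z.left and x is y.right:     # y < x < z
        a, b, c = y, x, z
        t0, t1, t2, t3 = y.left, x.left, x.right, z.right
    elif y is z.right and x is y.right:    # z < y < x
        a, b, c = z, y, x
        t0, t1, t2, t3 = z.left, y.left, x.left, x.right
    else:                                  # z < x < y
        a, b, c = z, x, y
        t0, t1, t2, t3 = z.left, x.left, x.right, y.right
    # b (the middle value) becomes the root, with a and c as its children.
    a.left, a.right = t0, t1
    c.left, c.right = t2, t3
    b.left, b.right = a, c
    return b
```

Because the sub-trees `t0..t3` are reattached in their original left-to-right order, the in-order sequence of values is unchanged.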

Picture of Restructuring

Observations

Suppose $z$ became unbalanced after add(w)

  • $z$’s previous height was $h$
  • $z$’s new height is $h+1$
  • $z$’s other child has height $h-2$ (since $y$ now has height $h$ and $z$ is unbalanced)

What is root’s height after restructuring?

Question 1

Why does restructuring restore balance of subtree?

Question 2

Why does restructuring maintain BST property?

Question 3

Does restructuring make tree balanced?

Rebalancing Procedure

After add(w), iterate over $w$’s ancestors from $w$ upwards:

  1. re-compute height and check balance
  2. if unbalanced
    • perform restructure
    • update heights

What is running time?

4 Cases

Question

How do we remove nodes?

Remove Procedure

Case 1: Leaf

Case 2: Single-child

Case 3: Two-child

Which cases modify tree structure?
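
The three cases can be sketched for a plain (unbalanced) BST as follows. One assumption to flag: in the two-child case this sketch copies the in-order successor's value into the node and then removes the successor; using the predecessor instead would work equally well, and the slides do not say which is intended.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def remove(root, value):
    """Remove value from the BST rooted at root; return the new root."""
    if root is None:
        return None                           # value not present
    if value < root.value:
        root.left = remove(root.left, value)
    elif value > root.value:
        root.right = remove(root.right, value)
    elif root.left is None and root.right is None:
        return None                           # Case 1: leaf, just drop it
    elif root.left is None:
        return root.right                     # Case 2: single child takes its place
    elif root.right is None:
        return root.left                      # Case 2 (mirror image)
    else:
        succ = root.right                     # Case 3: find smallest value on the right
        while succ.left is not None:
            succ = succ.left
        root.value = succ.value               # overwrite, then remove the successor node
        root.right = remove(root.right, succ.value)
    return root


def inorder(n):
    return [] if n is None else inorder(n.left) + [n.value] + inorder(n.right)
```

In Case 3 the node that physically disappears is the successor, which has no left child, so the structural change always reduces to Case 1 or Case 2.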

How to Restructure After Removal?

Similar Picture to Before

$x, y, z$ redefined; same restructuring

Question

What is height of root after restructuring?

Issue

Restructuring may again cause imbalance! Must continue upwards.

Removal

After remove(x), iterate over removed node’s ancestors upwards:

  1. re-compute height and check balance
  2. if unbalanced
    • perform restructure
    • update heights
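
A sketch of removal with rebalancing, under the same assumptions as any stored-height implementation (empty subtree has height $-1$; all names are illustrative). The recursion rebalances every ancestor of the removed node as the calls return, which is what "continue upwards" requires. The helper `build` only exists to create a balanced starting tree for experiments.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.height = 0


def h(n):
    return -1 if n is None else n.height


def update(n):
    n.height = 1 + max(h(n.left), h(n.right))


def rotate(z, to_right):
    """Single rotation at z; z's child on the heavy side becomes the sub-tree root."""
    if to_right:
        y = z.left
        z.left, y.right = y.right, z
    else:
        y = z.right
        z.right, y.left = y.left, z
    update(z)
    update(y)
    return y


def rebalance(z):
    update(z)
    if h(z.left) - h(z.right) > 1:
        if h(z.left.right) > h(z.left.left):    # ties use a single rotation
            z.left = rotate(z.left, to_right=False)
        return rotate(z, to_right=True)
    if h(z.right) - h(z.left) > 1:
        if h(z.right.left) > h(z.right.right):
            z.right = rotate(z.right, to_right=True)
        return rotate(z, to_right=False)
    return z


def remove(root, value):
    if root is None:
        return None
    if value < root.value:
        root.left = remove(root.left, value)
    elif value > root.value:
        root.right = remove(root.right, value)
    elif root.left is None:
        return root.right                       # leaf or single (right) child
    elif root.right is None:
        return root.left                        # single (left) child
    else:
        succ = root.right                       # two children: use in-order successor
        while succ.left is not None:
            succ = succ.left
        root.value = succ.value
        root.right = remove(root.right, succ.value)
    return rebalance(root)                      # runs once per ancestor, bottom-up


def build(values):
    """Balanced tree from a sorted list (test helper only)."""
    if not values:
        return None
    mid = len(values) // 2
    node = Node(values[mid])
    node.left, node.right = build(values[:mid]), build(values[mid + 1:])
    update(node)
    return node


def is_avl(n):
    return n is None or (abs(h(n.left) - h(n.right)) <= 1
                         and is_avl(n.left) and is_avl(n.right))


def inorder(n):
    return [] if n is None else inorder(n.left) + [n.value] + inorder(n.right)
```

Unlike after add, a restructure here can shrink the sub-tree's height, so the check in `rebalance` has to run at every ancestor rather than stopping after the first fix.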

What is running time?

Conclusion

We can modify add/remove such that

  1. operations restore AVL property if tree was initially AVL tree
  2. operations still run in $O(h) = O(\log n)$ time

So

  • All add/remove/find operations happen in $O(\log n)$ time!

Testing an Implementation

Compare performance of BST (as in Assignment 06) and AVL tree implementation

Tests:

  1. add randomly chosen elements
  2. add words of Shakespeare (~1M words)
  3. add words from dictionary (alphabetical)
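
To see why insertion order matters so much for the unbalanced BST, one can measure the height it reaches for a given sequence of adds. The helper below is a hypothetical sketch and not the test harness used for the numbers in this lecture; it stores child links in two dictionaries, assumes hashable values, and uses the height $-1$ convention for an empty tree.

```python
def bst_height(values):
    """Height of the unbalanced BST obtained by adding values in the given order."""
    left, right = {}, {}          # value -> left child value / right child value
    root, height = None, -1
    for v in values:
        if root is None:
            root, height = v, 0
            continue
        cur, depth = root, 0
        while v != cur:           # stops early if v is a duplicate
            children = left if v < cur else right
            if cur not in children:
                children[cur] = v  # attach v as a new leaf
            cur, depth = children[cur], depth + 1
        height = max(height, depth)
    return height
```

Adding $n$ distinct values in sorted order produces a chain of height $n - 1$, so each later add walks the whole chain.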

Test 1: Adding Random Elements

Time to add elements (ms):
  BST: 770
  AVL: 601
Heights:
  BST: 45
  AVL: 22
Time to find elements (ms):
  BST: 120
  AVL: 88
Time to remove elements (ms):
  BST: 89
  AVL: 137

Test 2: Adding Shakespeare

Time to add Shakespeare's vocabulary (ms):
  BST: 287
  AVL: 246
Heights:
  BST: 36
  AVL: 17

Test 3: Adding Words from Dictionary

Time to add words from dictionary (ms):
  BST: 639
  AVL: 2
Heights:
  BST: 9999
  AVL: 13

Interpretations of Running Times?

(Dis)advantages of BST vs AVL tree?

Conclusions

2 Ingredients:

  1. Binary search trees
    • restriction on values
  2. Balanced binary trees
    • restriction on tree structure

AVL trees give efficient worst-case performance

  • add, remove, find all in $O(\log n)$ time

Relative performance of BST vs AVL trees depends on usage

  • often unbalanced BST is sufficient
  • AVL can be exponentially faster per operation (e.g., after adding in sorted order: $O(\log n)$ vs. $O(n)$)

Binary Heaps

Another Tree Representation:

  • Binary Heaps

Goal. Implement a priority queue with $O(\log n)$-time operations

Exercise. How could this goal be achieved with an AVL tree?

Binary Heap Structure

  1. Binary tree stores comparable elements
  2. Heap property (restriction on values)
    • children always store values at least as large as their parent's
  3. Tree structure: complete binary tree
    • all leaves have (almost) same depth
    • very restrictive!

Goal. Use & maintain these properties for an efficient implementation of a priority queue:

  • add(x)
  • min()
  • removeMin()

Heap Property

$T$ a binary tree

  • each node stores a comparable element, $v$

$T$ has the heap property if for every node storing value $v$ with children $u$ and $w$, we have $v \leq u$ and $v \leq w$.
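
As a direct translation of this definition, the following sketch checks the heap property on a linked binary tree. `Node` and `has_heap_property` are illustrative names, and a missing child is simply skipped.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def has_heap_property(node):
    """True if no node stores a value larger than either of its children."""
    if node is None:
        return True
    for child in (node.left, node.right):
        if child is not None and child.value < node.value:
            return False
    return has_heap_property(node.left) and has_heap_property(node.right)
```

Note that the property compares only parents with children; siblings may appear in either order.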

Question

If $T$ has the heap property, what can we say about the value of the root?

Complete Binary Tree, In Pictures

Complete Binary Tree, Formally

$T$ is a complete binary tree of depth $D$ if:

  1. every node at depth $d \leq D - 2$ has 2 children
  2. if $v$ is at depth $D - 1$ and $v$ has a child, then every depth $D-1$ node to the left of $v$ has $2$ children
  3. if $v$ is at depth $D-1$ and $v$ has fewer than $2$ children, then every depth $D-1$ node to the right of $v$ has no children
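
One common way to test completeness is a level-order scan: once a missing child has been seen, no further node may appear. The sketch below makes one assumption beyond the three conditions above, namely that a node with a single child has it on the left (the usual convention). The names are illustrative.

```python
from collections import deque


class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def is_complete(root):
    """Level-order scan that fails if a node appears after an empty slot."""
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True          # first empty slot in level order
        else:
            if seen_gap:
                return False         # a node to the right of, or below, a gap
            queue.append(node.left)
            queue.append(node.right)
    return True
```

The scan visits slots level by level and left to right, so the first gap marks where the tree must end.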

Question 1

If $T$ is a complete binary tree, where can we add a node to maintain completeness?

Question 2

If $T$ is a complete binary tree, what nodes can we remove and maintain completeness?

Binary Heaps

A binary tree $T$ storing comparable elements is a binary heap if:

  1. $T$ satisfies the heap property, and
  2. $T$ is a complete binary tree

Heap Priority Queue: min

Given a binary heap $T$, how can we implement min()?

Heap Priority Queue: add

Given a binary heap $T$, how can we add an element?

Heap Priority Queue: removeMin

Given a binary heap $T$, how can we removeMin?
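
One standard way to answer the three questions above is sketched here. It assumes the complete binary tree is stored in an array in level order, so the node at index $i$ has children at indices $2i+1$ and $2i+2$ and its parent at index $(i-1)/2$ rounded down; this representation is an assumption of the sketch and has not been introduced in these slides. The class name `BinaryHeap` and field `a` are illustrative, and `remove_min` plays the role of removeMin().

```python
class BinaryHeap:
    """Min-heap stored in a list in level order (illustrative sketch)."""

    def __init__(self):
        self.a = []

    def min(self):
        # The heap property puts the smallest value at the root.
        return self.a[0]  # raises IndexError if the heap is empty

    def add(self, x):
        a = self.a
        a.append(x)                   # next free slot keeps the tree complete
        i = len(a) - 1
        while i > 0 and a[i] < a[(i - 1) // 2]:
            p = (i - 1) // 2          # swap upwards until the parent is not larger
            a[i], a[p] = a[p], a[i]
            i = p

    def remove_min(self):
        a = self.a
        smallest = a[0]               # raises IndexError if the heap is empty
        last = a.pop()                # removing the last slot keeps completeness
        if a:
            a[0] = last               # move last value to the root, then sink it
            i = 0
            while True:
                left, right, m = 2 * i + 1, 2 * i + 2, i
                if left < len(a) and a[left] < a[m]:
                    m = left
                if right < len(a) and a[right] < a[m]:
                    m = right
                if m == i:
                    break             # both children are at least as large
                a[i], a[m] = a[m], a[i]
                i = m
        return smallest
```

Both loops move along a single root-to-leaf path, so add and removeMin take time proportional to the height of the tree, which is $O(\log n)$ for a complete binary tree; min takes constant time.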