Lecture 15: AVL Trees and Heaps

Announcement

No programming assignment this week!

Instead: short take-home quiz on binary search trees

• released Wednesday, to be completed by Friday
• start anytime, but 90 minute time limit
• should be much shorter than 90 minutes

Overview

1. Recap of AVL Trees
2. Rebalancing add/remove
3. Binary Heaps

Back to AVL Trees

Goal

Implement a sorted set (SimpleSSet) with efficient operations:

• find
• add
• remove

Previous best: sorted array with binary search

• find in $O(\log n)$ time
• add/remove in $O(n)$ time

Last Time

Introduced AVL trees

• $T$ has AVL property if for every node $v$ with children $u$ and $w$, we have

$\vert h(u) - h(w) \vert \leq 1$
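As a minimal sketch, the AVL property can be checked with one recursive pass that returns each subtree's height. The `Node` class and the function name `check_avl` are hypothetical (they are not the course's code), and the sketch assumes an empty subtree has height $-1$, so that a leaf has height $0$.

```python
class Node:
    """Hypothetical binary tree node, for illustration only."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def check_avl(node):
    """Return (is_avl, height) for the subtree rooted at node.

    Assumption: an empty subtree has height -1, so a leaf has height 0.
    """
    if node is None:
        return True, -1
    left_ok, left_h = check_avl(node.left)
    right_ok, right_h = check_avl(node.right)
    balanced = abs(left_h - right_h) <= 1
    return left_ok and right_ok and balanced, 1 + max(left_h, right_h)


balanced_tree = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))  # 1 -> 2 -> 3, all right children
print(check_avl(balanced_tree))
print(check_avl(chain))
```

The chain fails the check at its root: the left subtree is empty while the right subtree has height 1.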

We showed:

• if $T$ is an AVL tree with $n$ nodes, then $h(T) = O(\log n)$
• $\implies$ add/remove/find take $O(\log n)$ time

But:

• add/remove as previously implemented may destroy AVL property

Our Strategy:

1. perform add/remove as before
2. check if AVL property is maintained
3. if not, fix it

Maintaining AVL Property

What happens if we add(11)?

Questions

If we add a new node as before, it is always a leaf.

• Which nodes could become unbalanced?

  • only ancestors of the added node
  • there are $O(\log n)$ of these

• How can we check for imbalance?

  • store height for each node
  • update height of new node’s ancestors
  • check each for imbalance

All this takes $O(\log n)$ time if $T$ is AVL tree (before add)
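The check above can be sketched as a single walk from the new leaf to the root. This is an illustration only: the `Node` class with `parent` and `height` fields and the function `first_unbalanced_ancestor` are hypothetical names, and an empty subtree is assumed to have height $-1$.

```python
class Node:
    """Hypothetical node that stores its parent and its height."""
    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent
        self.left = None
        self.right = None
        self.height = 0  # a new leaf has height 0


def height(node):
    # Assumption: an empty subtree has height -1.
    return node.height if node is not None else -1


def first_unbalanced_ancestor(w):
    """Walk from the new leaf w to the root, refreshing stored heights.

    Returns the deepest unbalanced ancestor of w, or None if there is none.
    """
    z = None
    node = w.parent
    while node is not None:
        node.height = 1 + max(height(node.left), height(node.right))
        if z is None and abs(height(node.left) - height(node.right)) > 1:
            z = node
        node = node.parent
    return z


# Build the chain 1 -> 2 by hand, then attach 3 as a new leaf below 2.
root = Node(1)
b = Node(2, parent=root)
root.right = b
root.height = 1
c = Node(3, parent=b)
b.right = c
z = first_unbalanced_ancestor(c)
print(z.value)
```

The loop does constant work per ancestor, so its cost is proportional to the depth of the new leaf.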

Restoring Balance after add

Suppose $T$ becomes unbalanced after add

• $w$ is new node added
• $z$ is $w$’s deepest unbalanced ancestor
• $y$ is $z$’s child towards $w$
• $x$ is $y$’s child towards $w$

Note: 4 possibilities of relative order of $x, y, z$

Idea

• $z$ must be either the largest or smallest of the three values (why?)
• $y$ must be $z$’s higher child (why?)
• $x$ must be $y$’s higher child (why?)

So: restructure tree to move $x$ up

• middle value of $x, y, z$ becomes root of sub-tree
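One way to sketch the restructuring is to name the three nodes $a < b < c$ in sorted order, name their four remaining subtrees in left-to-right order, and then rebuild with $b$ on top. The `Node` class and `restructure` function below are hypothetical; the caller is assumed to reattach the returned node where $z$ used to be.

```python
class Node:
    """Hypothetical binary tree node, for illustration only."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def restructure(x, y, z):
    """Rearrange x, y, z (x a child of y, y a child of z) so that the
    middle value becomes the root of the subtree. Returns the new root."""
    # Name the nodes a < b < c and the four subtrees t0..t3 in sorted order.
    if z.left is y and y.left is x:        # x < y < z
        a, b, c = x, y, z
        t0, t1, t2, t3 = x.left, x.right, y.right, z.right
    elif z.left is y and y.right is x:     # y < x < z
        a, b, c = y, x, z
        t0, t1, t2, t3 = y.left, x.left, x.right, z.right
    elif z.right is y and y.left is x:     # z < x < y
        a, b, c = z, x, y
        t0, t1, t2, t3 = z.left, x.left, x.right, y.right
    else:                                  # z < y < x
        a, b, c = z, y, x
        t0, t1, t2, t3 = z.left, y.left, x.left, x.right
    a.left, a.right = t0, t1
    c.left, c.right = t2, t3
    b.left, b.right = a, c
    return b


# Zig-zag example: z = 1, y = 3 (right child of z), x = 2 (left child of y).
x = Node(2)
y = Node(3, left=x)
z = Node(1, right=y)
new_root = restructure(x, y, z)
print(new_root.value, new_root.left.value, new_root.right.value)
```

The four branches correspond to the four possible relative orders of $x, y, z$. In every case the subtrees keep their left-to-right order, which is why the result is still a binary search tree.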

Observations

Suppose $z$ became unbalanced after add(w)

• $z$’s previous height was $h$
• $z$’s new height is $h+1$
• $z$’s other child has height $h-2$ (the child towards $w$ now has height $h$, and the two heights differ by $2$)

What is root’s height after restructuring?

Question 1

Why does restructuring restore balance of subtree?

Question 2

Why does restructuring maintain BST property?

Question 3

Does restructuring make tree balanced?

Rebalancing Procedure

After add(w), iterate over $w$’s ancestors from $w$ upwards:

1. re-compute height and check balance
2. if unbalanced
• perform restructure
• update heights
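A minimal sketch of this procedure, written recursively so that each ancestor is revisited as the recursion unwinds. It restores balance with single and double rotations, which yield the same subtree as restructuring around the middle value of $x, y, z$. All names are hypothetical, and an empty subtree is assumed to have height $-1$.

```python
class Node:
    """Hypothetical AVL node that stores its own height."""
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.height = 0


def height(node):
    return node.height if node is not None else -1


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(z):
    y = z.left
    z.left, y.right = y.right, z
    update(z)
    update(y)
    return y


def rotate_left(z):
    y = z.right
    z.right, y.left = y.left, z
    update(z)
    update(y)
    return y


def rebalance(z):
    """Re-compute z's height; restructure if z is unbalanced."""
    update(z)
    diff = height(z.left) - height(z.right)
    if diff > 1:                                   # left side too high
        if height(z.left.left) < height(z.left.right):
            z.left = rotate_left(z.left)           # zig-zag case
        return rotate_right(z)
    if diff < -1:                                  # right side too high
        if height(z.right.right) < height(z.right.left):
            z.right = rotate_right(z.right)        # zig-zag case
        return rotate_left(z)
    return z


def add(node, value):
    """Insert value below node and return the new root of this subtree."""
    if node is None:
        return Node(value)
    if value < node.value:
        node.left = add(node.left, value)
    elif value > node.value:
        node.right = add(node.right, value)
    return rebalance(node)                         # runs on every ancestor


root = None
for v in range(1, 16):                             # sorted order: worst case for a plain BST
    root = add(root, v)
print(root.value, root.height)
```

Each call does a constant amount of work besides the recursive call, so the total work is proportional to the height of the tree.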

What is running time?

Question

How do we remove nodes?

Remove Procedure

Case 1: Leaf

Case 2: Single-child

Case 3: Two-child

Which cases modify tree structure?

Similar Picture to Before

$x, y, z$ redefined; same restructuring

Question

What is height of root after restructuring?

Issue

Restructuring may again cause imbalance! Must continue upwards.

Removal

After remove(x), iterate over removed node’s ancestors upwards:

1. re-compute height and check balance
2. if unbalanced
• perform restructure
• update heights

What is running time?

Conclusion

We can modify add/remove such that

1. operations restore AVL property if tree was initially AVL tree
2. operations still run in $O(h) = O(\log n)$ time

So

• All add/remove/find operations happen in $O(\log n)$ time!

Testing an Implementation

Compare performance of BST (as in Assignment 06) and AVL tree implementation

Tests:

2. add words of Shakespeare (~1M words)
3. add words from dictionary (alphabetical)

Time to add elements (ms):
BST: 770
AVL: 601
Heights:
BST: 45
AVL: 22
Time to find elements (ms):
BST: 120
AVL: 88
Time to remove elements (ms):
BST: 89
AVL: 137


Time to add Shakespeare's vocabulary (ms):
BST: 287
AVL: 246
Heights:
BST: 36
AVL: 17


Test 3: Adding Words from Dictionary

Time to add words from dictionary (ms):
BST: 639
AVL: 2
Heights:
BST: 9999
AVL: 13
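The BST height in this test can be reproduced on a small scale. The sketch below is an illustration with hypothetical names, not the code used for the measurements above: it inserts $n$ keys in sorted order into an unbalanced BST and measures the height. A reported height of 9999 would match this pattern if the dictionary held 10,000 words, though that word count is an assumption.

```python
class Node:
    """Hypothetical binary search tree node, for illustration only."""
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def bst_add(root, value):
    """Iterative unbalanced BST insertion; returns the (possibly new) root."""
    new_node = Node(value)
    if root is None:
        return new_node
    node = root
    while True:
        if value < node.value:
            if node.left is None:
                node.left = new_node
                return root
            node = node.left
        elif value > node.value:
            if node.right is None:
                node.right = new_node
                return root
            node = node.right
        else:
            return root  # value already present


def tree_height(root):
    """Height by level-order traversal (iterative, so deep trees are fine)."""
    height = -1
    level = [root] if root is not None else []
    while level:
        height += 1
        level = [c for n in level for c in (n.left, n.right) if c is not None]
    return height


def build(values):
    root = None
    for v in values:
        root = bst_add(root, v)
    return root


n = 1000
sorted_height = tree_height(build(range(n)))
print(sorted_height)
```

With sorted input every new key becomes the right child of the previous one, so the tree degenerates into a chain of height $n - 1$ and each insertion walks the whole chain.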


Interpretations of Running Times?

(Dis)advantages of BST vs AVL tree?

Conclusions

2 Ingredients:

1. Binary search trees
• restriction on values
2. Balanced binary trees
• restriction on tree structure

AVL trees give efficient worst-case performance

• add, remove, find all in $O(\log n)$ time

Relative performance of BST and AVL trees depends on usage

• often unbalanced BST is sufficient
• AVL can be exponentially faster in the worst case (e.g., adding in sorted order, where the BST height grows like $n$ rather than $\log n$)

Binary Heaps

Another Tree Representation:

• Binary Heaps

Goal. Implement a priority queue with $O(\log n)$-time operations

Exercise. How could this goal be achieved with an AVL tree?

Binary Heap Structure

1. Binary tree stores comparable elements
2. Heap property (restriction on values)
• children always store values at least as large as their parent’s
3. Tree structure: complete binary tree
• all leaves have (almost) same depth
• very restrictive!

Goal. Use & maintain these properties for an efficient implementation of a priority queue:

• add(x)
• min()
• removeMin()

Heap Property

$T$ a binary tree

• each node stores a comparable element, $v$

$T$ has the heap property if for every node storing value $v$ with children $u$ and $w$, we have $v \leq u$ and $v \leq w$.
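As a minimal sketch, the definition translates directly into a recursive check. The `Node` class and the function `has_heap_property` are hypothetical names used only for illustration.

```python
class Node:
    """Hypothetical binary tree node, for illustration only."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def has_heap_property(node):
    """True if every node's value is <= the values of its children."""
    if node is None:
        return True
    for child in (node.left, node.right):
        if child is not None and child.value < node.value:
            return False
    return has_heap_property(node.left) and has_heap_property(node.right)


good = Node(1, Node(3, Node(5), Node(4)), Node(2))
bad = Node(1, Node(3, Node(2)), Node(4))   # 2 sits below 3
print(has_heap_property(good), has_heap_property(bad))
```

Note that the check compares each node only with its own children. Siblings may appear in either order, unlike in a binary search tree.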

Question

If $T$ has the heap property, what can we say about the value of the root?

Complete Binary Tree, Formally

$T$ is a complete binary tree of depth $D$ if:

1. every node at depth $d \leq D - 2$ has 2 children
2. if $v$ is at depth $D - 1$ and $v$ has a child, then every depth $D-1$ node to the left of $v$ has $2$ children
3. if $v$ is at depth $D-1$ and $v$ has fewer than $2$ children, then every depth $D-1$ node to the right of $v$ has no children
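One way to test completeness, sketched here with hypothetical names, is a level-order traversal that also visits missing children: in a complete tree, once the first missing position is seen, no node may appear after it. This check additionally requires that a node with exactly one child has a left child, which the usual notion of completeness assumes.

```python
from collections import deque


class Node:
    """Hypothetical binary tree node, for illustration only."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def is_complete(root):
    """Level-order scan: after the first gap, no further node may appear."""
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        else:
            if seen_gap:
                return False
            queue.append(node.left)
            queue.append(node.right)
    return True


complete = Node(1, Node(2, Node(4)), Node(3))
gap_on_left = Node(1, Node(2), Node(3, Node(4)))
right_only = Node(1, Node(2, None, Node(4)), Node(3))
print(is_complete(complete), is_complete(gap_on_left), is_complete(right_only))
```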

Question 1

If $T$ is a complete binary tree, where can we add a node to maintain completeness?

Question 2

If $T$ is a complete binary tree, what nodes can we remove and maintain completeness?

Binary Heaps

A binary tree $T$ storing comparable elements is a binary heap if:

1. $T$ satisfies the heap property, and
2. $T$ is a complete binary tree

Heap Priority Queue: min

Given a binary heap $T$, how can we implement min()?

Given a binary heap $T$, how can we add an element?

Given a binary heap $T$, how can we removeMin()?
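One possible answer to these questions is sketched below. It makes an assumption not stated above: the complete binary tree is stored level by level in a Python list, so the node at index $i$ has its parent at index $(i-1)/2$ (rounded down) and its children at indices $2i+1$ and $2i+2$. The class and method names are hypothetical.

```python
class MinHeap:
    """Hypothetical list-backed binary heap, for illustration only."""

    def __init__(self):
        self.items = []

    def min(self):
        # By the heap property, the root holds a smallest element.
        return self.items[0]

    def add(self, x):
        # Append at the next free position (keeps the tree complete),
        # then swap upwards until the parent is no larger.
        items = self.items
        items.append(x)
        i = len(items) - 1
        while i > 0 and items[i] < items[(i - 1) // 2]:
            parent = (i - 1) // 2
            items[i], items[parent] = items[parent], items[i]
            i = parent

    def remove_min(self):
        # Move the last element to the root (keeps the tree complete),
        # then swap downwards with the smaller child as needed.
        items = self.items
        smallest = items[0]
        last = items.pop()
        if items:
            items[0] = last
            i = 0
            while True:
                child = 2 * i + 1
                if child >= len(items):
                    break
                if child + 1 < len(items) and items[child + 1] < items[child]:
                    child += 1
                if items[i] <= items[child]:
                    break
                items[i], items[child] = items[child], items[i]
                i = child
        return smallest


heap = MinHeap()
for v in [5, 3, 8, 1, 4]:
    heap.add(v)
first = heap.min()
drained = [heap.remove_min() for _ in range(5)]
print(first, drained)
```

Both `add` and `remove_min` move an element along a single root-to-leaf path, so their cost is proportional to the depth of the tree.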