Lecture 13: Balanced Binary Trees

Overview

1. Recap of last time
2. AVL Trees
3. Maintaining AVL property

Goal

Implement a sorted set (SimpleSSet) with efficient operations:

• find
• add
• remove

Previous best: sorted array with binary search

• find in $O(\log n)$ time
• add/remove in $O(n)$ time
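
The sorted-array baseline can be sketched as follows. This is a minimal illustration, not the course's SimpleSSet implementation; the class name `SortedArraySet` and its method signatures are assumptions made for the example. Binary search locates the position in $O(\log n)$ time, but inserting or deleting shifts the later elements, which costs $O(n)$.

```python
import bisect


class SortedArraySet:
    """Hypothetical sorted set backed by a Python list kept in increasing order."""

    def __init__(self):
        self._items = []

    def find(self, x):
        # Binary search: O(log n) comparisons.
        i = bisect.bisect_left(self._items, x)
        return i < len(self._items) and self._items[i] == x

    def add(self, x):
        i = bisect.bisect_left(self._items, x)
        if i < len(self._items) and self._items[i] == x:
            return False  # already present
        # Inserting shifts every later element: O(n) in the worst case.
        self._items.insert(i, x)
        return True

    def remove(self, x):
        i = bisect.bisect_left(self._items, x)
        if i == len(self._items) or self._items[i] != x:
            return False  # not present
        # Deleting also shifts every later element: O(n) in the worst case.
        del self._items[i]
        return True
```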

Last Time

• Introduced binary search trees
• “Implemented” basic sorted set operations
  • find
  • add
  • remove
• Running time of operations determined by tree height
  • height = length of longest path from root to leaf
  • running times all $O(h)$
• Sequence of add/remove ops determines height

Problem

Sequence of operations determines height!

• add: $5, 3, 8, 2, 4, 7, 9$ vs
• add: $2, 3, 4, 5, 7, 8, 9$
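
To see the difference concretely, here is a small sketch that inserts both sequences into an unbalanced binary search tree and reports the resulting heights. The names `Node`, `add`, `height`, and `build` are illustrative and not taken from the course code; an empty tree is given height $-1$.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def add(root, key):
    """Insert key as a new leaf (no rebalancing) and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = add(root.left, key)
    elif key > root.key:
        root.right = add(root.right, key)
    return root


def height(node):
    """Number of edges on the longest path down to a leaf; -1 for an empty tree."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def build(keys):
    root = None
    for k in keys:
        root = add(root, k)
    return root


print(height(build([5, 3, 8, 2, 4, 7, 9])))  # 2
print(height(build([2, 3, 4, 5, 7, 8, 9])))  # 6
```

The same seven keys give height 2 in the first order and height $6 = n - 1$ in the second, where the tree degenerates into a path.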

Have We Failed?

If:

1. operation sequence determines height, and
2. height can be as large as $n-1$

Then:

• add, remove, find are $O(n)$ in the worst case

For find, this is worse than a sorted array, where find takes $O(\log n)$ time

What can we do about it?

Restructuring Trees

Idea. When we modify the tree (add or remove), restructure the tree to maintain balance

• use the fact that many different valid BSTs store the same set of elements

Challenges.

1. What structure do we want?
   • how does the structure guarantee efficient operations?
2. How do we check the structure, and how do we modify the tree to maintain it?
3. Can we restructure the tree efficiently?

Idea

A binary tree $T$ is height balanced or an AVL tree (Adelson-Velsky & Landis) if for every node $v$ with children $u$ and $w$, we have $\vert h(u) - h(w)\vert \leq 1$.

We’ll show:

1. Any AVL tree with $n$ nodes has height $h = O(\log n)$
2. After a single add/remove operation, AVL property can be restored in $O(\log n)$ time

As a result

• AVL trees implement add, remove, and find for sorted sets all in time $O(\log n)$

Height Balance

• $T$ a tree, $v$ a node in $T$ with left child $u$ and right child $w$
• $v$ is height balanced if $\vert h(u) - h(w)\vert \leq 1$
• convention: $h(\texttt{null}) = -1$
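
The definition translates directly into code. The sketch below is for illustration only: the names `Node`, `height`, `is_height_balanced`, and `unbalanced_keys` are assumptions, and heights are recomputed from scratch on every call, which is simple but not efficient.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    """Height with the convention h(null) = -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def is_height_balanced(v):
    """True if the heights of v's left and right subtrees differ by at most 1."""
    return abs(height(v.left) - height(v.right)) <= 1


def unbalanced_keys(node):
    """Keys of all nodes in the subtree that are not height balanced (in-order)."""
    if node is None:
        return []
    here = [] if is_height_balanced(node) else [node.key]
    return unbalanced_keys(node.left) + here + unbalanced_keys(node.right)


# A right-leaning path 1 -> 2 -> 3: only the root is unbalanced,
# because h(null) = -1 on the left and h = 1 on the right.
path = Node(1, None, Node(2, None, Node(3)))
print(unbalanced_keys(path))  # [1]
```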

Which nodes are height balanced?

Balanced Trees

$T$ is an AVL tree if every node is height balanced

Proposition. If $T$ is an AVL tree with $n$ nodes, then $h(T) = O(\log n)$.

• $\implies$ add/remove/find run in time $O(\log n)$ on $T$

• Proof strategy: instead of showing that an AVL tree with $n$ nodes has small height, show that an AVL tree with height $h$ must have many nodes

A Claim

Claim. Suppose $T$ is an AVL tree with height $h$. Then $T$ contains at least $2^{h/2}$ nodes.

Claim $\implies$ Proposition:
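
One way to fill in this step: if $T$ has $n$ nodes and height $h$, the Claim gives $n \geq 2^{h/2}$. Taking logarithms of both sides:

$$
n \geq 2^{h/2} \implies \log_2 n \geq \frac{h}{2} \implies h \leq 2 \log_2 n = O(\log n).
$$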

Proof of Claim I

Idea. For a given height $h$, define $m(h)$ to be the minimum number of nodes of any AVL tree with height $h$

Question. What is the structure of an AVL tree of height $h$ with exactly $m(h)$ nodes?

Proof of Claim II

Symbolically. $m$ satisfies:

• $m(h) = 1 + m(h-1) + m(h-2)$ for $h \geq 2$ (the root plus minimal subtrees of heights $h-1$ and $h-2$)
• $m(0) = 1$, $m(1) = 2$

Can use this to compute:
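
A minimal sketch of the computation, assuming the smallest AVL tree of height $h$ consists of a root whose two subtrees are smallest AVL trees of heights $h-1$ and $h-2$, so that the root contributes one extra node to the count. The function name `min_nodes` is illustrative.

```python
def min_nodes(h):
    """Minimum number of nodes in an AVL tree of height h (h >= 0)."""
    m = [1, 2]  # base cases: m(0) = 1, m(1) = 2
    for k in range(2, h + 1):
        # One root plus minimal subtrees of heights k-1 and k-2.
        m.append(1 + m[k - 1] + m[k - 2])
    return m[h]


# Compare m(h) with the claimed lower bound 2^(h/2).
for h in range(7):
    print(h, min_nodes(h), round(2 ** (h / 2), 2))
```

For $h = 0, \ldots, 6$ this gives $m(h) = 1, 2, 4, 7, 12, 20, 33$, each at least $2^{h/2}$.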

Proof of Claim III

Using $m(h) = 1 + m(h-1) + m(h-2)$, derive a lower bound on $m(h)$:
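
One possible derivation, by induction on $h$. It needs only two facts: $m(h) \geq m(h-1) + m(h-2)$, and $m$ is non-decreasing, so $m(h-1) \geq m(h-2)$. The base cases are $m(0) = 1 = 2^{0}$ and $m(1) = 2 \geq 2^{1/2}$. For $h \geq 2$, assuming the bound holds for $h-2$:

$$
m(h) \geq m(h-1) + m(h-2) \geq 2\, m(h-2) \geq 2 \cdot 2^{(h-2)/2} = 2^{h/2}.
$$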

Next Challenge

Since Claim $\implies$ Proposition we can conclude:

• if $T$ is an AVL tree with $n$ nodes, then find/add/remove take time $O(\log n)$

However, calling add/remove with the previous implementation may destroy the AVL property

Questions.

1. How much damage can a single add/remove do to balance?

2. Can balance be restored efficiently?

Strategy

Perform add/remove as before, then

1. check if AVL property is maintained,

2. if not, restructure the tree to restore balance

Maintaining AVL Property

What happens if we add(11)?

Questions

If we add a new node as before, it is always a leaf.

• Which nodes could become unbalanced?
• How can we check for imbalance?
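
One way to explore these questions: adding a leaf changes the height only of subtrees that contain it, so only the new leaf's ancestors can become unbalanced. The sketch below inserts a leaf and then walks up from it, checking each ancestor. It is an illustration under assumptions: nodes carry a hypothetical `parent` pointer, and heights are recomputed on demand rather than stored in the nodes.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None


def height(node):
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def add(root, key):
    """Plain BST insert. Returns (root, new_leaf); new_leaf is None for a duplicate."""
    leaf = Node(key)
    if root is None:
        return leaf, leaf
    cur = root
    while True:
        if key == cur.key:
            return root, None
        side = "left" if key < cur.key else "right"
        child = getattr(cur, side)
        if child is None:
            setattr(cur, side, leaf)
            leaf.parent = cur
            return root, leaf
        cur = child


def unbalanced_ancestors(leaf):
    """Keys of the unbalanced ancestors of leaf, deepest first."""
    result = []
    v = leaf.parent
    while v is not None:
        if abs(height(v.left) - height(v.right)) > 1:
            result.append(v.key)
        v = v.parent
    return result


root = None
for k in [5, 3, 8, 9]:
    root, leaf = add(root, k)  # the tree is still an AVL tree at this point
root, leaf = add(root, 10)
print(unbalanced_ancestors(leaf))  # [8, 5]
```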

Restoring Balance after add

Suppose $T$ becomes unbalanced after add

• $w$ is new node added
• $z$ is $w$’s deepest unbalanced ancestor
• $y$ is $z$’s child towards $w$
• $x$ is $y$’s child towards $w$

Note: there are 4 possible relative orders of $x$, $y$, $z$
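
The sketch below locates $z$, $y$, and $x$ on the search path to the new key and reports their relative order. It assumes the key $w$ has just been added as a leaf; the names `find_xyz` and `relative_order` are illustrative. Because $y$ and $x$ both lie on the same side of $z$, the only possible orders are $x < y < z$, $y < x < z$, $z < x < y$, and $z < y < x$.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def find_xyz(root, w_key):
    """Return nodes (x, y, z) for the newly added key, or None if no ancestor is unbalanced."""
    # Collect the search path from the root down to the new leaf w.
    path = []
    cur = root
    while cur is not None:
        path.append(cur)
        if w_key == cur.key:
            break
        cur = cur.left if w_key < cur.key else cur.right
    # Scan w's proper ancestors from deepest to shallowest.
    for i in range(len(path) - 2, -1, -1):
        z = path[i]
        if abs(height(z.left) - height(z.right)) > 1:
            # An unbalanced z has w at least two levels below it,
            # so path[i + 1] and path[i + 2] both exist.
            return path[i + 2], path[i + 1], z
    return None


def relative_order(x, y, z):
    """Describe the sorted order of the three keys, e.g. 'z < y < x'."""
    named = [("x", x.key), ("y", y.key), ("z", z.key)]
    return " < ".join(name for name, _ in sorted(named, key=lambda p: p[1]))


# After adding 10: z = 8, y = 9, x = 10.
t1 = Node(5, Node(3), Node(8, None, Node(9, None, Node(10))))
print(relative_order(*find_xyz(t1, 10)))  # z < y < x

# After adding 9: z = 8, y = 10, x = 9.
t2 = Node(5, Node(3), Node(8, None, Node(10, Node(9))))
print(relative_order(*find_xyz(t2, 9)))  # z < x < y
```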