# Lecture 13: Balanced Binary Trees

## Overview

1. Recap of last time
2. AVL Trees
3. Maintaining AVL property

## Goal

Implement a sorted set (SimpleSSet) with efficient operations:

• find
• add
• remove

Previous best: sorted array with binary search

• find in $O(\log n)$ time
• add/remove in $O(n)$ time
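
The sorted-array baseline can be sketched as follows. This is a minimal illustration, not the course's implementation: the class name `SortedArraySet` is hypothetical, and it leans on Python's standard `bisect` module for the binary search.

```python
import bisect


class SortedArraySet:
    """Sorted set backed by a Python list kept in increasing order (hypothetical name)."""

    def __init__(self):
        self._items = []

    def find(self, x):
        # Binary search: O(log n) comparisons.
        i = bisect.bisect_left(self._items, x)
        return i < len(self._items) and self._items[i] == x

    def add(self, x):
        i = bisect.bisect_left(self._items, x)
        if i < len(self._items) and self._items[i] == x:
            return False  # already present
        # Inserting shifts every later element: O(n) in the worst case.
        self._items.insert(i, x)
        return True

    def remove(self, x):
        i = bisect.bisect_left(self._items, x)
        if i < len(self._items) and self._items[i] == x:
            # Deleting also shifts every later element: O(n) in the worst case.
            del self._items[i]
            return True
        return False
```

The search itself is fast in all three operations; the $O(n)$ cost of add/remove comes entirely from shifting elements.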

## Last Time

• Introduced binary search trees
• “Implemented” basic sorted set operations
  • find
  • add
  • remove
• Running time of operations determined by tree height
  • height = length of longest path from root to leaf
  • running times all $O(h)$
• Sequence of add/remove ops determines height

## Problem

Sequence of operations determines height!

• add: $5, 3, 8, 2, 4, 7, 9$ vs
• add: $2, 3, 4, 5, 7, 8, 9$
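
To see the difference concretely, here is a small sketch of a plain (non-rebalancing) BST insert applied to both sequences. The names `Node`, `add`, `height`, and `build` are illustrative choices, not taken from the lecture code.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def add(root, key):
    """Insert key into the BST rooted at root, with no rebalancing; return the root."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key == cur.key:
            return root  # already present
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                return root
            cur = cur.right


def height(node):
    """Length of the longest root-to-leaf path; the empty tree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def build(keys):
    root = None
    for k in keys:
        root = add(root, k)
    return root


print(height(build([5, 3, 8, 2, 4, 7, 9])))  # 2
print(height(build([2, 3, 4, 5, 7, 8, 9])))  # 6
```

The same seven keys give a tree of height 2 in the first order and a path of height 6 in sorted order.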

## Have We Failed?

If:

1. operation sequence determines height, and
2. height can be as large as $n-1$

Then:

• add, remove, find are $O(n)$ in the worst case

This is worse than a sorted array, where find takes $O(\log n)$ time

What can we do about it?

## Restructuring Trees

Idea. When we modify the tree (add or remove), restructure the tree to maintain balance

• use fact that there are many valid BSTs

Challenges.

1. What structure do we want?
   • how does structure guarantee efficient operations?
2. How do we check structure/modify to maintain structure?
3. Can we restructure tree efficiently?

## Idea

A binary tree $T$ is height balanced, or an AVL tree (named after Adelson-Velsky and Landis), if for every node $v$ with children $u$ and $w$, we have $\vert h(u) - h(w)\vert \leq 1$.

We’ll show:

1. Any AVL tree with $n$ nodes has height $h = O(\log n)$
2. After a single add/remove operation, AVL property can be restored in $O(\log n)$ time

As a result

• AVL trees implement add, remove, and find for sorted sets all in time $O(\log n)$

## Height Balance

• $T$ a tree, $v$ a node in $T$ with left child $u$ and right child $w$
• $v$ is height balanced if $\vert h(u) - h(w)\vert \leq 1$
• convention: $h(\texttt{null}) = -1$

Which nodes are height balanced?

## Balanced Trees

$T$ is an AVL tree if every node is height balanced
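
The definition can be checked directly in code. The sketch below assumes a bare `Node` class with `left`/`right` fields (an illustrative choice) and uses the convention $h(\texttt{null}) = -1$. `is_avl` computes heights bottom-up so that the whole check is a single $O(n)$ pass.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    """Height with the convention h(null) = -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def is_height_balanced(v):
    """True if the heights of v's two subtrees differ by at most 1."""
    return abs(height(v.left) - height(v.right)) <= 1


def is_avl(root):
    """True if every node in the tree is height balanced."""

    def check(node):
        # Returns (height of node, whether every node below and including it is balanced).
        if node is None:
            return -1, True
        hl, ok_left = check(node.left)
        hr, ok_right = check(node.right)
        return 1 + max(hl, hr), ok_left and ok_right and abs(hl - hr) <= 1

    return check(root)[1]


balanced = Node(5, Node(3), Node(8))
chain = Node(1, None, Node(2, None, Node(3)))
print(is_avl(balanced))  # True
print(is_avl(chain))     # False
```

In `chain`, the node with key 2 is itself height balanced (subtree heights $-1$ and $0$), but the root is not (heights $-1$ and $1$), so the tree is not an AVL tree.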

Proposition. If $T$ is an AVL tree with $n$ nodes, then $h(T) = O(\log n)$.

• $\implies$ add/remove/find run in time $O(\log n)$ on $T$

• strategy: instead of showing that an AVL tree $T$ with $n$ nodes has small height, show that an AVL tree with height $h$ must have many nodes

## A Claim

Claim. Suppose $T$ is an AVL tree with height $h$. Then $T$ contains at least $2^{h/2}$ nodes.

Claim $\implies$ Proposition:
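
One way to fill in this step, as a sketch: let $T$ be an AVL tree with $n$ nodes and height $h$. The Claim gives $n \geq 2^{h/2}$, and taking $\log_2$ of both sides yields

$$\log_2 n \geq \frac{h}{2} \quad\implies\quad h \leq 2 \log_2 n = O(\log n).$$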

## Proof of Claim I

Claim. Suppose $T$ is an AVL tree with height $h$. Then $T$ contains at least $2^{h/2}$ nodes.

Idea. For a given height $h$, define $m(h)$ to be the minimum number of nodes in any AVL tree with height $h$

Question. What is the structure of an AVL tree with height $h$ and exactly $m(h)$ nodes?

## Proof of Claim II

Claim. Suppose $T$ is an AVL tree with height $h$. Then $T$ contains at least $2^{h/2}$ nodes.

Symbolically. $m$ satisfies:

• $m(h) = m(h-1) + m(h-2) + 1$ for $h \geq 2$ (the $+1$ counts the root)
• $m(0) = 1$, $m(1) = 2$

Can use this to compute:
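
As an illustration, the first few values can be tabulated with a short script. This sketch assumes the recurrence counts the root together with a minimal subtree of height $h-1$ and one of height $h-2$, i.e. $m(h) = m(h-1) + m(h-2) + 1$. The function name `min_avl_nodes` is made up for this example.

```python
def min_avl_nodes(h):
    """m(h): minimum number of nodes in an AVL tree of height h (h >= 0)."""
    m = [1, 2]  # base cases: m(0) = 1, m(1) = 2
    for i in range(2, h + 1):
        # root + minimal subtree of height i-1 + minimal subtree of height i-2
        m.append(m[i - 1] + m[i - 2] + 1)
    return m[h]


for h in range(7):
    print(h, min_avl_nodes(h), round(2 ** (h / 2), 2))
```

For $h = 0, \ldots, 6$ this gives $m(h) = 1, 2, 4, 7, 12, 20, 33$, each at least $2^{h/2}$.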

## Proof of Claim III

Claim. Suppose $T$ is an AVL tree with height $h$. Then $T$ contains at least $2^{h/2}$ nodes.

Using $m(h) \geq m(h-1) + m(h-2)$, derive a lower bound on $m(h)$:
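
A sketch of one such derivation, by induction on $h$. It needs only the inequality $m(h) \geq m(h-1) + m(h-2)$ and the fact that $m$ is nondecreasing, so $m(h-1) \geq m(h-2)$.

Base cases: $m(0) = 1 = 2^{0}$ and $m(1) = 2 \geq 2^{1/2}$.

Inductive step, for $h \geq 2$, assuming $m(h-2) \geq 2^{(h-2)/2}$:

$$m(h) \;\geq\; m(h-1) + m(h-2) \;\geq\; 2\,m(h-2) \;\geq\; 2 \cdot 2^{(h-2)/2} \;=\; 2^{h/2}.$$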

## Next Challenge

Since Claim $\implies$ Proposition we can conclude:

• if $T$ is an AVL tree with $n$ nodes, then find/add/remove take time $O(\log n)$

However, calling add/remove with the previous implementation may destroy the AVL property

Questions.

1. How much damage can a single add/remove do to balance?

2. Can balance be restored efficiently?

## Strategy

Perform add/remove as before, then

1. check if AVL property is maintained,

2. if not, restructure the tree to restore balance

## Maintaining AVL Property

What happens if we add(11)?

## Questions

If we add a new node as before, it is always a leaf.

• Which nodes could become unbalanced?
• How can we check for unbalance?
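
One possible answer to both questions, as a sketch: adding a leaf changes only the subtrees of that leaf's ancestors, so only ancestors can become unbalanced, and walking from the new leaf up to the root is enough to find them. The code below assumes each node stores a `parent` pointer and a cached `height`; both fields, and the names `add` and `unbalanced_ancestors`, are assumptions for illustration rather than the lecture's implementation.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.parent = parent
        self.left = None
        self.right = None
        self.height = 0  # a newly added node is a leaf


def h(node):
    """Cached height, with the convention h(null) = -1."""
    return -1 if node is None else node.height


def add(root, key):
    """Plain BST insert. Returns (root, new_leaf); new_leaf is None if key was present."""
    if root is None:
        leaf = Node(key)
        return leaf, leaf
    cur = root
    while True:
        if key == cur.key:
            return root, None
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key, parent=cur)
                return root, cur.left
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key, parent=cur)
                return root, cur.right
            cur = cur.right


def unbalanced_ancestors(leaf):
    """Walk from the new leaf to the root, refreshing cached heights.

    Returns the keys of all unbalanced ancestors, deepest first.
    The walk visits one node per level, so it takes O(h) time.
    """
    found = []
    v = leaf.parent if leaf is not None else None
    while v is not None:
        v.height = 1 + max(h(v.left), h(v.right))
        if abs(h(v.left) - h(v.right)) > 1:
            found.append(v.key)
        v = v.parent
    return found


root = None
for k in [5, 3, 8, 2, 4, 7, 9, 10]:
    root, leaf = add(root, k)
    unbalanced_ancestors(leaf)  # refreshes heights; nothing is unbalanced yet
root, leaf = add(root, 11)
print(unbalanced_ancestors(leaf))  # [9, 8, 5]
```

In this made-up example, adding 11 below 10 unbalances every ancestor from 9 up to the root, and the first key reported is the deepest unbalanced one.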

## Restoring Balance after add

Suppose $T$ becomes unbalanced after add

• $w$ is new node added
• $z$ is $w$’s deepest unbalanced ancestor
• $y$ is $z$’s child towards $w$
• $x$ is $y$’s child towards $w$

Note: there are 4 possible relative orders of $x, y, z$
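
The four cases can be told apart by which way the search for $w$ turns at $z$ and then at $y$. The sketch below assumes $z$ has already been found and that keys are distinct. The helper name `locate_xyz` and the case labels are illustrative, not from the lecture.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def locate_xyz(z, w_key):
    """Given z, the deepest unbalanced ancestor of the new key w_key,
    return (y, x, case), where case names the turns taken from z toward w.

    The relative orders are:
      left-left:   x < y < z      left-right:  y < x < z
      right-left:  z < x < y      right-right: z < y < x
    """
    y_side = "left" if w_key < z.key else "right"
    y = z.left if y_side == "left" else z.right
    x_side = "left" if w_key < y.key else "right"
    x = y.left if x_side == "left" else y.right
    return y, x, f"{y_side}-{x_side}"


# z = 9 with the path 9 -> 10 -> 11 after adding 11
z = Node(9, None, Node(10, None, Node(11)))
y, x, case = locate_xyz(z, 11)
print(y.key, x.key, case)  # 10 11 right-right

# z = 9 with the path 9 -> 11 -> 10 after adding 10
z = Node(9, None, Node(11, Node(10), None))
y, x, case = locate_xyz(z, 10)
print(y.key, x.key, case)  # 11 10 right-left
```

Note that $x$ may be the new node $w$ itself, as in both examples here.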