# Lecture 10: Balancing BSTs

Scribe notes on balancing binary search trees

**Scribes:**

- Andy Arrigoni Perez
- Sawyer Pollard
- Luxin Sun
- Cesaire Mugishawayo

### Overview:

- Defining height balance
- Benefits of balance
- Maintaining balance efficiently
  - Add
  - Remove

## Section 1: Defining height balance

**Goal:** Since find, add, and remove run in \(O(h)\) time, where \(h\) is the height of the tree, we want the add and remove methods to keep \(h\) as small as possible.

**Definition of height:**

Let \(v\) be a node in a binary tree.

Then \(h(v)\) is the *height* of \(v\): the distance from \(v\) to its most distant descendant leaf.

*Note:* Convention is that \(h(null) = -1\)

**Observation:**

If \(u = v.left\) and \(w = v.right\),

then \(h(v) = 1 + \max(h(u), h(w))\)
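The observation above translates directly into a recursive computation. The following is a minimal sketch; the `Node` class and the `height` function are hypothetical names introduced here for illustration, not code from the lecture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A bare binary tree node (hypothetical, for illustration only)."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def height(v: Optional[Node]) -> int:
    """Return h(v), using the convention h(null) = -1."""
    if v is None:
        return -1
    # h(v) = 1 + max(h(u), h(w)) where u, w are v's children.
    return 1 + max(height(v.left), height(v.right))
```

With this convention a leaf has height 0, because both of its (null) children have height -1.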

**Definition:** Node \(v\) is height balanced if the heights of \(v\)'s children differ by at most 1.

*Formally:* if \(u = v.left\) and \(w = v.right\), then \(\lvert h(u) - h(w) \rvert \leq 1\)

*Importantly, a binary tree T is (height) balanced or AVL (named for its creators Georgy Adelson-Velsky and Evgenii Landis) if all nodes in T are height balanced.*
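As a sketch of how the definition could be checked in code, the function below computes heights bottom-up and verifies the balance condition at every node in a single pass. The names `Node`, `check`, and `is_avl` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """A bare binary tree node (hypothetical, for illustration only)."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def check(v: Optional[Node]) -> Tuple[int, bool]:
    """Return (h(v), True if every node in v's subtree is height balanced)."""
    if v is None:
        return -1, True  # convention: h(null) = -1, and an empty tree is balanced
    hu, left_ok = check(v.left)
    hw, right_ok = check(v.right)
    balanced_here = abs(hu - hw) <= 1
    return 1 + max(hu, hw), left_ok and right_ok and balanced_here


def is_avl(root: Optional[Node]) -> bool:
    """True if all nodes of the tree rooted at `root` are height balanced."""
    return check(root)[1]
```

Note that this only tests the height condition; it does not verify the BST ordering of the keys.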

**Example:**

## Section 2: Benefits of Balance

**Goal:** If \(T\) is balanced (or AVL), then its height \(h\) is \(O(\log n)\), where \(n\) is the number of nodes in the tree.

**Roundabout Method:**

Consider \(m(h) = \text{minimum number of nodes in an AVL tree of height } h\).

If \(n \leq m(h)\), then an AVL tree with \(n\) nodes has height at most \(h\), since a taller AVL tree would need more than \(m(h)\) nodes.

*What is \(m(h)\) for small values of \(h\)?*
| \(h\) | \(m(h)\) |
|---|---|
| 0 | 1 |
| 1 | 2 |
| 2 | 4 |
| 3 | 7 |
| 4 | 12 |
| 5 | 20 |
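These values can be reproduced with a short computation. The sketch below assumes the recurrence \(m(h) = 1 + m(h - 1) + m(h - 2)\), which corresponds to a root whose two subtrees are minimal AVL trees of heights \(h - 1\) and \(h - 2\); the function name `min_nodes` is hypothetical.

```python
def min_nodes(h: int) -> int:
    """Minimum number of nodes in an AVL tree of height h.

    Assumes the recurrence m(h) = 1 + m(h-1) + m(h-2),
    with m(-1) = 0 (empty tree) and m(0) = 1 (single node).
    """
    if h < 0:
        return 0
    prev, curr = 0, 1  # m(-1), m(0)
    for _ in range(h):
        prev, curr = curr, 1 + prev + curr
    return curr


print([min_nodes(h) for h in range(6)])  # [1, 2, 4, 7, 12, 20]
```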

*What can we say about the structure of an AVL tree of height \(h\) with the minimum number of nodes, when \(h\) is at least 2?*

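**Firstly,** the root has one subtree of height \(h - 1\) and one of height \(h - 2\), and each subtree is itself an AVL tree with the minimum number of nodes for its height, so \(m(h) = 1 + m(h - 1) + m(h - 2)\)

*(Note: \(m(h - 1) > m(h - 2)\))*

**So,** \(m(h) > 2 \cdot m(h - 2)\)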

**This pattern can be generalized to:** \(m(h) > 2^i \cdot m(h - 2i)\) for any \(i \geq 1\) with \(2i \leq h\)

**If,** \(i = \lceil \frac{h}{2} \rceil - 1\)

*(Note: \(\lceil \frac{h}{2} \rceil\) is \(\frac{h}{2}\) rounded up to the nearest integer.)*

**Then,** \(h - 2i\) can only be \(1\) or \(2\).

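**So,** since \(m(h - 2i) \geq 1\) and \(i \geq \frac{h}{2} - 1\), an AVL tree with \(n\) nodes and height \(h\) satisfies \(n \geq m(h) > 2^{i} \geq 2^{\frac{h}{2} - 1}\)

**Taking \(\log\) of both sides:** \(\log_2 n > \frac{h}{2} - 1\), that is, \(h < 2 \log_2 n + 2\)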

**Conclusion:** *If T is an AVL tree with \(n\) nodes and height \(h\), then:*

✅ **\(h = O(\log n)\)**
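As a numerical sanity check, the sketch below tests one explicit form of this bound, \(h < 2 \log_2 n + 2\), on the sparsest AVL trees. Both the explicit constants and the recurrence \(m(h) = 1 + m(h - 1) + m(h - 2)\) are assumptions worked out from the pattern above rather than quoted from the lecture; `min_nodes` and `height_bound` are hypothetical names.

```python
import math


def min_nodes(h: int) -> int:
    """m(h) under the assumed recurrence m(h) = 1 + m(h-1) + m(h-2)."""
    prev, curr = 0, 1  # m(-1), m(0)
    for _ in range(h):
        prev, curr = curr, 1 + prev + curr
    return curr


def height_bound(n: int) -> float:
    """Assumed explicit upper bound on the height of an AVL tree with n nodes."""
    return 2 * math.log2(n) + 2


# A tree of height h has at least m(h) nodes, and the bound grows with n,
# so it is enough to check the minimal trees.
bound_holds = all(h < height_bound(min_nodes(h)) for h in range(60))
```

Checking only the minimal trees suffices because any other AVL tree of height \(h\) has more nodes, which only makes the right-hand side larger.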

## Section 3: Maintaining Balance

**When can imbalance occur?**

- Only at an ancestor of the newly added node.
- At most \(h\) nodes can become unbalanced: the tree was balanced before the add, and the new node has at most \(h\) ancestors.
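The two points above can be illustrated with a small experiment: add a key to a balanced BST without any rebalancing, then list the nodes that are no longer height balanced. This is only a sketch; `Node`, `bst_add`, and `unbalanced_keys` are hypothetical helpers, and the keys are arbitrary.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def height(v: Optional[Node]) -> int:
    """h(v) with the convention h(null) = -1."""
    return -1 if v is None else 1 + max(height(v.left), height(v.right))


def bst_add(v: Optional[Node], key: int) -> Node:
    """Plain BST insertion with no rebalancing; returns the subtree root."""
    if v is None:
        return Node(key)
    if key < v.key:
        v.left = bst_add(v.left, key)
    elif key > v.key:
        v.right = bst_add(v.right, key)
    return v


def unbalanced_keys(v: Optional[Node]) -> List[int]:
    """Keys of all nodes that are not height balanced, in sorted order."""
    if v is None:
        return []
    here = [v.key] if abs(height(v.left) - height(v.right)) > 1 else []
    return unbalanced_keys(v.left) + here + unbalanced_keys(v.right)


root = None
for k in [5, 3, 8, 2]:
    root = bst_add(root, k)
before = unbalanced_keys(root)  # [] : the tree starts out balanced

root = bst_add(root, 1)         # new leaf below 2, on the path 5 -> 3 -> 2 -> 1
after = unbalanced_keys(root)   # [3, 5] : both are ancestors of the new node
```

Node 8 is not an ancestor of the new leaf, so its subtree is unchanged and it stays balanced.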

**Let:**

**Because of the properties of BSTs, we know that: \(Y < X < Z\)**

**What we want:**
(Picture of restructure)

**Questions worth thinking about:**

- Why does this restructure maintain BST property?
- Why does it restore balance?
- What is its running time?
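One common way to realize a restructure of this kind is with tree rotations. The sketch below is an assumption about the general technique, not a reproduction of the picture from lecture; all names (`Node`, `rotate_left`, `rotate_right`, `rebalance`, `add`) are hypothetical. It may help with the questions above: a rotation keeps the in-order sequence of keys unchanged, changes a constant number of links, and is applied at most once per ancestor on the way back up from the new node.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 0  # cached h(v); a leaf has height 0


def h(v: Optional[Node]) -> int:
    return -1 if v is None else v.height


def update(v: Node) -> None:
    v.height = 1 + max(h(v.left), h(v.right))


def rotate_right(z: Node) -> Node:
    """Lift z's left child y above z; y's old right subtree moves under z."""
    y = z.left
    z.left, y.right = y.right, z
    update(z)
    update(y)
    return y


def rotate_left(z: Node) -> Node:
    """Mirror image of rotate_right."""
    y = z.right
    z.right, y.left = y.left, z
    update(z)
    update(y)
    return y


def rebalance(v: Node) -> Node:
    """Restore height balance at v, assuming both subtrees are already balanced."""
    update(v)
    if h(v.left) - h(v.right) > 1:            # left side too tall
        if h(v.left.left) < h(v.left.right):  # zig-zag case: straighten first
            v.left = rotate_left(v.left)
        return rotate_right(v)
    if h(v.right) - h(v.left) > 1:            # right side too tall
        if h(v.right.right) < h(v.right.left):
            v.right = rotate_right(v.right)
        return rotate_left(v)
    return v


def add(v: Optional[Node], key: int) -> Node:
    """BST insertion that rebalances every ancestor of the new node."""
    if v is None:
        return Node(key)
    if key < v.key:
        v.left = add(v.left, key)
    elif key > v.key:
        v.right = add(v.right, key)
    return rebalance(v)


def in_order(v: Optional[Node]) -> List[int]:
    return [] if v is None else in_order(v.left) + [v.key] + in_order(v.right)


root = None
for k in range(1, 8):  # sorted input would degenerate a plain BST into a path
    root = add(root, k)
```

Each call to `add` does constant work per node on the search path, so under these assumptions an insertion costs \(O(h)\) time in total.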