# Lecture 11: Binary Search Trees & Balance

## Overview

1. Binary Trees
2. Binary Search Trees
3. (Height) Balanced Binary Trees

## Last Time: Binary Tree

A binary tree consists of

• a collection of nodes
• a distinguished node called the root
• each node has
  • a parent (null only for the root)
  • a left child
  • a right child

Constraints:

1. if u is a child of v, then v is u’s parent
2. every node has the root as an ancestor
  • $\implies$ no cycles!
  • every node is a descendant of the root

## Tree Terminology

• a node without children is a leaf
• a node that is not a leaf is internal
• depth of a node is its distance to the root
• depth of tree is max depth of any node

## Height

The height of a node is its maximum distance to a descendant leaf

• height of leaf = 0
• height of an internal node = 1 + maximum height of its children
• height of tree = height of root
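
The recursive definition above translates almost directly into code. The following is a minimal sketch, not the course's implementation: the `Node` class and `height` function are illustrative names, parent pointers are omitted for brevity, and an empty subtree is assumed to have height -1 so that a leaf comes out at height 0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary tree node (parent pointer omitted for brevity)."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def height(node: Optional[Node]) -> int:
    """Height of the subtree rooted at node.

    Assumed convention: an empty subtree has height -1,
    which makes every leaf have height 0.
    """
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


# A small tree: root 1 with children 2 and 3; node 3 has a left child 4.
tree = Node(1, Node(2), Node(3, Node(4)))
print(height(tree))  # 2
```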

## So Far

• Specified structure of binary trees

• No assumptions about values stored in trees

• Trees are incredibly useful and flexible data structures

• representing hierarchies
• file structure on a computer
• dependencies among method calls
• representing arithmetic expressions

Next up: representing sorted collections

## Binary Search Trees

Assume values stored in nodes are comparable with $<$

• given (the values of) any two nodes $u$ and $v$, exactly one of $u < v$, $v < u$, or $v = u$ holds

A tree is a binary search tree (BST) if for every node $v$:

• if $u$ is a left descendant of $v$, then $u < v$
• if $w$ is a right descendant of $v$, then $w > v$
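
Note that the condition is about all descendants, not just children. One way to check it, sketched below under the assumption of distinct integer values, is to pass down the open interval that each subtree's values must fall in. `Node` and `is_bst` are illustrative names, not part of the lecture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary tree node (parent pointer omitted for brevity)."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_bst(node: Optional[Node],
           lo: Optional[int] = None,
           hi: Optional[int] = None) -> bool:
    """Check that every value in the subtree lies strictly between lo and hi."""
    if node is None:
        return True
    if lo is not None and node.value <= lo:
        return False
    if hi is not None and node.value >= hi:
        return False
    # Left descendants must be smaller than node, right descendants larger.
    return (is_bst(node.left, lo, node.value)
            and is_bst(node.right, node.value, hi))


good = Node(3, Node(2, Node(1)), Node(4, None, Node(5)))
# Every child here is on the correct side of its parent,
# but 4 is a left descendant of 3, so this is not a BST.
bad = Node(3, Node(2, None, Node(4)))
print(is_bst(good), is_bst(bad))  # True False
```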

## Searching a BST

How do we find(x) in a BST? What is the running time of find?
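
One possible answer, as a sketch: start at the root and use the BST property to decide which single subtree can contain x. The names `Node` and `find` are illustrative, and the example tree is built by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary tree node (parent pointer omitted for brevity)."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def find(root: Optional[Node], x: int) -> bool:
    """Return True if x is stored in the BST rooted at root."""
    node = root
    while node is not None:
        if x == node.value:
            return True
        # By the BST property, x can only be on one side of node.
        node = node.left if x < node.value else node.right
    return False


# Hand-built BST:   3
#                  / \
#                 2   4
#                /     \
#               1       5
example = Node(3, Node(2, Node(1)), Node(4, None, Node(5)))
print(find(example, 5), find(example, 6))  # True False
```

The loop moves down one level per iteration, so the number of nodes it visits is at most the height of the tree plus one.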

## Adding to a BST

How do we add(x) to a BST? What is the running time of add?

## Removing From a BST I

How to remove(y)

… if y is a leaf?

## Removing From a BST II

How to remove(y)

… if y has one child?

## Removing From a BST III

How to remove(y)

… if y has two children?

## Running Times

If $T$ is a tree of height $h$, what is the running time of…

• find?
• add?
• remove?

## Sequence of Ops Determines Structure

Consider $S = \{1, 2, 3, 4, 5\}$. What tree do we get if we add in order $3, 2, 4, 5, 1$? What about $2, 5, 1, 3, 4$?
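
To check an answer, the two insertion orders can be replayed in code. This is a small sketch with illustrative names (`Node`, `add`, `build`, `height`), using plain leaf insertion with no rebalancing.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Node:
    """Hypothetical binary tree node (parent pointer omitted for brevity)."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def add(node: Optional[Node], x: int) -> Node:
    """Plain BST insertion: attach x as a new leaf."""
    if node is None:
        return Node(x)
    if x < node.value:
        node.left = add(node.left, x)
    elif x > node.value:
        node.right = add(node.right, x)
    return node


def build(values: Iterable[int]) -> Optional[Node]:
    """Add the values one at a time, in the given order."""
    root = None
    for x in values:
        root = add(root, x)
    return root


def height(node: Optional[Node]) -> int:
    """Height with the assumed convention that an empty subtree is -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


print(height(build([3, 2, 4, 5, 1])))  # 2
print(height(build([2, 5, 1, 3, 4])))  # 3
```

The same five values produce trees of different heights: the first order puts 1 and 5 at depth 2, while the second ends with the chain 2, 5, 3, 4.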

## What add Sequence Has Max Height?

Assume elements are $1, 2, 3,\ldots,n$…
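
One sequence that reaches the maximum is increasing order: each new element is larger than everything already in the tree, so it becomes the right child of the previous one and the tree degenerates into a path. The sketch below checks this for a small $n$; `Node`, `add`, `build`, and `height` are illustrative names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Node:
    """Hypothetical binary tree node (parent pointer omitted for brevity)."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def add(node: Optional[Node], x: int) -> Node:
    """Plain BST insertion: attach x as a new leaf."""
    if node is None:
        return Node(x)
    if x < node.value:
        node.left = add(node.left, x)
    elif x > node.value:
        node.right = add(node.right, x)
    return node


def build(values: Iterable[int]) -> Optional[Node]:
    """Add the values one at a time, in the given order."""
    root = None
    for x in values:
        root = add(root, x)
    return root


def height(node: Optional[Node]) -> int:
    """Height with the assumed convention that an empty subtree is -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


n = 10
print(height(build(range(1, n + 1))))  # 9, i.e. n - 1 (increasing order)
print(height(build(range(n, 0, -1))))  # 9, i.e. n - 1 (decreasing order)
```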

## Have We Failed?

If:

1. operation sequence determines height, and
2. height can be as large as $n-1$

Then:

• add, remove, find are $O(n)$ in the worst case

For find, this is worse than a sorted array, where binary search takes $O(\log n)$ time

What can we do about it?

## Restructuring Trees

Idea. When we modify the tree (add or remove), restructure the tree to maintain balance

• use the fact that there are many valid BSTs for the same set of values

Challenges.

1. What structure do we want?
  • how does the structure guarantee efficient operations?
2. How do we check the structure, and how do we modify the tree to maintain it?
3. Can we restructure the tree efficiently?

## Coming Up

A binary tree $T$ is height balanced, or an AVL tree (after Adelson-Velsky & Landis), if for every node $v$ with children $u$ and $w$, we have $\vert h(u) - h(w)\vert \leq 1$.
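
The balance condition can be checked in a single pass that computes heights bottom-up. This is a sketch with illustrative names (`Node`, `check`, `is_height_balanced`); it assumes a missing child counts as an empty subtree of height -1, so a node with one child is balanced only if that child is a leaf.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """Hypothetical binary tree node (parent pointer omitted for brevity)."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def check(node: Optional[Node]) -> Tuple[bool, int]:
    """Return (is balanced, height) for the subtree rooted at node."""
    if node is None:
        return True, -1  # assumed convention: empty subtree has height -1
    left_ok, left_h = check(node.left)
    right_ok, right_h = check(node.right)
    balanced = left_ok and right_ok and abs(left_h - right_h) <= 1
    return balanced, 1 + max(left_h, right_h)


def is_height_balanced(root: Optional[Node]) -> bool:
    return check(root)[0]


path = Node(1, None, Node(2, None, Node(3)))  # heights -1 and 1 at the root
full = Node(2, Node(1), Node(3))              # heights 0 and 0 at the root
print(is_height_balanced(path), is_height_balanced(full))  # False True
```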

We’ll show:

1. Any AVL tree with $n$ nodes has height $h = O(\log n)$
2. After a single add/remove operation, the AVL property can be restored in $O(\log n)$ time

As a result

• AVL trees implement add, remove, and find for sorted sets all in time $O(\log n)$