Lecture 11: Binary Search Trees & Balance
Overview
- Binary Trees
- Binary Search Trees
- (Height) Balanced Binary Trees
Last Time: Binary Tree
A binary tree consists of
- a collection of nodes
- a distinguished node called the root
- each node has
  - a parent (null only for root)
  - a left child
  - a right child
Constraints:
- if $u$ is a child of $v$, then $v$ is $u$'s parent
- every node has the root as an ancestor
- $\implies$ no cycles!
- every node is a descendant of the root
Tree Terminology
- a node without children is a leaf
- a node that is not a leaf is internal
- depth of a node is its distance to the root
- depth of tree is max depth of any node
Height
The height of a node is its maximum distance to a descendant leaf
- height of leaf = 0
- height of internal node is 1 + maximum height of children
- height of tree = height of root
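The recursive definition above translates almost directly into code. The following is a minimal sketch, assuming a hypothetical `Node` class with `value`, `left`, and `right` fields (these names are illustrative, not from the lecture) and the assumed convention that an empty subtree has height $-1$, which makes a leaf come out at height 0.

```python
class Node:
    """Hypothetical minimal node: a value plus optional left/right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def height(node):
    """Height of the subtree rooted at node.

    Assumed convention: an empty subtree (None) has height -1,
    so a leaf gets 1 + max(-1, -1) = 0.
    """
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))
```

Each node is visited once, so computing the height of a tree with $n$ nodes this way takes $O(n)$ time.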
So Far
- Specified structure of binary trees
- No assumptions about values stored in trees
- Trees are incredibly useful and flexible data structures
  - represent hierarchies
  - file structure in computer
  - dependency of method calls
  - representing arithmetic expressions
Next up: representing sorted collections
Binary Search Trees
Assume values stored in nodes are comparable with $<$
- given (the values of) any two nodes $u$ and $v$, exactly one of $u < v$, $v < u$, or $u = v$ holds
A tree is a binary search tree (BST) if for every node $v$:
- if $u$ is a left descendant of $v$, then $u < v$
- if $w$ is a right descendant of $v$, then $w > v$
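One way to check the BST property is to pass down the range of values each subtree is allowed to contain. This is a sketch under assumptions: the `Node` class and the function name `is_bst` are hypothetical, and values are assumed to be distinct and comparable with `<`.

```python
class Node:
    """Hypothetical minimal node: a value plus optional left/right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def is_bst(node, lo=None, hi=None):
    """Return True if every value in the subtree lies strictly between lo and hi
    (None means unbounded) and the BST property holds at every node."""
    if node is None:
        return True
    if lo is not None and not (lo < node.value):
        return False
    if hi is not None and not (node.value < hi):
        return False
    # Left descendants must be smaller than node.value, right ones larger.
    return (is_bst(node.left, lo, node.value)
            and is_bst(node.right, node.value, hi))
```

Note that comparing each node only with its own children is not enough: the condition is about all descendants, which is why the bounds are carried down the recursion.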
Searching a BST
How to `find(x)` in a BST? What is the running time of `find`?
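One possible answer, as a sketch: start at the root and move left or right depending on how `x` compares with the current value. The `Node` class and the choice to return the node (or `None`) are assumptions made for illustration.

```python
class Node:
    """Hypothetical minimal node: a value plus optional left/right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def find(root, x):
    """Return the node holding x, or None if x is not in the tree."""
    node = root
    while node is not None:
        if x < node.value:
            node = node.left      # x can only be in the left subtree
        elif node.value < x:
            node = node.right     # x can only be in the right subtree
        else:
            return node           # neither smaller nor larger: found
    return None
```

The loop visits at most one node per level, so in a tree of height $h$ it performs at most $h + 1$ iterations.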
Adding to a BST
How to `add(x)` in a BST? What is the running time of `add`?
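A sketch of one way to add: search for `x` as in `find`, and attach a new leaf at the point where the search falls off the tree. The `Node` class is hypothetical, parent pointers are omitted for brevity, and ignoring duplicates is an assumed policy (appropriate for a set).

```python
class Node:
    """Hypothetical minimal node: a value plus optional left/right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def add(root, x):
    """Insert x and return the root of the tree; duplicates are ignored."""
    if root is None:
        return Node(x)            # empty tree: x becomes the root
    node = root
    while True:
        if x < node.value:
            if node.left is None:
                node.left = Node(x)
                return root
            node = node.left
        elif node.value < x:
            if node.right is None:
                node.right = Node(x)
                return root
            node = node.right
        else:
            return root           # x already present
```

As with `find`, the work is proportional to the length of one root-to-leaf path, so at most $h + 1$ nodes are visited in a tree of height $h$.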
Removing From a BST I
How to `remove(y)`…
… if `y` is a leaf?
Removing From a BST II
How to `remove(y)`…
… if `y` has one child?
Removing From a BST III
How to `remove(y)`…
… if `y` has two children?
What is `remove` Running Time?
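The three cases (leaf, one child, two children) can be handled in one routine. The sketch below is one common approach, not necessarily the one developed in lecture: in the two-child case it copies in the in-order successor (the smallest value in the right subtree) and then removes that successor. The `Node` class is hypothetical and parent pointers are omitted.

```python
class Node:
    """Hypothetical minimal node: a value plus optional left/right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def remove(node, y):
    """Remove y from the subtree rooted at node; return the new subtree root."""
    if node is None:
        return None                       # y not present: nothing to do
    if y < node.value:
        node.left = remove(node.left, y)
    elif node.value < y:
        node.right = remove(node.right, y)
    else:
        # Leaf or one child: splice the node out (returns None for a leaf).
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        # Two children: find the in-order successor, the leftmost node
        # of the right subtree, which has no left child.
        succ = node.right
        while succ.left is not None:
            succ = succ.left
        node.value = succ.value
        node.right = remove(node.right, succ.value)
    return node
```

Usage is `root = remove(root, y)`, since the root itself may be the node removed. Both the search for `y` and the search for the successor walk down a single path, so the total work is $O(h)$ for a tree of height $h$.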
How to Print Elements in Order?
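One standard answer is an in-order traversal: visit the left subtree, then the node itself, then the right subtree. A minimal sketch, assuming a hypothetical `Node` class; it yields the values rather than printing them so that the caller decides what to do with each one.

```python
class Node:
    """Hypothetical minimal node: a value plus optional left/right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def in_order(node):
    """Yield the values of the subtree rooted at node in increasing order."""
    if node is not None:
        yield from in_order(node.left)    # everything smaller comes first
        yield node.value
        yield from in_order(node.right)   # everything larger comes last
```

For example, `for v in in_order(root): print(v)` prints the elements in sorted order when `root` is a BST.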
Running Times
If $T$ is a tree of height $h$, what is the running time of…
Sequence of Ops Determines Structure
Consider $S = \{1, 2, 3, 4, 5\}$. What tree do we get if we add in order $3, 2, 4, 5, 1$? What about $2, 5, 1, 3, 4$?
What `add` Sequence Has Max Height?
Assume elements are $1, 2, 3,\ldots,n$…
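To experiment with how insertion order affects shape, one can build trees from different sequences and compare their heights. This is a sketch with hypothetical helper names (`Node`, `add`, `height`, `build`); empty subtrees are assigned height $-1$ by assumption.

```python
class Node:
    """Hypothetical minimal node: a value plus optional left/right children."""
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def add(node, x):
    """Recursive BST insertion; returns the root of the updated subtree."""
    if node is None:
        return Node(x)
    if x < node.value:
        node.left = add(node.left, x)
    elif node.value < x:
        node.right = add(node.right, x)
    return node


def height(node):
    """Height with the assumed convention that an empty subtree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def build(values):
    """Add the values one at a time, in the given order, to an empty tree."""
    root = None
    for x in values:
        root = add(root, x)
    return root


print(height(build([3, 2, 4, 5, 1])))   # 2
print(height(build([2, 5, 1, 3, 4])))   # 3
print(height(build([1, 2, 3, 4, 5])))   # 4: every node has only a right child
```

Adding $1, 2, \ldots, n$ in increasing order produces a chain of right children with height $n - 1$, the largest height possible for $n$ nodes.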
Have We Failed?
If:
- operation sequence determines height, and
- height can be as large as $n-1$
Then:
- `add`, `remove`, `find` are $O(n)$ in the worst case
This is worse than a sorted array (where `find` is $O(\log n)$)
What can we do about it?
Restructuring Trees
Idea. When we modify the tree (`add` or `remove`), restructure the tree to maintain balance
- use fact that there are many valid BSTs
Challenges.
- What structure do we want?
- How does structure guarantee efficient operations?
- How do we check structure/modify to maintain structure?
- Can we restructure tree efficiently?
Coming Up
A binary tree $T$ is height balanced or an AVL tree (Adelson-Velsky & Landis) if for every node $v$ with children $u$ and $w$, we have $\vert h(u) - h(w)\vert \leq 1$ (treating a missing child as having height $-1$).
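The definition can be checked directly by computing heights bottom-up. The following is a sketch, assuming a hypothetical `Node` class and the convention that a missing child has height $-1$; it checks only the balance condition, not the BST ordering, and it is not the restructuring algorithm itself.

```python
class Node:
    """Hypothetical minimal node: a value plus optional left/right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def is_height_balanced(root):
    """Return True if at every node the children's heights differ by at most 1."""
    def check(node):
        # Returns (balanced, height) for the subtree rooted at node.
        if node is None:
            return True, -1       # assumed convention for a missing child
        left_ok, left_h = check(node.left)
        right_ok, right_h = check(node.right)
        ok = left_ok and right_ok and abs(left_h - right_h) <= 1
        return ok, 1 + max(left_h, right_h)

    return check(root)[0]
```

For instance, a three-node chain of right children is not height balanced: at its root the missing left child has height $-1$ and the right child has height $1$.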
We’ll show:
- Any AVL tree with $n$ nodes has height $h = O(\log n)$
- After a single add/remove operation, AVL property can be restored in $O(\log n)$ time
As a result
- AVL trees implement `add`, `remove`, and `find` for sorted sets all in time $O(\log n)$