# Lecture 11: Binary (Search) Trees

## Overview

1. Review of Binary Search
2. Binary Trees
3. Binary Search Trees

## Last Time

Searching a Sorted Array: find(18)

       0  1  2  3  4   5   6   7   8   9   10  11  12  13  14  15

arr = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53]


## Binary Search Method

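    // Search sorted arr for x among indices i (inclusive) to j (exclusive).
    // Assumes i < j and arr[i] <= x. Returns the largest stored value <= x,
    // so x is in the array exactly when the returned value equals x.
    static int binarySearch(int[] arr, int x, int i, int j) {
        // only one candidate index remains
        if (j == i + 1) { return arr[i]; }

        int k = (i + j) / 2; // midpoint of the current range

        if (arr[k] <= x) {
            return binarySearch(arr, x, k, j); // answer lies at index k or later
        } else {
            return binarySearch(arr, x, i, k); // answer lies before index k
        }
    }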
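To see the method in action on find(18), here is a small sketch. The class name `BinarySearchDemo` and the `main` method are assumptions added for illustration; the recursion itself follows the method above. Note that it returns the largest stored value that is at most x, so the caller compares the result with x to decide whether x is present.

```java
class BinarySearchDemo {
    // Assumes arr is sorted, 0 <= i < j <= arr.length, and arr[i] <= x.
    static int binarySearch(int[] arr, int x, int i, int j) {
        if (j == i + 1) {
            return arr[i];
        }
        int k = (i + j) / 2;
        if (arr[k] <= x) {
            return binarySearch(arr, x, k, j);
        } else {
            return binarySearch(arr, x, i, k);
        }
    }

    public static void main(String[] args) {
        int[] arr = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53};
        // Indices probed for x = 18: 8, 4, 6, 7.
        int result = binarySearch(arr, 18, 0, arr.length);
        System.out.println(result);       // prints 17
        System.out.println(result == 18); // prints false: 18 is not in arr
    }
}
```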


## Lingering Question

Binary search allows us to find elements in a sorted array quickly:

• $O(\log n)$ time versus $O(n)$ time previously

Sorted arrays are still costly to modify:

• add method is $O(n)$, worst case
• remove method is $O(n)$, worst case
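The cost of add can be made concrete with a short sketch. The class `SortedArrayAddSketch` and the convention of passing the current `size` separately from the array's capacity are assumptions for illustration. The method returns the number of elements it had to shift, which is n when x is smaller than everything already stored.

```java
import java.util.Arrays;

class SortedArrayAddSketch {
    // Inserts x into the sorted prefix arr[0..size) and returns how many
    // elements were shifted. Assumes size < arr.length (one free slot).
    static int add(int[] arr, int size, int x) {
        int pos = size;
        // Move every element larger than x one slot to the right.
        while (pos > 0 && arr[pos - 1] > x) {
            arr[pos] = arr[pos - 1];
            pos--;
        }
        arr[pos] = x;
        return size - pos;
    }

    public static void main(String[] args) {
        int[] arr = {2, 3, 5, 7, 13, 0, 0, 0}; // first 5 slots in use
        System.out.println(add(arr, 5, 11));   // prints 1 (only 13 moves)
        System.out.println(add(arr, 6, 1));    // prints 6 (everything moves)
        System.out.println(Arrays.toString(arr)); // [1, 2, 3, 5, 7, 11, 13, 0]
    }
}
```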

Question. Can we perform all operations efficiently?

## Comparing Arrays and Linked Lists

Array:

• gives $O(1)$ access by index
• allows “jumping” for binary search
• cost of “random” access: modifications move elements
• add/remove are $O(n)$

Linked list:

• searching is $O(n)$ worst case
• once location is determined, modification is $O(1)$

What is the array access pattern of binary search?

• which indices are accessed first?
• which indices are accessed second?
       0  1  2  3  4   5   6   7   8   9   10  11  12  13  14  15

arr = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53]


## Observation

Binary search accesses indices hierarchically

• index $n/2$ is always first
• indices $n/4$, $3n/4$ are always second
• indices $n/8$, $3n/8$, $5n/8$, $7n/8$ are always third
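The hierarchy of probes can be listed programmatically. The sketch below is an illustration only: `ProbeLevelsSketch` and `probeLevels` are hypothetical names, and the midpoint rule `k = (i + j) / 2` on half-open ranges is assumed to match the binary search method from earlier in the lecture.

```java
import java.util.ArrayList;
import java.util.List;

class ProbeLevelsSketch {
    // levels.get(d) holds the indices binary search may probe as its
    // (d + 1)-th access on an array of length n.
    static List<List<Integer>> probeLevels(int n) {
        List<List<Integer>> levels = new ArrayList<>();
        collect(0, n, 0, levels);
        return levels;
    }

    private static void collect(int i, int j, int depth, List<List<Integer>> levels) {
        if (j <= i + 1) {
            return; // a range of size at most 1 is never split further
        }
        if (levels.size() == depth) {
            levels.add(new ArrayList<>());
        }
        int k = (i + j) / 2;
        levels.get(depth).add(k);
        collect(i, k, depth + 1, levels); // ranges reached when arr[k] > x
        collect(k, j, depth + 1, levels); // ranges reached when arr[k] <= x
    }

    public static void main(String[] args) {
        for (List<Integer> level : probeLevels(16)) {
            System.out.println(level);
        }
        // [8]
        // [4, 12]
        // [2, 6, 10, 14]
        // [1, 3, 5, 7, 9, 11, 13, 15]
    }
}
```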

Idea:

• store elements hierarchically, rather than linearly
• arrays store elements consecutively
• maintaining sortedness means moving entries around for each add/remove

Want: more flexibility to add/remove elements

## Hierarchical Picture

       0  1  2  3  4   5   6   7   8   9   10  11  12  13  14  15

arr = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53]


## More Formally

Like a linked list, store elements in associated nodes

Unlike a linked list, references no longer form a path

Each node has:

• a left child
• a right child
• a parent
• a comparable value

Keep track of root node (top of hierarchy)
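One possible way to write such a node in Java is sketched below. The names `TreeNodeSketch`, `Node`, `setLeft`, and `setRight` are assumptions for illustration, not the lecture's own code; `null` stands for a missing child or parent.

```java
class TreeNodeSketch {
    // A node stores a comparable value and three references.
    static class Node<E extends Comparable<E>> {
        E value;
        Node<E> parent; // null for the root
        Node<E> left;   // null if there is no left child
        Node<E> right;  // null if there is no right child

        Node(E value) {
            this.value = value;
        }
    }

    // Attach child on the left of parent, keeping both references in sync.
    static <E extends Comparable<E>> void setLeft(Node<E> parent, Node<E> child) {
        parent.left = child;
        if (child != null) {
            child.parent = parent;
        }
    }

    // Attach child on the right of parent, keeping both references in sync.
    static <E extends Comparable<E>> void setRight(Node<E> parent, Node<E> child) {
        parent.right = child;
        if (child != null) {
            child.parent = parent;
        }
    }

    public static void main(String[] args) {
        Node<Integer> root = new Node<>(7); // we keep track of the root
        setLeft(root, new Node<>(3));
        setRight(root, new Node<>(15));
        System.out.println(root.left.value);           // prints 3
        System.out.println(root.right.parent == root); // prints true
    }
}
```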

## Sorted Set Example

$S = \{2, 3, 5, 7, 13, 15, 17, 19\}$

## How to Find?

Given previous structure, how do we find(11)?

## How to Add?

Given previous structure, how do we add(11)?

## How to Remove?

Given previous structure, how do we remove(2)?

## How to Remove?

Given previous structure, how do we remove(15)?

## Formalizing Things

A binary tree consists of

• a collection of nodes
• a distinguished node called the root
• each node has
  • a parent (null only for root)
  • a left child
  • a right child

Constraints:

1. if u is a child of v, then v is u’s parent
2. every node has the root as an ancestor
   • $\implies$ no cycles!
   • every node is a descendant of the root
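As a rough illustration of the constraints, the sketch below checks constraint 1 at every node reachable from the root. The names `TreeConstraintSketch` and `isWellFormed` are hypothetical, and the extra test that a node's two children are distinct is an added assumption rather than something stated in the lecture.

```java
class TreeConstraintSketch {
    static class Node {
        Node parent, left, right;
    }

    // True if the root has no parent and every child reachable from the
    // root names the node above it as its parent (constraint 1).
    static boolean isWellFormed(Node root) {
        return root == null || (root.parent == null && checkChildren(root));
    }

    private static boolean checkChildren(Node v) {
        if (v.left != null && v.left == v.right) {
            return false; // the same node used as both children
        }
        for (Node child : new Node[] {v.left, v.right}) {
            if (child == null) {
                continue;
            }
            if (child.parent != v) {
                return false; // violates constraint 1
            }
            if (!checkChildren(child)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Node root = new Node();
        Node a = new Node();
        root.left = a;
        a.parent = root;
        System.out.println(isWellFormed(root)); // prints true

        a.left = root; // a child reference back to the root forms a cycle
        System.out.println(isWellFormed(root)); // prints false
    }
}
```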

## Tree Terminology

• a node without children is a leaf
• a node that is not a leaf is internal
• depth of a node is its distance to the root
• depth of tree is max depth of any node
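Depth can be computed by following parent references. This is a minimal sketch with hypothetical names (`DepthSketch`, `depth`, `treeDepth`); returning -1 for an empty tree is a convention assumed here, not taken from the lecture.

```java
class DepthSketch {
    static class Node {
        Node parent, left, right;
    }

    // Depth of v: number of parent links between v and the root.
    static int depth(Node v) {
        int d = 0;
        while (v.parent != null) {
            v = v.parent;
            d++;
        }
        return d;
    }

    // Largest depth of any node in the subtree rooted at v, where d is the
    // depth of v itself. Call as treeDepth(root, 0).
    static int treeDepth(Node v, int d) {
        if (v == null) {
            return d - 1; // stepped past a node of depth d - 1
        }
        return Math.max(treeDepth(v.left, d + 1), treeDepth(v.right, d + 1));
    }

    public static void main(String[] args) {
        Node root = new Node();
        Node a = new Node();
        Node b = new Node();
        root.left = a;  a.parent = root;
        a.right = b;    b.parent = a;
        System.out.println(depth(root));         // prints 0
        System.out.println(depth(b));            // prints 2
        System.out.println(treeDepth(root, 0));  // prints 2
    }
}
```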

## Height

The height of a node is its maximum distance to a descendant leaf

• height of leaf = 0
• height of internal node is 1 + maximum height of children
• height of tree = height of root
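The recursive definition translates almost directly into code. In this sketch (names `HeightSketch` and `height` are hypothetical), a missing child is given height -1, an assumed convention that makes a leaf come out with height 0 without a special case.

```java
class HeightSketch {
    static class Node {
        Node left, right;
    }

    // Height of v: 1 + the maximum height of its children, where a missing
    // child counts as -1 so that a leaf has height 0.
    static int height(Node v) {
        if (v == null) {
            return -1;
        }
        return 1 + Math.max(height(v.left), height(v.right));
    }

    public static void main(String[] args) {
        Node root = new Node();
        Node a = new Node();
        Node b = new Node();
        root.left = a;
        a.right = b;
        System.out.println(height(b));    // prints 0 (leaf)
        System.out.println(height(a));    // prints 1
        System.out.println(height(root)); // prints 2 (height of the tree)
    }
}
```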

## So Far

• Specified structure of binary trees

• No assumptions about values stored in trees

• Trees are incredibly useful and flexible data structures
  • representing hierarchies
  • file structure on a computer
  • dependencies among method calls
  • representing arithmetic expressions

Next up: representing sorted collections

## Binary Search Trees

Assume values stored in nodes are comparable with $<$

• given (values of) any two nodes $u$ and $v$, have $u < v$, $v < u$, or $v = u$

A tree is a binary search tree (BST) if for every node $v$:

• if $u$ is a left descendant of $v$, then $u < v$
• if $w$ is a right descendant of $v$, then $w > v$
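Because the condition speaks of descendants and not just children, checking each node against its two children is not enough. One common way to test the property, sketched here with hypothetical names (`BstCheckSketch`, `isBst`) and `int` values for simplicity, is to pass down the open interval that each subtree's values must fall in.

```java
class BstCheckSketch {
    static class Node {
        int value;
        Node left, right;

        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isBst(Node root) {
        return isBst(root, null, null);
    }

    // Every value in the subtree at v must lie strictly between lo and hi;
    // a null bound means "no bound on that side".
    private static boolean isBst(Node v, Integer lo, Integer hi) {
        if (v == null) {
            return true;
        }
        if (lo != null && v.value <= lo) {
            return false;
        }
        if (hi != null && v.value >= hi) {
            return false;
        }
        return isBst(v.left, lo, v.value) && isBst(v.right, v.value, hi);
    }

    public static void main(String[] args) {
        Node good = new Node(7,
                new Node(3, new Node(2, null, null), new Node(5, null, null)),
                new Node(15, new Node(13, null, null), new Node(17, null, null)));
        System.out.println(isBst(good)); // prints true

        // 13 is a right child of 3 (fine locally) but a left descendant of 7.
        Node bad = new Node(7, new Node(3, null, new Node(13, null, null)), null);
        System.out.println(isBst(bad));  // prints false
    }
}
```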

## Searching a BST

How to find(x) in a BST? What is the complexity of find?

How to add(x) in a BST? What is the complexity of add?
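One possible answer to both questions is sketched below, assuming `int` values, no duplicates, and hypothetical names (`BstSearchSketch`, `Node`). Both methods walk a single path down from the root, so each takes time proportional to the height of the tree, which can be as large as $n - 1$ if the tree is badly unbalanced.

```java
class BstSearchSketch {
    static class Node {
        int value;
        Node parent, left, right;

        Node(int value) {
            this.value = value;
        }
    }

    Node root;

    // Returns the node storing x, or null if x is not in the tree.
    Node find(int x) {
        Node v = root;
        while (v != null && v.value != x) {
            v = (x < v.value) ? v.left : v.right; // go left for smaller values
        }
        return v;
    }

    // Adds x as a new leaf where the search for x falls off the tree.
    // Returns false if x was already present.
    boolean add(int x) {
        if (root == null) {
            root = new Node(x);
            return true;
        }
        Node v = root;
        while (v.value != x) {
            Node next = (x < v.value) ? v.left : v.right;
            if (next == null) {
                Node leaf = new Node(x);
                leaf.parent = v;
                if (x < v.value) {
                    v.left = leaf;
                } else {
                    v.right = leaf;
                }
                return true;
            }
            v = next;
        }
        return false;
    }

    public static void main(String[] args) {
        BstSearchSketch t = new BstSearchSketch();
        for (int x : new int[] {7, 3, 15, 2, 5, 13, 17, 19}) {
            t.add(x);
        }
        System.out.println(t.find(11) == null);     // prints true
        System.out.println(t.add(11));              // prints true
        System.out.println(t.find(11).parent.value); // prints 13
    }
}
```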

## Removing From a BST I

How to remove(y)

… if y is a leaf?

## Removing From a BST II

How to remove(y)

… if y has one child?

## Removing From a BST III

How to remove(y)

… if y has two children?
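The three cases can be handled in one recursive method. This is a sketch of one common approach, not necessarily the one developed in lecture: it omits parent references for brevity, uses `int` values, and in the two-children case copies the next largest value into the node and then removes that value from the right subtree. All names (`BstRemoveSketch`, `insert`, `remove`, `inorder`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class BstRemoveSketch {
    static class Node {
        int value;
        Node left, right;

        Node(int value) {
            this.value = value;
        }
    }

    // Standard BST insertion; returns the root of the updated subtree.
    static Node insert(Node v, int x) {
        if (v == null) return new Node(x);
        if (x < v.value) v.left = insert(v.left, x);
        else if (x > v.value) v.right = insert(v.right, x);
        return v;
    }

    // Removes y from the subtree rooted at v; returns the subtree's new root.
    static Node remove(Node v, int y) {
        if (v == null) return null;             // y is not in the tree
        if (y < v.value) {
            v.left = remove(v.left, y);
        } else if (y > v.value) {
            v.right = remove(v.right, y);
        } else if (v.left == null) {
            return v.right;                     // leaf, or only a right child
        } else if (v.right == null) {
            return v.left;                      // only a left child
        } else {
            Node s = v.right;                   // two children: next largest is
            while (s.left != null) s = s.left;  // the leftmost node on the right
            v.value = s.value;                  // copy its value into v
            v.right = remove(v.right, s.value); // then remove its old node
        }
        return v;
    }

    // Appends the values of the subtree at v to out in sorted order.
    static List<Integer> inorder(Node v, List<Integer> out) {
        if (v != null) {
            inorder(v.left, out);
            out.add(v.value);
            inorder(v.right, out);
        }
        return out;
    }

    public static void main(String[] args) {
        Node root = null;
        for (int x : new int[] {7, 3, 15, 2, 5, 13, 17, 19}) {
            root = insert(root, x);
        }
        root = remove(root, 2);  // leaf
        root = remove(root, 17); // one child (19)
        root = remove(root, 7);  // two children: 13 takes its place
        System.out.println(root.value);                       // prints 13
        System.out.println(inorder(root, new ArrayList<>())); // [3, 5, 13, 15, 19]
    }
}
```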

## How to Find Next Largest?

Given node $v$ in BST $T$, what node stores the next largest value?
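One way to answer this uses the parent references. If $v$ has a right child, the next largest value is the leftmost node of the right subtree; otherwise we climb until we move up out of a left subtree. The sketch below assumes distinct `int` values and uses hypothetical names (`BstSuccessorSketch`, `successor`).

```java
class BstSuccessorSketch {
    static class Node {
        int value;
        Node parent, left, right;

        Node(int value) {
            this.value = value;
        }
    }

    Node root;

    // Plain BST insertion that maintains parent references.
    // Assumes x is not already in the tree. Returns the new node.
    Node add(int x) {
        Node leaf = new Node(x);
        if (root == null) {
            root = leaf;
            return leaf;
        }
        Node v = root;
        while (true) {
            if (x < v.value) {
                if (v.left == null) { v.left = leaf; break; }
                v = v.left;
            } else {
                if (v.right == null) { v.right = leaf; break; }
                v = v.right;
            }
        }
        leaf.parent = v;
        return leaf;
    }

    // Returns the node with the next largest value after v, or null if v
    // stores the maximum.
    static Node successor(Node v) {
        if (v.right != null) {
            Node s = v.right;
            while (s.left != null) {
                s = s.left;
            }
            return s;
        }
        Node child = v;
        Node p = v.parent;
        while (p != null && child == p.right) { // still inside a right subtree
            child = p;
            p = p.parent;
        }
        return p;
    }

    public static void main(String[] args) {
        BstSuccessorSketch t = new BstSuccessorSketch();
        Node seven = t.add(7);
        t.add(3);
        t.add(15);
        t.add(2);
        Node five = t.add(5);
        t.add(13);
        t.add(17);
        Node nineteen = t.add(19);
        System.out.println(successor(seven).value);      // prints 13
        System.out.println(successor(five).value);       // prints 7
        System.out.println(successor(nineteen) == null); // prints true
    }
}
```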