## Overview

1. Sorted Sets
2. Binary Search
3. Efficiency of Binary Search
4. Implementing Binary Search

## Last Time

Thought experiment: searching ordered vs unordered sets

• ordered is easier because you know if desired element is before or after current location
• order only helps if you have “random access” (e.g., array)

## Binary Search Idea

Assume:

• elements can be accessed by index (e.g., array)
• elements are sorted in increasing order

To search for a value x

1. Look at the middle index i of the array

2. If arr[i] is smaller than x, search the right half of the array; otherwise search the left half

3. Recursively search the half of the array determined in step 2

• compare x to midpoint of sub-interval
• search left or right half of sub-interval depending on relative value of x and value found

## Example: find(51)

       0  1  2  3  4   5   6   7   8   9   10  11  12  13  14  15

arr = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53]


## Binary Search, More Formally

A Single Step:

• determine next sub-interval to search

Input:

• x element to be found
• arr a sorted array storing elements
• i, j indices with i < j
• search for x in range arr[i], arr[i+1], ..., arr[j-1]

Next Step:

• if i = j - 1, determine if x is at arr[i]
• otherwise pick next interval i', j' to search

## Binary Search Method

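// Search sorted arr for x in the half-open range arr[i], ..., arr[j-1] (requires i < j).
// Returns the element at which the search stops; the caller compares it to x.
int binarySearch(int[] arr, int x, int i, int j) {
    // base case: a single element remains
    if (j == i + 1) { return arr[i]; }
    int k = (i + j) / 2;  // midpoint of the current range
    if (arr[k] <= x) {
        return binarySearch(arr, x, k, j);  // x can only be in arr[k], ..., arr[j-1]
    } else {
        return binarySearch(arr, x, i, k);  // x can only be in arr[i], ..., arr[k-1]
    }
}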
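Because the method above returns the element where the search stops rather than a yes/no answer, a caller still has to compare that element with x. The following is a minimal sketch of such a caller; the class name `BinarySearchContains` and the wrapper `contains` are hypothetical names introduced here for illustration, and the empty-array check is an added assumption since the recursive method requires i < j.

```java
class BinarySearchContains {
    // Recursive search over the half-open range arr[i], ..., arr[j-1] (requires i < j).
    static int binarySearch(int[] arr, int x, int i, int j) {
        if (j == i + 1) { return arr[i]; }
        int k = (i + j) / 2;
        if (arr[k] <= x) {
            return binarySearch(arr, x, k, j);
        } else {
            return binarySearch(arr, x, i, k);
        }
    }

    // Hypothetical wrapper: true exactly when x occurs in the sorted array arr.
    static boolean contains(int[] arr, int x) {
        if (arr.length == 0) { return false; }  // an empty range never reaches the base case
        return binarySearch(arr, x, 0, arr.length) == x;
    }

    public static void main(String[] args) {
        int[] arr = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53};
        System.out.println(contains(arr, 47)); // true
        System.out.println(contains(arr, 51)); // false
    }
}
```

When x is absent, the returned element is the largest element not exceeding x, or arr[0] if every element is larger than x, so the equality test fails in both cases.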


## Example

int binarySearch(int[] arr, int x, int i, int j) {
    if (j == i + 1) { return arr[i]; }
    int k = (i + j) / 2;
    if (arr[k] <= x) { return binarySearch(arr, x, k, j); }
    else { return binarySearch(arr, x, i, k); }
}

       0  1  2  3  4   5   6   7   8   9   10  11  12  13  14  15

arr = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53]


find(51) -> binarySearch(arr, 51, 0, 16)
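To follow this call by hand, it can help to list the sub-intervals the search visits. The sketch below is an iterative restatement of the same interval updates (an assumption made so the intervals are easy to record, not the recursive method itself); the class name `BinarySearchTrace` and the method `trace` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class BinarySearchTrace {
    // Records each half-open interval [i, j) visited while searching for x.
    static List<String> trace(int[] arr, int x) {
        List<String> steps = new ArrayList<>();
        int i = 0, j = arr.length;
        while (j > i + 1) {
            steps.add("[" + i + ", " + j + ")");
            int k = (i + j) / 2;                  // midpoint, as in binarySearch
            if (arr[k] <= x) { i = k; } else { j = k; }
        }
        steps.add("[" + i + ", " + j + ")");      // final interval of size 1
        return steps;
    }

    public static void main(String[] args) {
        int[] arr = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53};
        System.out.println(trace(arr, 51));
        // prints [[0, 16), [8, 16), [12, 16), [14, 16), [14, 15)]
    }
}
```

The search stops at index 14, so the value returned is arr[14] = 47, the largest element that does not exceed 51.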

## Efficiency of Binary Search, Size $n$

• initial call to binarySearch(arr, x, 0, n) has $j - i = n$

• next recursive call is binarySearch(arr, x, i, j) with $j - i = n / 2$ (taking $n$ to be a power of 2 for simplicity)

• each subsequent recursive call cuts size of region in half

• in the $k$th recursive call, $j - i = n / 2^k$

• value returned when $j - i = 1$

• this happens for $k$ satisfying $n / 2^k = 1$

• $\implies 2^k = n$

• $\implies k = \log_2 n$

• Each recursive call takes $O(1)$ time, so overall running time is $O(\log n)$
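The count of halvings in this derivation can be checked numerically. The sketch below is only an illustration: `HalvingCount` and `halvings` are hypothetical names, and the code assumes the search always continues in the larger half, which has size $\lceil (j - i)/2 \rceil$ when $k = (i + j)/2$.

```java
class HalvingCount {
    // Number of times a range of the given size (>= 1) is halved before one element remains.
    // Always follows the larger half, ceil(size / 2), i.e. the worst case.
    static int halvings(int size) {
        int count = 0;
        while (size > 1) {
            size = (size + 1) / 2;  // ceil(size / 2) for positive ints
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(halvings(16));    // 4, since 2^4 = 16
        System.out.println(halvings(1024));  // 10, since 2^10 = 1024
    }
}
```

For sizes that are not powers of 2, the same loop gives $\lceil \log_2 n \rceil$, so the $O(\log n)$ bound is unaffected.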

## How Fast is $O(\log n)$?

Observe:

• $2^{10} \approx 1{,}000 \implies \log_2 (1{,}000) \approx 10$
• $2^{20} \approx 1{,}000{,}000 \implies \log_2 (1{,}000{,}000) \approx 20$
• $2^{30} \approx 1{,}000{,}000{,}000 \implies \log_2 (1{,}000{,}000{,}000) \approx 30$
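These approximations can be confirmed with integer arithmetic. The following sketch computes the smallest $k$ with $2^k \ge n$ using the standard library call `Integer.numberOfLeadingZeros`; the class name `LogTable` and the method `ceilLog2` are hypothetical names chosen for this example.

```java
class LogTable {
    // Smallest k with 2^k >= n, for n >= 1 (the ceiling of log base 2 of n).
    static int ceilLog2(int n) {
        // n - 1 needs exactly (32 - leading zeros) bits; that bit count is the answer.
        return 32 - Integer.numberOfLeadingZeros(n - 1);
    }

    public static void main(String[] args) {
        int[] sizes = {1_000, 1_000_000, 1_000_000_000};
        for (int n : sizes) {
            System.out.println(n + " elements: about " + ceilLog2(n) + " halvings");
        }
        // prints 10, 20, and 30 halvings respectively
    }
}
```

A linear search over a billion elements may inspect all of them, while binary search on the same sorted array needs about 30 comparisons.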

# Implementing Binary Search in Java

## Sorted Set ADT

• all of the functionality of a SimpleUSet
    • add, remove, find
• assumes elements can be compared
    • for any two elements $x, y$, have $x = y$, $x > y$, or $x < y$
• findMin
• findMax
• slightly different behavior
    • find(x) returns the smallest element y in the set that is no smaller than x (null if no such y)
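The find behavior described above matches the contract of `ceiling` in Java's built-in `java.util.TreeSet`, which can serve as a reference point when testing. The sketch below uses `TreeSet` only as an illustration of the expected results; it is not the implementation developed in these notes, and `CeilingDemo` with its static `find` helper is a hypothetical name.

```java
import java.util.Arrays;
import java.util.TreeSet;

class CeilingDemo {
    // Smallest element of set that is no smaller than x, or null if there is none.
    static Integer find(TreeSet<Integer> set, int x) {
        return set.ceiling(x);
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>(Arrays.asList(2, 3, 5, 7, 11));
        System.out.println(find(set, 5));   // 5 (x itself is in the set)
        System.out.println(find(set, 6));   // 7 (next larger element)
        System.out.println(find(set, 12));  // null (nothing is >= 12)
        System.out.println(set.first());    // 2, plays the role of findMin
        System.out.println(set.last());     // 11, plays the role of findMax
    }
}
```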

## The Comparable Interface

To indicate that elements of class E can be compared, E must implement the Comparable<E> interface:

public interface Comparable<T> {
int compareTo(T o);
}


This interface is built into Java!

Interpretation:

• x.compareTo(y) < 0 indicates that x is “smaller than” y
• x.compareTo(y) > 0 indicates that x is “larger than” y
• x.compareTo(y) == 0 indicates that x is semantically equivalent to y
• should have x.compareTo(y) == 0 if and only if x.equals(y)
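As an illustration of these rules for a user-defined type, here is a minimal sketch of a class whose compareTo is consistent with equals. The class `Version` and its fields `major` and `minor` are hypothetical and chosen only for the example.

```java
class Version implements Comparable<Version> {   // hypothetical example class
    private final int major;
    private final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public int compareTo(Version other) {
        // Compare major numbers first; break ties with minor numbers.
        if (major != other.major) { return Integer.compare(major, other.major); }
        return Integer.compare(minor, other.minor);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Version)) { return false; }
        Version other = (Version) o;
        return major == other.major && minor == other.minor;
    }

    @Override
    public int hashCode() {
        return 31 * major + minor;  // equal objects produce equal hash codes
    }
}
```

Here `new Version(1, 2).compareTo(new Version(1, 10))` is negative, and compareTo returns 0 exactly for pairs that equals treats as equal.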

## Example: Integer

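// Simplified sketch of java.lang.Integer
public class Integer implements Comparable<Integer> {
    private int value;

    @Override
    public int compareTo(Integer x) {
        // compare directly: value - x.value can overflow for operands of opposite sign
        return (value < x.value) ? -1 : ((value == x.value) ? 0 : 1);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Integer)) return false;
        Integer x = (Integer) o;
        return (value == x.value);
    }
}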


## Sorted Set Interface

public interface SimpleSSet<E extends Comparable<E>> extends SimpleUSet<E>  {

@Override /* comments explain how find differs from parent method */
E find(E x);

E findMin();

E findMax();
}


## Implementing SimpleSSet

Question. What data structure should we use to store elements?

## How to…

add(x)?

remove(x)?

find(x)?

## Common Functionality

Find the index where x would be located

• unambiguous because set is sorted, elements are unique

Define int getIndex(x) method

How to find(x)?

How to add(x)?

## Observation

Once getIndex is implemented, add, remove, and find can be made to work

• alternative (correct) implementations of getIndex will not affect add/remove/find code
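One way this layering could look is sketched below, with a deliberately simple linear getIndex. This is not the course's ArraySimpleSSet: the class name `SortedListSketch`, the use of `ArrayList` for storage, and the return types of add and remove are all assumptions made for illustration.

```java
import java.util.ArrayList;

class SortedListSketch<E extends Comparable<E>> {
    private final ArrayList<E> elements = new ArrayList<>();  // kept in increasing order

    // Index of the first element that is >= x, or size() if there is none (linear scan).
    int getIndex(E x) {
        int i = 0;
        while (i < elements.size() && elements.get(i).compareTo(x) < 0) { i++; }
        return i;
    }

    // Smallest element >= x, or null if there is none.
    E find(E x) {
        int i = getIndex(x);
        return i < elements.size() ? elements.get(i) : null;
    }

    // Inserts x at its sorted position; returns false if x is already present.
    boolean add(E x) {
        int i = getIndex(x);
        if (i < elements.size() && elements.get(i).compareTo(x) == 0) { return false; }
        elements.add(i, x);
        return true;
    }

    // Removes and returns the element equal to x, or null if it is absent.
    E remove(E x) {
        int i = getIndex(x);
        if (i < elements.size() && elements.get(i).compareTo(x) == 0) {
            return elements.remove(i);
        }
        return null;
    }

    int size() { return elements.size(); }
}
```

Only getIndex touches the search strategy, so replacing its body with a binary search leaves find, add, and remove unchanged.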

Suggestion. First implement/test with simple getIndex method, then design/test more sophisticated getIndex implementations

Maxim. Premature optimization is the root of all evil.

• Donald Knuth (sometimes attributed to Tony Hoare)

## ArraySimpleSSet Implementation

See code!

• simple implementation of getIndex first
• then binary search implementation
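For the second step, a binary search version of getIndex might look like the sketch below. It is an assumption about what such a method could be, not the course's code: it returns the index of the first element that is at least x (or the list size if there is none), and the names `BinaryGetIndex`, `sorted`, `lo`, `hi`, and `mid` are hypothetical. It also assumes the list supports constant-time access by index, as `ArrayList` does.

```java
import java.util.List;

class BinaryGetIndex {
    // Index of the first element >= x in a sorted list, or sorted.size() if there is none.
    static <E extends Comparable<E>> int getIndex(List<E> sorted, E x) {
        int lo = 0, hi = sorted.size();       // the answer always lies in [lo, hi]
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;     // avoids int overflow of lo + hi
            if (sorted.get(mid).compareTo(x) < 0) {
                lo = mid + 1;                 // everything up to mid is smaller than x
            } else {
                hi = mid;                     // sorted.get(mid) >= x, so the answer is <= mid
            }
        }
        return lo;
    }
}
```

For the list [2, 3, 5, 7, 11], getIndex returns 2 for x = 5, 3 for x = 6, and 5 for x = 12. Unlike the recursive binarySearch shown earlier, which stops at the largest element not exceeding x, this variant stops at the smallest element that is no smaller than x, which is what the sorted set's find needs.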

## USet and SSet Find Running Times

Compare running times of find between unordered set implementation (ArraySimpleUSet) and ArraySimpleSSet with binary search

• With linear search, the two implementations have almost exactly the same running time because the elements were added in sorted order!