Overview

  1. Sorted Sets
  2. Binary Search
  3. Efficiency of Binary Search
  4. Implementing Binary Search

Last Time

Thought experiment: searching ordered vs unordered sets

  • ordered is easier because you know if desired element is before or after current location
  • order only helps if you have “random access” (e.g., array)

Binary Search Idea

Assume:

  • elements can be accessed by index (e.g., array)
  • elements are sorted in increasing order

To search for a value x

  1. Look at middle index i of array

  2. if arr[i] is smaller than or equal to x, search right half of array (including index i); otherwise search left half

  3. recursively search the half of array determined in step 2

    • compare x to midpoint of sub-interval
    • search left or right half of sub-interval depending on relative value of x and value found

Example: find(51)

       0  1  2  3  4   5   6   7   8   9   10  11  12  13  14  15
	   
arr = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53] 

Binary Search, More Formally

A Single Step:

  • determine next sub-interval to search

Input:

  • x element to be found
  • arr a sorted array storing elements
  • i, j indices with i < j
  • search for x in range arr[i], arr[i+1], ..., arr[j-1]

Next Step:

  • if i = j - 1, determine if x is at arr[i]
  • otherwise pick next interval i', j' to search
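
The single step described above can also be repeated in a loop. The following is a minimal sketch, not the course's implementation: the class name IntervalSearchSketch and the method contains are made up for illustration, and the midpoint is computed as i + (j - i) / 2 as a precaution against integer overflow for very large indices.

```java
final class IntervalSearchSketch {
    // Returns true if x occurs in the sorted array arr.
    // Invariant: if x is in arr at all, it is among arr[i], ..., arr[j-1].
    static boolean contains(int[] arr, int x) {
        if (arr.length == 0) {
            return false; // empty array: nothing to search
        }
        int i = 0;
        int j = arr.length;
        while (j - i > 1) {
            int k = i + (j - i) / 2; // midpoint of the current range
            if (arr[k] <= x) {
                i = k; // next interval is i' = k, j' = j
            } else {
                j = k; // next interval is i' = i, j' = k
            }
        }
        // now i == j - 1, so only arr[i] can hold x
        return arr[i] == x;
    }
}
```

Each pass through the loop is one "single step": it either finishes (when i = j - 1) or picks the next interval i', j'.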

Binary Search Method

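// Search the sorted array arr for x among indices i, i+1, ..., j-1 (requires i < j).
// Returns the one element left when the range has size 1; the caller compares it to x.
static int binarySearch(int[] arr, int x, int i, int j) {
    // base case: only arr[i] remains
    if (j == i + 1) { return arr[i]; }

    // midpoint of the current range
    int k = (i + j) / 2;

    if (arr[k] <= x) {
        // x, if present, is among arr[k], ..., arr[j-1]
        return binarySearch(arr, x, k, j);
    } else {
        // x, if present, is among arr[i], ..., arr[k-1]
        return binarySearch(arr, x, i, k);
    }
}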

Example

static int binarySearch(int[] arr, int x, int i, int j) {
    if (j == i + 1) { return arr[i]; }
    int k = (i + j) / 2;
    if (arr[k] <= x) { return binarySearch(arr, x, k, j); }
    else { return binarySearch(arr, x, i, k); }
}
       0  1  2  3  4   5   6   7   8   9   10  11  12  13  14  15
	   
arr = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53] 

find(51) -> binarySearch(arr, 51, 0, 16)
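
To follow this call by hand, it can help to record every (i, j) pair the recursion visits. The sketch below uses the same recursion as binarySearch, with the else branch recursing on (i, k); the class name BinarySearchTrace and the extra trace parameter are additions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class BinarySearchTrace {
    // Same recursion as binarySearch, but records each (i, j) pair it visits.
    static int search(int[] arr, int x, int i, int j, List<String> trace) {
        trace.add("(" + i + ", " + j + ")");
        if (j == i + 1) {
            return arr[i];
        }
        int k = (i + j) / 2;
        if (arr[k] <= x) {
            return search(arr, x, k, j, trace);
        } else {
            return search(arr, x, i, k, trace);
        }
    }

    public static void main(String[] args) {
        int[] arr = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53};
        List<String> trace = new ArrayList<>();
        int result = search(arr, 51, 0, arr.length, trace);
        System.out.println(trace);  // [(0, 16), (8, 16), (12, 16), (14, 16), (14, 15)]
        System.out.println(result); // 47
    }
}
```

The value returned is 47, which is not 51, so the caller can conclude that 51 is not in the array.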

Efficiency of Binary Search, Size $n$

  • initial call to binarySearch(arr, x, 0, n) has $j - i = n$

  • next recursive call is binarySearch(arr, x, i, j) with $j - i = n / 2$ (assuming $n$ is a power of 2; otherwise sizes are rounded)

  • each subsequent recursive call cuts size of region in half

  • after $k$ recursive calls, $j - i = n / 2^k$

  • value returned when $j - i = 1$

    • $k$ satisfying $n / 2^k = 1$

    • $\implies 2^k = n$

    • $\implies k = \log_2 n$

  • Each recursive call takes $O(1)$ time, so overall running time is $O(\log n)$
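
The count of halving steps can be checked empirically. The sketch below is an illustration under stated assumptions: StepCountSketch and countSteps are hypothetical names, the array is filled with 0, 1, ..., n-1 so that it is sorted, and the search is written as a loop so that the steps are easy to count.

```java
final class StepCountSketch {
    // Counts how many times the search range is halved before one element is left.
    // The array is filled with 0, 1, ..., n-1, so it is sorted.
    static int countSteps(int n, int x) {
        int[] arr = new int[n];
        for (int t = 0; t < n; t++) {
            arr[t] = t;
        }
        int i = 0;
        int j = n;
        int steps = 0;
        while (j - i > 1) {
            int k = i + (j - i) / 2;
            if (arr[k] <= x) {
                i = k;
            } else {
                j = k;
            }
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(countSteps(16, 5));      // 4
        System.out.println(countSteps(1 << 20, 5)); // 20
        System.out.println(countSteps(1000, 999));  // 10
        System.out.println(countSteps(1000, -1));   // 9
    }
}
```

When n is a power of 2 the count is exactly $\log_2 n$; for other n the count depends slightly on x, because the two halves differ in size by one.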

How Fast is $O(\log n)$?

Observe:

  • $2^{10} \approx 1{,}000 \implies \log_2 (1{,}000) \approx 10$
  • $2^{20} \approx 1{,}000{,}000 \implies \log_2 (1{,}000{,}000) \approx 20$
  • $2^{30} \approx 1{,}000{,}000{,}000 \implies \log_2 (1{,}000{,}000{,}000) \approx 30$
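
These approximations can be verified with integer arithmetic. The sketch below assumes a hypothetical helper ceilLog2 that returns the smallest k with $2^k \geq n$; it relies on the standard library method Integer.numberOfLeadingZeros.

```java
final class LogSketch {
    // Smallest k with 2^k >= n, for n >= 1.
    static int ceilLog2(int n) {
        return 32 - Integer.numberOfLeadingZeros(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(ceilLog2(1_000));         // 10
        System.out.println(ceilLog2(1_000_000));     // 20
        System.out.println(ceilLog2(1_000_000_000)); // 30
    }
}
```

So searching a sorted array of a billion elements needs only about 30 comparisons.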

Implementing Binary Search in Java

Sorted Set ADT

  • all of the functionality of a SimpleUSet
    • add, remove, find
  • assumes elements can be compared
    • for any two elements $x, y$, have $x = y$, $x > y$, or $x < y$
  • additional methods
    • findMin
    • findMax
  • slightly different behavior
    • find(x) returns the smallest element y in the set that is no smaller than x (null if no such y)
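
For comparison, Java's built-in java.util.TreeSet offers methods with the same semantics: ceiling(x) behaves like the find(x) described here, and first() and last() play the roles of findMin and findMax. The sketch below is only an analogy using the standard library, not the SimpleSSet from this course; the class name SortedSetSemanticsDemo is made up.

```java
import java.util.Arrays;
import java.util.TreeSet;

final class SortedSetSemanticsDemo {
    // The same sixteen primes used in the binary search example.
    static TreeSet<Integer> primes() {
        return new TreeSet<>(Arrays.asList(
                2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53));
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = primes();
        System.out.println(set.ceiling(51)); // 53: smallest element no smaller than 51
        System.out.println(set.ceiling(53)); // 53: an element is its own ceiling
        System.out.println(set.ceiling(54)); // null: no element is large enough
        System.out.println(set.first());     // 2
        System.out.println(set.last());      // 53
    }
}
```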

The Comparable Interface

To indicate that elements of class E can be compared, E must implement the Comparable<E> interface:

public interface Comparable<T> {
    int compareTo(T o);
}

This interface is built into Java!

Interpretation:

  • x.compareTo(y) < 0 indicates that x is “smaller than” y
  • x.compareTo(y) > 0 indicates that x is “larger than” y
  • x.compareTo(y) == 0 indicates that x is semantically equivalent to y
    • should have x.compareTo(y) == 0 if and only if x.equals(y)
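
As an illustration of these rules, here is a sketch of a small class that implements Comparable. The class Version and its fields major and minor are hypothetical and do not come from the course code. It compares by major number first and by minor number second, and it defines equals (and hashCode) so that compareTo returns 0 exactly when equals returns true.

```java
import java.util.Objects;

final class Version implements Comparable<Version> {
    private final int major;
    private final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public int compareTo(Version other) {
        // compare major numbers first; break ties with minor numbers
        if (major != other.major) {
            return Integer.compare(major, other.major);
        }
        return Integer.compare(minor, other.minor);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Version)) return false;
        Version v = (Version) o;
        return major == v.major && minor == v.minor;
    }

    @Override
    public int hashCode() {
        // equal objects must have equal hash codes
        return Objects.hash(major, minor);
    }
}
```

With this definition, new Version(1, 2).compareTo(new Version(1, 10)) is negative, so version 1.2 counts as "smaller than" version 1.10.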

Example Integer

public class Integer implements Comparable<Integer> {
    private int value;

    @Override
    public int compareTo(Integer x) {
        // compare directly; value - x.value can overflow and give the wrong sign
        return (value < x.value) ? -1 : ((value == x.value) ? 0 : 1);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Integer)) return false;
        Integer x = (Integer) o;
        return (value == x.value);
    }
}
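
One caution when writing compareTo for int values: computing the result by subtraction can overflow and produce the wrong sign. The sketch below demonstrates this with a hypothetical helper compareBySubtraction and contrasts it with the standard library method Integer.compare.

```java
final class CompareOverflowDemo {
    // Comparison by subtraction: gives the wrong sign when the difference overflows.
    static int compareBySubtraction(int a, int b) {
        return a - b;
    }

    public static void main(String[] args) {
        int a = Integer.MIN_VALUE;
        int b = 1;
        System.out.println(compareBySubtraction(a, b) > 0); // true, although a < b
        System.out.println(Integer.compare(a, b) < 0);      // true, the correct sign
    }
}
```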

Sorted Set Interface

public interface SimpleSSet<E extends Comparable<E>> extends SimpleUSet<E>  {

    @Override /* comments explain how find differs from parent method */
    E find(E x);


    E findMin();

    
    E findMax();
}

Implementing SimpleSSet

Question. What data structure should we use to store elements?

How to…

add(x)?

remove(x)?

find(x)?

Common Functionality

Find the index where x would be located

  • unambiguous because set is sorted, elements are unique

Define int getIndex(x) method
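
One possible reading of "the index where x would be located" is the smallest index whose element is no smaller than x, or the number of stored elements if there is none. The sketch below implements that reading with a simple linear scan. It is an assumption for illustration, not the course's ArraySimpleSSet code: the class name LinearIndexSketch and the explicit sorted and size parameters are hypothetical.

```java
final class LinearIndexSketch {
    // Returns the smallest index i with sorted[i] >= x, or size if there is none.
    // Only the first size entries of sorted are in use, in increasing order.
    static <E extends Comparable<E>> int getIndex(E[] sorted, int size, E x) {
        for (int i = 0; i < size; i++) {
            if (sorted[i].compareTo(x) >= 0) {
                return i;
            }
        }
        return size;
    }
}
```

For the contents 2, 3, 5, 7, this getIndex returns 2 for both x = 5 (already present) and x = 4 (where 4 would be inserted).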

How to find(x)?

How to add(x)?
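
One way to answer these questions is sketched below. It is a hedged illustration, not the course's implementation: the class SortedListSetSketch is hypothetical, it stores elements in a java.util.ArrayList rather than a raw array to keep the code short, and the return types of add and remove are assumptions. All three operations start by calling getIndex.

```java
import java.util.ArrayList;
import java.util.List;

final class SortedListSetSketch<E extends Comparable<E>> {
    private final List<E> elements = new ArrayList<>(); // kept in increasing order

    // Smallest index i with elements.get(i) >= x, or size() if there is none.
    private int getIndex(E x) {
        int i = 0;
        while (i < elements.size() && elements.get(i).compareTo(x) < 0) {
            i++;
        }
        return i;
    }

    // Smallest element that is no smaller than x, or null if there is none.
    E find(E x) {
        int i = getIndex(x);
        return (i < elements.size()) ? elements.get(i) : null;
    }

    // Inserts x at its sorted position; returns false if x is already present.
    boolean add(E x) {
        int i = getIndex(x);
        if (i < elements.size() && elements.get(i).compareTo(x) == 0) {
            return false;
        }
        elements.add(i, x); // shifts later elements one place to the right
        return true;
    }

    // Removes x if present; returns the removed element, or null otherwise.
    E remove(E x) {
        int i = getIndex(x);
        if (i < elements.size() && elements.get(i).compareTo(x) == 0) {
            return elements.remove(i);
        }
        return null;
    }
}
```

Because add and remove shift elements, they still take linear time in the worst case, even if getIndex itself is fast.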

Observation

Once getIndex is implemented add, remove, find can be made to work

  • alternative (correct) implementations of getIndex will not affect add/remove/find code

Suggestion. First implement/test with simple getIndex method, then design/test more sophisticated getIndex implementations

Maxim. Premature optimization is the root of all evil.

  • Tony Hoare (popularized by Donald Knuth)

ArraySimpleSSet Implementation

See code!

  • simple implementation of getIndex first
  • then binary search implementation
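
A binary search version of getIndex might look like the sketch below. Note that this is the "lower bound" variant of binary search, which differs slightly from the binarySearch method shown earlier: it returns the smallest index whose element is no smaller than x (or size if there is none), which is what the sorted set's find needs. The class name BinaryIndexSketch and the explicit sorted and size parameters are hypothetical.

```java
final class BinaryIndexSketch {
    // Returns the smallest index i with sorted[i] >= x, or size if there is none.
    // Only the first size entries of sorted are in use, in increasing order.
    static <E extends Comparable<E>> int getIndex(E[] sorted, int size, E x) {
        int lo = 0;    // every index below lo holds an element < x
        int hi = size; // every index from hi on holds an element >= x
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (sorted[mid].compareTo(x) < 0) {
                lo = mid + 1; // answer lies strictly to the right of mid
            } else {
                hi = mid;     // mid itself may be the answer
            }
        }
        return lo;
    }
}
```

Since it returns the same index as a linear scan would, the add, remove, and find code does not need to change when this version is swapped in.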

USet and SSet Find Running Times

Compare running times of find between unordered set implementation (ArraySimpleUSet) and ArraySimpleSSet with binary search

  • With linear search, the two implementations have almost exactly the same running time, because the elements were added in sorted order!

Time to Find: USet vs SSet

Time to Find SSet

Time to Find SSet (Long)