# Lecture 09: Amortized Analysis and Sorted Sets

## Overview

1. Amortized Analysis
2. Sorted Sets

## Last Time

Considered ArraySimpleStack

    public class ArraySimpleStack<E> implements SimpleStack<E> {
        private int capacity;
        private int size = 0;
        private Object[] contents;

        ...

        public void push(E x) {
            if (size == capacity) {
                increaseCapacity();
            }

            contents[size] = x;
            ++size;
        }

## increaseCapacity()

    private void increaseCapacity() {
        // create a new array with larger capacity
        Object[] bigContents = new Object[2 * capacity];

        // copy contents to bigContents
        for (int i = 0; i < capacity; ++i) {
            bigContents[i] = contents[i];
        }

        // set contents to refer to the new array
        contents = bigContents;

        // update this.capacity accordingly
        capacity = 2 * capacity;
    }

## Puzzle

What is the running time of the following loop?

    SimpleStack<Integer> stk = new ArraySimpleStack<Integer>();
    for (int i = 1; i <= n; i++) {
        stk.push(i);
    }

## Assessment

    SimpleStack<Integer> stk = new ArraySimpleStack<Integer>();
    for (int i = 1; i <= n; i++) {
        stk.push(i);
    }

• Worst-case running time of a single push is $O(n)$
• Naively, $n$ pushes have running time at most $n \cdot O(n) = O(n^2)$
• But the vast majority of calls to push are performed in $O(1)$ time
• When we call push $n$ times, the empirical running time looks like $O(n)$, not $O(n^2)$
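The empirical observation can be checked by counting work rather than timing it. The sketch below is a hypothetical instrumented stack (the name `CountingStack` and the `copies` counter are not part of the course code) that uses the same doubling strategy as `ArraySimpleStack`, assumes an initial capacity of 1, and records how many elements the resizes copy in total.

```java
// Hypothetical instrumented stack: same doubling strategy as ArraySimpleStack,
// plus a counter for element copies (the counter is an addition for illustration).
class CountingStack {
    private int capacity = 1;            // assumption: initial capacity is 1
    private int size = 0;
    private Object[] contents = new Object[capacity];
    private long copies = 0;             // total elements copied by all resizes

    void push(Object x) {
        if (size == capacity) {
            increaseCapacity();
        }
        contents[size] = x;
        ++size;
    }

    private void increaseCapacity() {
        Object[] bigContents = new Object[2 * capacity];
        for (int i = 0; i < capacity; ++i) {
            bigContents[i] = contents[i];
            ++copies;                    // count each copied element
        }
        contents = bigContents;
        capacity = 2 * capacity;
    }

    long copies() {
        return copies;
    }

    // Total number of copies performed while pushing 1..n onto a fresh stack.
    static long copiesForPushes(int n) {
        CountingStack stk = new CountingStack();
        for (int i = 1; i <= n; i++) {
            stk.push(i);
        }
        return stk.copies();
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 1_000_000; n *= 10) {
            long c = copiesForPushes(n);
            System.out.println("n = " + n + ", copies = " + c
                    + ", copies per push = " + (double) c / n);
        }
    }
}
```

Under these assumptions the number of copies per push stays below 2 for every `n`, which is consistent with total work growing like $O(n)$ rather than $O(n^2)$.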

# Amortized Analysis

## Convention

“cost” of an operation $\approx$ running time of operation

## A More Refined Analysis

Idea. Don’t look at worst-case cost of each operation individually

• instead look at worst-case cost of any sequence of operations

• amortized cost is the average cost per operation of any such sequence
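One way to write this down (the symbols $m$ and $\mathrm{op}_i$ are notation introduced here for illustration): for a sequence $\mathrm{op}_1, \ldots, \mathrm{op}_m$, the average cost per operation is

$$\frac{1}{m} \sum_{i=1}^{m} \mathrm{cost}(\mathrm{op}_i)$$

and the amortized cost is the largest value this average can take over all sequences. As a sketch for $n \geq 2$ pushes, assuming the capacity starts at 1, each write costs 1 unit, and each copied element costs 1 unit: the resizes copy $1, 2, 4, \ldots, 2^k$ elements, where $2^k < n$, so

$$\sum_{i=1}^{n} \mathrm{cost}(\texttt{push}_i) = n + \sum_{j=0}^{k} 2^j = n + 2^{k+1} - 1 < n + 2n = 3n$$

and the average cost per push is less than $3 = O(1)$.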

## Amortized Analysis, IRL

Cost of living:

• Rent = \$1,800 on first of month
• Groceries = \$100 each week
• Lunch: \$5

Income:

• I get paid \$100 daily (tax free)
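As a rough check on these figures (assuming a 30-day month and that the \$5 lunch is bought every day, neither of which the list states), the average daily cost works out to

$$\frac{1800}{30} + \frac{100}{7} + 5 \approx 60 + 14.29 + 5 = 79.29 \text{ dollars per day}$$

which is below the \$100 daily income, even though the rent alone exceeds a single day's pay.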

Question:

$\implies$ amortized cost of living is (at most) \$100 / day

## Example push(x)

Assume initially size = capacity = 1

    for (int i = 1; i <= n; i++) {
        stk.push(i);
    }

What are costs of operations?

## Banker’s View of Amortized Analysis

• each operation $\mathrm{op}$ has associated cost, $\mathrm{cost}(\mathrm{op})$
• have an account $A$ with balance $\mathrm{bal}(A)$
• must maintain $\mathrm{bal}(A) \geq 0$
• amortized cost of $\mathrm{op}$ is:

$$\mathrm{ac}(\mathrm{op}) = \mathrm{cost}(\mathrm{op}) + \mathrm{bal}(A') - \mathrm{bal}(A)$$

• $A$ = account before $\mathrm{op}$, $A'$ = account after

## Analysis of push

    public void push(E x) {
        if (size == capacity) {
            increaseCapacity();
        }

        contents[size] = x;
        ++size;
    }

• cost of increaseCapacity when size $= n$ is $C_n$
• new capacity is $2n$

What is $C_n$ in big O notation?

## increaseCapacity() Code

    private void increaseCapacity() {
        // create a new array with larger capacity
        Object[] bigContents = new Object[2 * capacity];

        // copy contents to bigContents
        for (int i = 0; i < capacity; ++i) {
            bigContents[i] = contents[i];
        }

        // set contents to refer to the new array
        contents = bigContents;

        // update this.capacity accordingly
        capacity = 2 * capacity;
    }

## Accounting for push

    public void push(E x) {
        if (size == capacity) {
            increaseCapacity();
        }

        contents[size] = x;
        ++size;
    }

Suppose current capacity is $n$

• last resize at $n / 2$

Each push until next resize

1. pay $\mathrm{cost}(\texttt{push})$
2. add money to account

Question. How much to add?

## Question

At next increaseCapacity() call, what is account balance?

How to pay $C_n$ for increaseCapacity()?

## Final Analysis

If $n/2$ was last resize, each push until size is $n$:

1. pay cost of push
2. add $C_n / (n/2) = 2 C_n / n$ to account

On push when size is $n$

1. pay cost of push
2. remove $C_n$ from $A$ to pay for increaseCapacity()

In both scenarios

$$\mathrm{ac}(\mathrm{op}) = \mathrm{cost}(\mathrm{op}) + \mathrm{bal}(A') - \mathrm{bal}(A) = O(1)$$

## So We Should Have Expected

## More Generally

Amortized complexity is a measure of average cost of operations when averaged over any sequence of operations

Moral. Even if individual operations can be expensive, if expensive operations are infrequent, then a data structure may still be efficient.

# Sorted Sets

## Thought Experiment

• Oxford English Dictionary (OED)
  • contains 300,000 entries
  • 20 volumes, 21,000+ pages
  • includes history and earliest known usage of words
• Complete Works of Shakespeare
  • 1 volume
  • 1,300 pages

Question. I read that Shakespeare coined the term indistinguishable. Would it be faster to search OED or Shakespeare to see if Shakespeare used indistinguishable?

## Another Question

What makes searching a dictionary preferable to searching a novel for a word?

## Previously

You’ve considered 2 set implementations

1. LinkedSimpleUSet
  • elements stored in order first read
2. MTFSimpleUSet
  • elements stored in order last accessed

Time to access an element is proportional to its position in list

• this is the best we can do for a linked list

## Faster Finding

In dictionary example:

• words are sorted alphabetically
• to search, start at middle of book
• because of sorting, know to jump forward or backward

## Data Structure?

A set that stores elements in sorted order

• assume elements can be sorted in some way

What data structure allows us to “jump” as in searching a dictionary?

## Formalizing an ADT

Sorted Set ADT

• all of the functionality of a SimpleUSet
  • add, remove, find
• assumes elements can be compared
  • for any two elements $x, y$, have $x = y$, $x > y$, or $x < y$
• findMin
• findMax
• slightly different behavior
  • find(x) returns the smallest element y in the set that is no smaller than x (null if no such y)
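For comparison, Java's built-in `java.util.TreeSet` offers analogous operations. The sketch below is only an illustration with a made-up sample set (the class name `SortedSetDemo` is hypothetical); it shows both `ceiling` (smallest element at least x) and `floor` (largest element at most x), since which of the two corresponds to the course's `find` is an assumption to check against the interface documentation.

```java
import java.util.TreeSet;

// Illustration using Java's built-in sorted set; not the course's SimpleSSet.
class SortedSetDemo {
    // Builds a small sample set; elements are inserted out of order on purpose.
    static TreeSet<Integer> sample() {
        TreeSet<Integer> set = new TreeSet<>();
        for (int v : new int[] {40, 10, 30, 20}) {
            set.add(v);
        }
        return set;
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = sample();
        System.out.println(set.first());      // minimum element: 10
        System.out.println(set.last());       // maximum element: 40
        System.out.println(set.ceiling(25));  // smallest element >= 25: 30
        System.out.println(set.floor(25));    // largest element <= 25: 20
        System.out.println(set.ceiling(45));  // no element >= 45: null
    }
}
```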

## The Comparable Interface

To indicate that elements of class E can be compared, E must implement the Comparable<E> interface:

    public interface Comparable<T> {
        int compareTo(T o);
    }

This interface is built into Java (java.lang.Comparable)!

Interpretation:

• x.compareTo(y) < 0 indicates that x is “smaller than” y
• x.compareTo(y) > 0 indicates that x is “larger than” y
• x.compareTo(y) == 0 indicates that x is semantically equivalent to y
• should have x.compareTo(y) == 0 if and only if x.equals(y)
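As a minimal sketch of these rules for a user-defined type, the hypothetical `Version` class below (not from the course code) orders values by major number and then by minor number, and keeps `compareTo`, `equals`, and `hashCode` consistent with one another.

```java
import java.util.Arrays;

// Hypothetical example class: versions ordered by major, then minor.
final class Version implements Comparable<Version> {
    private final int major;
    private final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public int compareTo(Version other) {
        // Integer.compare avoids the overflow risk of subtracting two ints
        if (major != other.major) {
            return Integer.compare(major, other.major);
        }
        return Integer.compare(minor, other.minor);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Version)) return false;
        Version v = (Version) o;
        return major == v.major && minor == v.minor;
    }

    @Override
    public int hashCode() {
        // equal versions must have equal hash codes
        return 31 * major + minor;
    }

    @Override
    public String toString() {
        return major + "." + minor;
    }

    public static void main(String[] args) {
        Version[] vs = { new Version(2, 1), new Version(1, 10), new Version(1, 2) };
        Arrays.sort(vs);  // uses compareTo
        System.out.println(Arrays.toString(vs));  // prints [1.2, 1.10, 2.1]
    }
}
```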

## Example Integer

    public class Integer implements Comparable<Integer> {
        private int value;

        @Override
        public int compareTo(Integer x) {
            // compare directly: value - x.value can overflow for large magnitudes
            if (value < x.value) return -1;
            if (value > x.value) return 1;
            return 0;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Integer)) return false;
            Integer x = (Integer) o;
            return (value == x.value);
        }
    }

## Sorted Set Interface

    public interface SimpleSSet<E extends Comparable<E>> extends SimpleUSet<E> {

        @Override /* comments explain how find differs from parent method */
        E find(E x);

        E findMin();

        E findMax();
    }

## Implementing SimpleSSet

Question. What data structure should we use to store elements?

remove(x)?

find(x)?

## Common Functionality

Find the index where x would be located

• unambiguous because set is sorted, elements are unique

Define int getIndex(x) method

How to find(x)?
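One possible answer, as a sketch rather than the course's implementation: let getIndex return the first position whose element is at least x, found here by a simple linear scan. The class name `LinearIndexDemo`, the static-method form, and the explicit `size` parameter are assumptions made to keep the example self-contained, and `find` follows the "smallest element at least x" convention, which is also an assumption.

```java
// Sketch: linear-scan getIndex over a sorted array, plus a find built on it.
class LinearIndexDemo {
    // Returns the smallest index i with sorted[i] >= x,
    // or size if every stored element is smaller than x.
    static <E extends Comparable<E>> int getIndex(E[] sorted, int size, E x) {
        for (int i = 0; i < size; ++i) {
            if (sorted[i].compareTo(x) >= 0) {
                return i;
            }
        }
        return size;
    }

    // Assumed convention: returns the smallest element >= x, or null if none.
    static <E extends Comparable<E>> E find(E[] sorted, int size, E x) {
        int i = getIndex(sorted, size, x);
        return (i < size) ? sorted[i] : null;
    }

    public static void main(String[] args) {
        Integer[] a = {10, 20, 30, 40};
        System.out.println(getIndex(a, a.length, 25));  // 2
        System.out.println(find(a, a.length, 25));      // 30
        System.out.println(find(a, a.length, 50));      // null
    }
}
```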

## Observation

Once getIndex is implemented, add, remove, and find can be made to work

• alternative (correct) implementations of getIndex will not affect add/remove/find code
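To illustrate the separation, here is a minimal sketch of an array-backed sorted set whose add and remove touch the array only through getIndex. The class name `TinySortedSet`, the boolean return values, and the initial capacity of 4 are assumptions for this example and may differ from the course's `ArraySimpleSSet`.

```java
import java.util.Arrays;

// Sketch: add/remove written against getIndex, which could later be swapped
// for a faster search without changing add or remove.
class TinySortedSet<E extends Comparable<E>> {
    private Object[] contents = new Object[4];  // assumed initial capacity
    private int size = 0;

    @SuppressWarnings("unchecked")
    private E get(int i) {
        return (E) contents[i];
    }

    // Simple linear version: first index whose element is >= x (size if none).
    protected int getIndex(E x) {
        int i = 0;
        while (i < size && get(i).compareTo(x) < 0) {
            ++i;
        }
        return i;
    }

    // Inserts x in sorted position; returns false if x is already present.
    boolean add(E x) {
        int i = getIndex(x);
        if (i < size && get(i).compareTo(x) == 0) return false;
        if (size == contents.length) {
            contents = Arrays.copyOf(contents, 2 * size);  // double the capacity
        }
        // shift elements i..size-1 one slot to the right
        System.arraycopy(contents, i, contents, i + 1, size - i);
        contents[i] = x;
        ++size;
        return true;
    }

    // Removes x if present; returns whether anything was removed.
    boolean remove(E x) {
        int i = getIndex(x);
        if (i == size || get(i).compareTo(x) != 0) return false;
        // shift elements i+1..size-1 one slot to the left
        System.arraycopy(contents, i + 1, contents, i, size - i - 1);
        contents[--size] = null;  // clear the now-unused last slot
        return true;
    }

    int size() {
        return size;
    }

    @Override
    public String toString() {
        return Arrays.toString(Arrays.copyOf(contents, size));
    }

    public static void main(String[] args) {
        TinySortedSet<Integer> s = new TinySortedSet<>();
        for (int v : new int[] {30, 10, 20, 10}) s.add(v);
        System.out.println(s);  // prints [10, 20, 30]
    }
}
```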

Suggestion. First implement/test with simple getIndex method, then design/test more sophisticated getIndex implementations

Maxim. Premature optimization is the root of all evil.

• Tony Hoare (popularized by Donald Knuth)

See code!

## More Efficient getIndex

How to search a sorted array like a dictionary?

## Recursive getIndex()

• See: ArraySimpleSSet.java
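For readers without the course file at hand, the following is a minimal sketch of a recursive binary-search getIndex. It is not the contents of ArraySimpleSSet.java; the class name `BinaryIndexDemo`, the static-method form, and the `lo`/`hi` parameters are assumptions. It returns the first index whose element is at least x, halving the search range on each call just as one halves a dictionary.

```java
// Sketch: recursive binary search for the first index with sorted[i] >= x.
class BinaryIndexDemo {
    static <E extends Comparable<E>> int getIndex(E[] sorted, int size, E x) {
        return getIndex(sorted, x, 0, size);
    }

    // Invariant: every element before lo is < x,
    // and every element at or after hi is >= x.
    private static <E extends Comparable<E>> int getIndex(E[] sorted, E x, int lo, int hi) {
        if (lo == hi) {
            return lo;                    // range is empty: lo is the answer
        }
        int mid = lo + (hi - lo) / 2;     // midpoint without int overflow
        if (sorted[mid].compareTo(x) < 0) {
            return getIndex(sorted, x, mid + 1, hi);  // answer lies right of mid
        }
        return getIndex(sorted, x, lo, mid);          // answer is mid or to its left
    }

    public static void main(String[] args) {
        Integer[] a = {10, 20, 30, 40};
        System.out.println(getIndex(a, a.length, 25));  // 2
        System.out.println(getIndex(a, a.length, 10));  // 0
        System.out.println(getIndex(a, a.length, 50));  // 4
    }
}
```

Each recursive call roughly halves `hi - lo`, so the recursion depth grows logarithmically with the number of stored elements.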

## USet and SSet Find Running Times

Compare running times of find between unordered set implementation (ArraySimpleUSet) and ArraySimpleSSet with binary search

• With linear search, the two implementations have almost exactly the same running time because the elements were added in sorted order!

## Next Time

• More discussion of recursion
• More detailed analysis of binary search