Lecture 08: Asymptotic and Amortized Analysis

Announcement

No programming assignment next week
Midterm exam next week

Format:

  • Take home, open note/open book
    • no “interactive” resources
  • $\sim 4$ questions
    • not longer than in-class exam

Overview

  1. Big O Notation
  2. Amortized Analysis

Last Time

Consider running time $\sim$ # elementary operations

  • cannot know the cost of each individual operation
  • trend (running time vs instance size) should not depend on cost of individual operations
  • measure of running time that:
    • ignores constant factors
    • ignores lower order terms

Big O Notation

$f, g$ are functions from natural numbers $\mathbf{N}$ to reals $\mathbf{R}$

E.g.,

  • $n$ = input size
  • $f(n)$ = worst case running time on a particular computer or number of elementary operations

Informally: write $f = O(g)$ to mean “$f$ scales no faster than $g$”

  • much weaker than $f \leq g$ because scaling ignores constants

Big O, Formally

Definition. Let $f, g : \mathbf{N} \to \mathbf{R}^+$. We write $f = O(g)$ (read: “$f$ is (big) O of $g$”) if there exist a natural number $N$ and a constant $C$ in $\mathbf{R}^+$ such that for all $n \geq N$, we have

$$f(n) \leq C \cdot g(n)$$
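A proposed pair $(C, N)$ can be spot-checked numerically. The sketch below is only a sanity check on a finite range, not a proof, since the definition quantifies over all $n \geq N$. The class name, the method name, and the functions $f(n) = 3n + 5$ and $g(n) = n$ are illustrative choices and do not come from the lecture.

```java
import java.util.function.LongUnaryOperator;

/** Numerically spot-checks a proposed Big O witness (C, N) on a finite range. */
class BigOWitnessCheck {

    /**
     * Returns true if f(n) <= c * g(n) for every n with bigN <= n <= limit.
     * Passing this check is evidence only; it does not prove f = O(g).
     */
    static boolean holdsOnRange(LongUnaryOperator f, LongUnaryOperator g,
                                long c, long bigN, long limit) {
        for (long n = bigN; n <= limit; n++) {
            if (f.applyAsLong(n) > c * g.applyAsLong(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        LongUnaryOperator f = n -> 3 * n + 5; // illustrative f, not from the lecture
        LongUnaryOperator g = n -> n;         // illustrative g

        // C = 4, N = 5: 3n + 5 <= 4n holds exactly when n >= 5.
        System.out.println(holdsOnRange(f, g, 4, 5, 1_000_000));
        // C = 4, N = 1: violated at n = 1, since f(1) = 8 > 4.
        System.out.println(holdsOnRange(f, g, 4, 1, 1_000_000));
    }
}
```

The second call shows why the threshold $N$ is part of the definition: the same constant $C$ can fail for small $n$ and still witness $f = O(g)$.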

Example 1: List Add at Front

Example 2: List Add at Random

Properties of $O$

  1. if $f(n) \leq c$ for all $n$ ($c$ constant), then $f = O(1)$

  2. if $f(n) \leq g(n)$ for all $n$, then $f = O(g)$

  3. if $f = O(g)$, then for all constants $c$, $c f = O(g)$

  4. if $f, h = O(g)$ then $f + h = O(g)$

  5. if $f_1 = O(g_1)$ and $f_2 = O(g_2)$, then $f_1 \cdot f_2 = O(g_1 \cdot g_2)$

Consequence:

  • if $a \leq b$, then $n^a = O(n^b)$

Example

Show: $10 n^2 + 100 n + 1000 = O(n^2)$
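One possible way to verify this directly from the definition is sketched here. The constants are one convenient choice, not the only one. For $n \geq 1$ we have $n \leq n^2$ and $1 \leq n^2$, so

$$10 n^2 + 100 n + 1000 \leq 10 n^2 + 100 n^2 + 1000 n^2 = 1110\, n^2$$

Hence the definition is satisfied with $N = 1$ and $C = 1110$. The same conclusion also follows from the properties above: each term is $O(n^2)$ by the consequence together with property 3, and the sum is $O(n^2)$ by property 4.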

Running Time Analysis

Assumptions

  1. primitive operations take time $O(1)$

  2. initializing objects of size $n$ (primitive data types) takes time $O(n)$

What is Running Time of add(i, x)?

    public void add(int i, E x) {
        if (i < 0 || i > size) {
            throw new IndexOutOfBoundsException();
        }

        Node<E> nd = new Node<E>();
        nd.value = x;

        if (i == 0) {
            nd.next = this.head;
            this.head = nd;
        } else {
            Node<E> pred = getNode(i - 1);
            Node<E> succ = pred.next;
            pred.next = nd;
            nd.next = succ;
        }

        ++size;
    }

Running Time of getNode(i)?

    private Node<E> getNode(int i) {
        // check if i is a valid index
        if (i < 0 || i >= size) return null;

        Node<E> cur = head;

        // find the i-th successor of the head
        for (int j = 0; j < i; ++j) {
            cur = cur.next;
        }

        return cur;
    }
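One way to answer both questions, using the assumption that primitive operations take time $O(1)$ and writing $n$ for the current size of the list (the notation $T_{\cdot}$ is introduced here for illustration): the loop in getNode(i) runs exactly $i$ times, and everything else is a constant number of primitive operations, so

$$T_{\texttt{getNode}}(i) = O(1) + i \cdot O(1) = O(1 + i)$$

For add(i, x), the case $i = 0$ performs no traversal and takes $O(1)$ time. For $i \geq 1$,

$$T_{\texttt{add}}(i) = O(1) + T_{\texttt{getNode}}(i - 1) = O(1 + i)$$

Since $i \leq n$, the worst case running time of add is $O(n)$, while adding at the front is $O(1)$.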

What About ArraySimpleStack?

    public class ArraySimpleStack<E> implements SimpleStack<E> {
        private int capacity;
        private int size = 0;
        private Object[] contents;

        ...

        public void push(E x) {
            if (size == capacity) {
                increaseCapacity();
            }

            contents[size] = x;
            ++size;
        }

increaseCapacity()?

    private void increaseCapacity() {
        // create a new array with larger capacity
        Object[] bigContents = new Object[2 * capacity];

        // copy contents to bigContents
        for (int i = 0; i < capacity; ++i) {
            bigContents[i] = contents[i];
        }

        // set contents to refer to the new array
        contents = bigContents;

        // update this.capacity accordingly
        capacity = 2 * capacity;
    }

Puzzle

What is the running time of

    SimpleStack<Integer> stk = new ArraySimpleStack<Integer>();
    for (int i = 1; i <= n; i++) {
        stk.push(i);
    }

Does the analysis look right?

Time per Operation

Ignore Outliers

Assessment

    SimpleStack<Integer> stk = new ArraySimpleStack<Integer>();
    for (int i = 1; i <= n; i++) {
        stk.push(i);
    }

  • Worst case running time of push is $O(n)$
    • $n$ pushes have running time $n \cdot O(n) = O(n^2)$
  • But vast majority of calls to push are performed in $O(1)$ time
  • When we call push $n$ times, the empirical running time looks like $O(n)$, not $O(n^2)$.
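The gap between the $O(n^2)$ bound and the observed behavior can be made concrete by counting how many element copies the resizes perform. The class below is a minimal stand-in for ArraySimpleStack, not the course implementation: the name CopyCountingStack, the copy counter, and the initial capacity of 1 are assumptions made for this sketch, since the lecture elides the constructor.

```java
/** A stripped-down doubling array stack that counts element copies during resizes. */
class CopyCountingStack {
    private int capacity = 1;                 // assumed initial capacity
    private int size = 0;
    private Object[] contents = new Object[capacity];
    private long copies = 0;                  // total elements copied by all resizes

    void push(Object x) {
        if (size == capacity) {
            increaseCapacity();
        }
        contents[size] = x;
        ++size;
    }

    private void increaseCapacity() {
        Object[] bigContents = new Object[2 * capacity];
        for (int i = 0; i < capacity; ++i) {
            bigContents[i] = contents[i];
            ++copies;                         // one unit of resize work
        }
        contents = bigContents;
        capacity = 2 * capacity;
    }

    /** Pushes 1..n onto a fresh stack and returns the number of copies performed. */
    static long copiesForPushes(int n) {
        CopyCountingStack stk = new CopyCountingStack();
        for (int i = 1; i <= n; i++) {
            stk.push(i);
        }
        return stk.copies;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 1000, 100000}) {
            System.out.println(n + " pushes: " + copiesForPushes(n) + " element copies");
        }
    }
}
```

Under these assumptions the resizes copy 1, 2, 4, ... elements, so the total number of copies stays below $2n$ rather than growing like $n^2$.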

Amortized Analysis

Convention

“cost” of an operation $\approx$ running time of operation

A More Refined Analysis

Idea. Don’t look at worst-case cost of each operation individually

  • instead look at worst-case cost of any sequence of operations

  • amortized cost is the average cost per operation of any such sequence
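As a worked instance, consider $n \geq 2$ pushes onto an initially empty array stack. The following assumptions are made for illustration and are not stated in the lecture: the initial capacity is 1, writing or copying one element costs 1 unit, and the capacity doubles on each resize. Resizes then happen at capacities $1, 2, 4, \ldots, 2^k$, where $2^k < n$, and a resize at capacity $c$ copies $c$ elements. The total cost of the sequence is

$$\sum_{i=1}^{n} \mathrm{cost}(\mathrm{push}_i) = n + \sum_{j=0}^{k} 2^j = n + 2^{k+1} - 1 < n + 2n = 3n$$

so the average cost per push is less than $3$, which is $O(1)$, even though a single push can cost as much as $n$ units.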

Amortized Analysis, IRL

Cost of living:

  • Rent = $1,800 on first of month
  • Groceries = $100 each week
  • Lunch: $5

Income:

  • I get paid daily $100 (tax free)

Question:

  • Some days, I need to pay $1,905… how can I afford to live on $100 a day?!?

Banker’s View

Open a bank account!

Each day, can do:

  1. pay expense out of pocket (from day’s pay)
  2. deposit money into bank account
  3. withdraw money from bank account to pay expense

I can afford to live off $100 a day if every day I can pay that day’s expenses and maintain a non-negative bank account balance

$\implies$ amortized cost of living is (at most) $100 / day
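The condition on the balance can be tested with a small simulation. The sketch below makes several assumptions that go beyond the lecture: months are 30 days long, rent falls on the first day of each month, groceries are bought every 7 days starting on day 1, and lunch is bought every day. All names are hypothetical.

```java
/** Simulates the cost-of-living example with a daily income and a bank account. */
class CostOfLivingDemo {
    static final int DAILY_PAY = 100;

    /** Expenses due on the given day (days are numbered from 1). */
    static int expensesOn(int day) {
        int cost = 5;                        // lunch, assumed to be bought every day
        if (day % 7 == 1) cost += 100;       // groceries, assumed weekly from day 1
        if (day % 30 == 1) cost += 1800;     // rent, assuming 30-day months
        return cost;
    }

    /** Returns the lowest balance seen over the given number of days. */
    static int minBalance(int startingBalance, int days) {
        int balance = startingBalance;
        int lowest = balance;
        for (int day = 1; day <= days; day++) {
            // deposit the day's pay, then pay the day's expenses from the account
            balance += DAILY_PAY - expensesOn(day);
            lowest = Math.min(lowest, balance);
        }
        return lowest;
    }

    public static void main(String[] args) {
        System.out.println("lowest balance, starting from 0: " + minBalance(0, 360));
        System.out.println("lowest balance, starting from 1805: " + minBalance(1805, 360));
    }
}
```

In this model the only day on which the account would be overdrawn from a zero start is the very first one, when the $1,905 bill arrives before anything has been saved. With enough initial savings to cover that day, the balance never becomes negative, which is the sense in which the amortized cost is at most $100 per day.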

Example push(x)

Assume initially size = capacity = 1

	for (int i = 1; i <= n; i++) {
	    stk.push(i);
	}

What are costs of operations?

Banker’s View of Amortized Analysis

  • each operation $\mathrm{op}$ has associated cost, $\mathrm{cost}(\mathrm{op})$
  • have an account $A$ with balance $\mathrm{bal}(A)$
  • amortized cost of $\mathrm{op}$ is:
$$\mathrm{ac}(\mathrm{op}) = \mathrm{cost}(\mathrm{op}) + \mathrm{bal}(A') - \mathrm{bal}(A)$$
  • $A$ = account before $\mathrm{op}$, $A'$ = account after

Analysis of push

    public void push(E x) {
        if (size == capacity) {
            increaseCapacity();
        }

        contents[size] = x;
        ++size;
    }

  • cost of increaseCapacity when size $= n$ is $C_n$
    • new capacity is $2 n$

What is $C_n$ in big O notation?

increaseCapacity() Code

    private void increaseCapacity() {
        // create a new array with larger capacity
        Object[] bigContents = new Object[2 * capacity];

        // copy contents to bigContents
        for (int i = 0; i < capacity; ++i) {
            bigContents[i] = contents[i];
        }

        // set contents to refer to the new array
        contents = bigContents;

        // update this.capacity accordingly
        capacity = 2 * capacity;
    }

Accounting for push

    public void push(E x) {
        if (size == capacity) {
            increaseCapacity();
        }

        contents[size] = x;
        ++size;
    }

Suppose current capacity is $n$

  • last resize at $n / 2$

Each push until next resize

  1. pay $\mathrm{cost}(\texttt{push})$
  2. add money to account

Question. How much to add?

Question

At next increaseCapacity() call, what is account balance?

How to pay $C_n$ for increaseCapacity()?

Final Analysis

If the last resize was at size $n/2$, then on each push until the size is $n$:

  1. pay cost of push
  2. add $C_n / (n/2) = 2 C_n / n$ to account

On push when size is $n$

  1. pay cost of push
  2. remove $C_n$ from $A$ to pay for increaseCapacity()

In both scenarios

$$\mathrm{ac}(\mathrm{op}) = \mathrm{cost}(\mathrm{op}) + \mathrm{bal}(A') - \mathrm{bal}(A) = O(1)$$
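The bookkeeping can be checked mechanically. The sketch below fixes the amortized cost of every push to a constant and uses the formula above to update the balance, then reports the lowest balance seen. It rests on assumptions that are not part of the lecture: writing or copying one element costs 1 unit (so $C_n = n$), and the same amount is charged on every push, including the push that triggers the resize, which is a slight variant of the scheme above. The names BankerPushDemo and minBalance are hypothetical.

```java
/** Banker's-method bookkeeping for repeated push on a doubling array stack. */
class BankerPushDemo {

    /**
     * Simulates n pushes, charging a fixed amortized cost per push, and returns
     * the lowest account balance observed (including the initial balance of 0).
     * Cost model (an assumption): 1 unit per element written or copied.
     */
    static long minBalance(int n, long amortizedCost) {
        long capacity = 1;
        long size = 1;        // the lecture's starting point: size = capacity = 1
        long balance = 0;
        long lowest = 0;

        for (int i = 0; i < n; i++) {
            long cost = 1;                    // contents[size] = x
            if (size == capacity) {
                cost += capacity;             // increaseCapacity() copies `capacity` elements
                capacity = 2 * capacity;
            }
            ++size;

            // Rearranged from ac(op) = cost(op) + bal(A') - bal(A).
            balance += amortizedCost - cost;
            lowest = Math.min(lowest, balance);
        }
        return lowest;
    }

    public static void main(String[] args) {
        System.out.println("charge 3 per push, lowest balance: " + minBalance(1_000_000, 3));
        System.out.println("charge 2 per push, lowest balance: " + minBalance(1_000_000, 2));
    }
}
```

Under this cost model a charge of 3 units per push keeps the balance non-negative, so every resize is paid for out of earlier deposits, while a charge of 2 units is not enough.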

So We Should Have Expected
