# Amortized Analysis

Understanding the running time of a sequence of operations

In *Asymptotic Analysis and Big O Notation*, we introduced a method of quantifying the running time of a procedure in terms of its worst-case performance on all inputs of a given size. Such an analysis is helpful, for example, in comparing the relative performance of two procedures that perform the same task. Yet in many instances, the *worst-case* running time of a procedure does not accurately reflect the true running time in practice. In particular, in the study of data structures, we are often interested in the time to perform a *sequence* of operations, rather than the worst-case time to perform a single operation in isolation. In this note, we describe a method of analysis called *amortized analysis* that considers the running time of a sequence of operations with the aim of understanding the average running time of the operations when averaged over the entire sequence.

For a concrete example, consider the following two implementations of the `SimpleStack` interface:

Both implementations use an array, `Object[] contents`, to store the contents of the stack. In both cases, when the array is full, a call to the `push` method copies `contents` to a larger array, thereby increasing the capacity of the data structure. The only difference between the two implementations is how this capacity increase is performed. `ArraySimpleStackOne` simply increases the capacity by one to make room for the single new item being pushed onto the stack. Here is the code for the `increaseCapacity()` method for `ArraySimpleStackOne`:

```java
private void increaseCapacity() {
    // create a new array with larger capacity
    Object[] bigContents = new Object[capacity + 1];
    // copy contents to bigContents
    for (int i = 0; i < capacity; ++i) {
        bigContents[i] = contents[i];
    }
    // set contents to refer to the new array
    contents = bigContents;
    // update this.capacity accordingly
    capacity = capacity + 1;
}
```

On the other hand, `ArraySimpleStackTwo` *doubles* the capacity of `contents` each time the array needs to be enlarged:

```java
private void increaseCapacity() {
    // create a new array with larger capacity
    Object[] bigContents = new Object[2 * capacity];
    // copy contents to bigContents
    for (int i = 0; i < capacity; ++i) {
        bigContents[i] = contents[i];
    }
    // set contents to refer to the new array
    contents = bigContents;
    // update this.capacity accordingly
    capacity = 2 * capacity;
}
```

If the stack has size \(n\), both implementations of `increaseCapacity()` have running time \(O(n)\).

The `push` methods of the two implementations of `SimpleStack` are identical:

```java
public void push(E x) {
    if (size == capacity) {
        increaseCapacity();
    }
    contents[size] = x;
    ++size;
}
```

Note that the running time of this method is \(O(1)\) if `size != capacity` (i.e., the array does not need to be resized), but is \(O(n)\) when a resize occurs.

Consider now the running time of pushing \(n\) elements to a stack. For example, we might have:

```java
SimpleStack<Integer> stack;
...
for (int i = 0; i < n; ++i) {
    stack.push(someValue);
    someValue = nextValue();
}
```

A straightforward application of asymptotic analysis shows that for both implementations of `SimpleStack` above, the running time of the code snippet above is \(O(n^2)\) (assuming `nextValue()` has running time \(O(1)\)): the `stack.push(...)` method runs in time \(O(n)\), and this method is called \(n\) times in the loop. Yet running the same program with the two implementations of `increaseCapacity()` can give drastically different running times. Here are some example running times of the same program building a stack of size \(n\) using the two different `SimpleStack` implementations:

Here are the running times just for `ArraySimpleStackTwo`, to get a better picture of what is happening there:
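Wall-clock times depend on the machine, but the gap between the two strategies can also be seen by counting the element copies each growth strategy performs. The harness below is a sketch (the class and method names are illustrative, not part of the `SimpleStack` code): growing by one copies \(1 + 2 + \cdots + (n-1) \approx n^2/2\) elements in total, while doubling copies fewer than \(2n\).

```java
public class GrowthCopyCount {
    // Total element copies performed by resizes while pushing n items,
    // when each resize increases the capacity by one.
    static long copiesGrowByOne(int n) {
        long copies = 0;
        int capacity = 1;
        for (int size = 0; size < n; ++size) {
            if (size == capacity) {
                copies += capacity;  // resize copies the whole array
                capacity += 1;
            }
        }
        return copies;
    }

    // Total element copies when each resize doubles the capacity.
    static long copiesDoubling(int n) {
        long copies = 0;
        int capacity = 1;
        for (int size = 0; size < n; ++size) {
            if (size == capacity) {
                copies += capacity;
                capacity *= 2;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("grow by one: " + copiesGrowByOne(n)); // about n^2/2
        System.out.println("doubling:    " + copiesDoubling(n));  // fewer than 2n
    }
}
```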

Why is `ArraySimpleStackTwo` so much more efficient when both implementations’ `push` methods have the same worst-case running time?

The answer lies in how *frequently* the `increaseCapacity()` method (which dominates the running time of `push`) is called. In `ArraySimpleStackOne`, this method increases the capacity by only one. Thus, after we’ve increased the capacity for one `push`, we might need to do so again on the very next `push`. On the other hand, `ArraySimpleStackTwo` doubles the capacity of the array on each resize. Therefore, if a call to `push` invokes `increaseCapacity()` when the size of the stack is \(n\), the next call to `increaseCapacity()` will not occur until the size reaches \(2n\). That is, at least \(n\) more calls to `push` must be made before another capacity increase.

In what follows, we define the *amortized* cost of a method call. The basic idea is to come up with an “accounting scheme” for the running times of method calls such that the cost of expensive (i.e., time-consuming) operations, such as `increaseCapacity()`, can be averaged out over less expensive method calls. Using amortized analysis, we will show that the amortized running time of the `push` method for `ArraySimpleStackTwo` is \(O(1)\). Thus, when averaged over any *sequence* of method calls, each `push` runs in time \(O(1)\), whereas the amortized running time of `push` for `ArraySimpleStackOne` is still \(O(n)\). This explains the enormous difference in the running times depicted above.

## The Banker’s View

In this section, we provide a definition of the amortized running time of a method via an accounting scheme. This view is sometimes referred to as the “banker’s view” of amortized analysis. To this end, we associate a **cost** with each operation \(\mathrm{op}\) that represents the operation’s running time. To perform an operation, the cost can be paid in two ways: either by paying the cost upfront, or by deducting (a portion of) the cost from an **account** \(A\).

Each time an operation is performed, we can choose either to pay part of the cost from \(A\), thereby decreasing its balance, or to pay some *extra* cost upfront to increase the balance of \(A\). In either case, the **amortized cost** of performing the operation \(\mathrm{op}\) is

\[
\mathrm{ac}(\mathrm{op}) = \mathrm{cost}(\mathrm{op}) + \mathrm{bal}(A') - \mathrm{bal}(A).
\]

Here \(\mathrm{ac}\) denotes the amortized cost, while \(\mathrm{bal}(A)\) and \(\mathrm{bal}(A')\) denote, respectively, the balance of \(A\) before and after performing the operation. Thus \(\mathrm{bal}(A') - \mathrm{bal}(A)\) is positive if we pay additional funds into \(A\) (in addition to paying \(\mathrm{cost}(\mathrm{op})\) upfront), and this value is negative if we use funds from \(A\) in order to pay part of the cost of \(\mathrm{op}\).

**Proposition.** Suppose there is an accounting scheme as above such that for any sequence of operations \(\mathrm{op}_1, \mathrm{op}_2, \ldots, \mathrm{op}_m\) the following hold:

- for all \(i\), we have \(\mathrm{ac}(\mathrm{op}_i) \leq C\), and
- the balance satisfies \(\mathrm{bal}(A) \geq 0\) before and after every operation.

Then the total running time of the sequence of operations is at most \(m \cdot C\): summing the definition of amortized cost over the sequence, the total cost telescopes to \(\sum_{i} \mathrm{cost}(\mathrm{op}_i) = \sum_{i} \mathrm{ac}(\mathrm{op}_i) - \big(\mathrm{bal}(A_{\text{final}}) - \mathrm{bal}(A_{\text{initial}})\big) \leq m \cdot C\), assuming the account starts with balance \(0\) (and, by hypothesis, ends with a nonnegative balance). Thus, the average running time per operation for any sequence of operations is at most \(C\).
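To make the accounting concrete, the following simulation sketches one instance of such a scheme for the doubling stack, taking a single element copy or write as the unit of cost. The flat deposit of \(2\) units per `push` (for an amortized cost of \(3\)) is a standard choice and is an assumption here, not taken from the note. The account balance never goes negative, so the proposition applies with \(C = 3\).

```java
public class BankersStack {
    // Simulate the banker's scheme for a doubling stack. Each push is charged
    // a flat amortized cost of 3 units: 1 unit pays for writing the new
    // element, and 2 units are deposited into the account A. Each resize
    // withdraws one unit per element copied. Returns the minimum balance of A
    // observed over n pushes.
    static long minBalance(int n) {
        long balance = 0;
        long min = 0;
        int capacity = 1;
        int size = 0;
        for (int i = 0; i < n; ++i) {
            if (size == capacity) {
                balance -= capacity;  // resize copies `capacity` elements
                min = Math.min(min, balance);
                capacity *= 2;
            }
            balance += 2;             // deposit the 2 extra units of the charge
            ++size;
        }
        return min;
    }

    public static void main(String[] args) {
        System.out.println(minBalance(1_000_000)); // prints 0: never overdrawn
    }
}
```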

## Amortizing `ArraySimpleStackTwo`

Here, we show how to apply amortized analysis to the `push` method of `ArraySimpleStackTwo`. Suppose a call to `increaseCapacity()` occurs during a `push` operation in which the previous capacity was \(N\). That is, during that call to `push`, the capacity was increased to \(2N\), and the size of the stack increased to \(N + 1\). Thus, the next call to `increaseCapacity()` (if any) will occur during a call to `push` after the size of the stack reaches \(2N\), so at least \(N\) more calls to `push` must be made before the next capacity increase.

Denote the running time (cost) of the next call to `increaseCapacity()` by \(C_{2N}\). By our worst-case analysis of the `increaseCapacity()` method, we have \(C_{2N} = O(N)\). The idea of our accounting scheme is the following: for each `push` operation before the size reaches \(2N\), pay an additional \(\frac{1}{N} C_{2N} = O(1)\) into the account \(A\) to increase its balance. This way, by the time the stack reaches size \(2N\) (after at least \(N - 1\) such calls to `push`), the account balance is at least \((N - 1) \frac{1}{N} C_{2N}\), so all but an \(O(1)\) portion of the cost of `increaseCapacity()` can be paid out of the balance.

Using this accounting scheme, each subsequent call to `push` in which the stack size is less than \(2N\) incurs an amortized cost of

\[
\mathrm{ac}(\texttt{push}) = \mathrm{cost}(\texttt{push}) + \tfrac{1}{N} C_{2N} = O(1) + O(1) = O(1).
\]

Here, \(\mathrm{cost}(\texttt{push}) = O(1)\) because these calls do not call `increaseCapacity()`. For the next call to `push` (if any), in which the stack size is \(2N\), we have

\[
\mathrm{ac}(\texttt{push}) = \mathrm{cost}(\texttt{push}) + \mathrm{bal}(A') - \mathrm{bal}(A) = \big(C_{2N} + O(1)\big) - (N - 1)\tfrac{1}{N} C_{2N} = \tfrac{1}{N} C_{2N} + O(1) = O(1).
\]

Here, the account \(A\) satisfies \(\mathrm{bal}(A) \geq (N - 1) \frac{1}{N} C_{2N}\) before the operation because at least \(N - 1\) previous calls to `push` paid into the account, and this entire balance is withdrawn to pay for `increaseCapacity()`. Since \(C_{2N} = O(N)\), the leftover \(\frac{1}{N} C_{2N}\) paid upfront is \(O(1)\). Thus, in all cases, the amortized cost of `push` is \(O(1)\).