Will Rosenbaum | Lecture 03: List Performance

Last week, we discussed the List abstract data type (ADT) that represents a list of elements. We examined two natural ways we could represent a list: an array and a linked list. In this note, we will consider the efficiency of one implementation, using an array. You can follow along with the code provided in lec03.zip.

Recall that the List specifies the following operations:

size()
isEmpty()
get(i)
set(i, y)
add(i, y)
remove(i)

See the List-like ADTs notes for a formal description of these operations. In an array-based implementation of a List, we can store the list’s elements at consecutive indices in an array. In ArraySimpleList.java we have:

public class ArraySimpleList<E> implements SimpleList<E> {

    private Object[] contents;

    ...
	
}

where the element at index 0 is stored in contents[0], the element at index 1 is stored in contents[1], and so on. For such an implementation, the get(i) method is straightforward to implement, because we can simply return contents[i], cast as the appropriate datatype (E). Setting a value is also simple: set(i, y) should simply assign contents[i] = y (after checking that i is a valid index between 0 and one less than the list’s size).
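As a minimal sketch of the get and set logic described above (this is not the code from lec03.zip; the class name ArrayGetSetSketch, its constructor, and the checkIndex helper are hypothetical and exist only for illustration):

```java
// Hypothetical sketch; not the ArraySimpleList from lec03.zip.
class ArrayGetSetSketch<E> {
    private Object[] contents;
    private int size;

    // Assumption for illustration: start from a fixed set of elements.
    @SafeVarargs
    ArrayGetSetSketch(E... initial) {
        contents = new Object[initial.length];
        for (int i = 0; i < initial.length; ++i) {
            contents[i] = initial[i];
        }
        size = initial.length;
    }

    // i is a valid index if it is between 0 and size - 1
    private void checkIndex(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException();
        }
    }

    // return contents[i], cast to the element type E
    @SuppressWarnings("unchecked")
    public E get(int i) {
        checkIndex(i);
        return (E) contents[i];
    }

    // overwrite the element at index i; no other element moves
    public void set(int i, E y) {
        checkIndex(i);
        contents[i] = y;
    }

    public int size() {
        return size;
    }
}
```

Neither method touches any element other than the one at index i, so the work done does not depend on the size of the list.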

The operations add(i, y) and remove(i) require a bit more work, since they change the indices of the elements at indices j >= i. Moreover, these operations change the size of the list, while arrays in Java have a fixed size. A standard way of dealing with the fixed-size constraint of arrays is to create an array of some default size and, whenever more room is needed, to create a larger array and copy the contents of the smaller array into it. For example, ArraySimpleList does this via the following method:

    public void add(int i, E x) {
	// i is a valid index if it is between 0 and size
	if (i > size || i < 0) {
	    throw new IndexOutOfBoundsException();
	}

	// check if we need to increase the capacity before inserting
	// the element
	if (size == capacity) {
	    increaseCapacity();
	}

	++size;

	// insert x by setting contents[i] to x and moving each
	// element previously at index j >= i to index j + 1.
	Object cur = x;
	for (int j = i; j < size; ++j) {
	    Object next = contents[j];
	    contents[j] = cur;
	    cur = next;
	}
    }

The increaseCapacity() method handles the actual “resizing” of the array. A simple implementation makes a new array that is one larger than the previous array:

    private void increaseCapacity() {
	
	// create a new array with larger capacity
	Object[] bigContents = new Object[capacity + 1];

	// copy contents to bigContents
	for (int i = 0; i < capacity; ++i) {
	    bigContents[i] = contents[i];
	}

	// set contents to refer to the new array
	contents = bigContents;

	// update this.capacity accordingly
	capacity = capacity + 1;
    }

Together, the two methods above ensure that the array always has sufficient capacity to add the new element. Note that the capacity of the array (i.e., contents.length) is not necessarily equal to the size of the list; we keep track of a separate variable size that stores the number of elements actually in the list, and the elements are stored at indices 0, 1, ..., size-1 in contents.
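The notes above show add(i, y) but not remove(i). As a minimal sketch under stated assumptions (the class ArrayRemoveSketch and its constructor are hypothetical, and returning the removed element is an assumption rather than something taken from lec03.zip), remove can shift elements in the opposite direction from add: each element at index j > i moves to index j - 1, and size decreases by one while the capacity stays the same.

```java
// Hypothetical sketch of remove(i); not the code from lec03.zip.
class ArrayRemoveSketch<E> {
    private Object[] contents;
    private int size;

    // Assumption for illustration: start from a fixed set of elements.
    @SafeVarargs
    ArrayRemoveSketch(E... initial) {
        contents = new Object[initial.length];
        for (int i = 0; i < initial.length; ++i) {
            contents[i] = initial[i];
        }
        size = initial.length;
    }

    // Assumption: remove returns the element that was removed.
    @SuppressWarnings("unchecked")
    public E remove(int i) {
        // i is a valid index if it is between 0 and size - 1
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException();
        }

        E removed = (E) contents[i];

        // move each element at index j + 1 > i down to index j
        for (int j = i; j < size - 1; ++j) {
            contents[j] = contents[j + 1];
        }

        // clear the now-unused slot and shrink the list
        contents[size - 1] = null;
        --size;

        return removed;
    }

    @SuppressWarnings("unchecked")
    public E get(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException();
        }
        return (E) contents[i];
    }

    public int size() {
        return size;
    }
}
```

As with add, the number of elements moved depends on i: removing the last element moves nothing, while removing the element at index 0 moves every remaining element.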

The program ListBuildTimer.java measures the performance of our list implementation. Specifically, for various sizes, the program measures the amount of time needed to build an ArraySimpleList by repeatedly calling the add method, appending elements to the end of the list. Here is a chart of the performance on my computer:

To get a better sense of what is going on, here is a chart of the time per operation (i.e., total build time divided by size):

Observe that the time per operation increases as a function of the size of the list.
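For readers without lec03.zip at hand, here is a rough sketch of how such a measurement might be set up. It is not the actual ListBuildTimer.java: the class BuildTimerSketch, the nested GrowByOneList stand-in, the initial capacity of 1, and the chosen sizes are all assumptions made for illustration.

```java
// Hypothetical timing sketch; not the ListBuildTimer.java from lec03.zip.
class BuildTimerSketch {

    // Stand-in for an array-based list whose capacity grows by 1.
    static final class GrowByOneList {
        private Object[] contents = new Object[1]; // assumed initial capacity
        private int size = 0;

        // append x at index size, growing the array by one slot if it is full
        void append(Object x) {
            if (size == contents.length) {
                Object[] bigContents = new Object[contents.length + 1];
                for (int i = 0; i < size; ++i) {
                    bigContents[i] = contents[i];
                }
                contents = bigContents;
            }
            contents[size] = x;
            ++size;
        }

        int size() {
            return size;
        }
    }

    // Returns the elapsed time, in nanoseconds, to build a list of n elements.
    static long timeBuild(int n) {
        long start = System.nanoTime();
        GrowByOneList list = new GrowByOneList();
        for (int i = 0; i < n; ++i) {
            list.append(i);
        }
        long elapsed = System.nanoTime() - start;
        if (list.size() != n) {
            throw new IllegalStateException("unexpected list size");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // report total time and time per add for a few (arbitrary) sizes
        for (int n = 1000; n <= 16000; n *= 2) {
            long nanos = timeBuild(n);
            System.out.printf("n = %6d  total = %10d ns  per add = %8.1f ns%n",
                              n, nanos, (double) nanos / n);
        }
    }
}
```

A single timed run like this is noisy (for example, because of just-in-time compilation and garbage collection), so it is worth repeating each measurement several times before drawing conclusions from the numbers.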

Question 1. Look again at the implementation of the add method in ArraySimpleList.java and the test performed by ListBuildTimer.java. Why would you expect that the time to add to a large array would increase with the size of the list?

Now consider the alternative implementation of the increaseCapacity() method:

    private void increaseCapacity() {
	
    	// create a new array with larger capacity
    	Object[] bigContents = new Object[2 * capacity];

    	// copy contents to bigContents
    	for (int i = 0; i < capacity; ++i) {
    	    bigContents[i] = contents[i];
    	}

    	// set contents to refer to the new array
    	contents = bigContents;

    	// update this.capacity accordingly
    	capacity = 2 * capacity;
    }

The only difference is that this method doubles the capacity of the array whenever it needs to be increased (rather than increasing the capacity by 1). When comparing the running times of the two implementations, I get the following output:

All I did was change two lines of increaseCapacity(), and suddenly the time to build a list of 10,000 elements drops from 150ms to less than 1ms!
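Wall-clock timings vary from machine to machine, so one way to start investigating the questions in this note is to count work instead of measuring time. The sketch below is hypothetical (the class CopyCountSketch, the method copiesToBuild, and the initial capacity of 1 are assumptions, not part of lec03.zip). It simulates appending n elements and counts how many element copies the resizing loop performs under each growth rule, without storing any actual elements.

```java
// Hypothetical experiment; not part of lec03.zip.
class CopyCountSketch {

    // Counts the element copies performed by resizing while appending
    // n elements, starting from an assumed initial capacity of 1.
    static long copiesToBuild(int n, boolean doubling) {
        int capacity = 1;
        int size = 0;
        long copies = 0;

        for (int k = 0; k < n; ++k) {
            if (size == capacity) {
                // a resize copies every element currently stored
                copies += capacity;
                capacity = doubling ? 2 * capacity : capacity + 1;
            }
            ++size;
        }

        return copies;
    }

    public static void main(String[] args) {
        int n = 10000;
        System.out.println("capacity + 1: " + copiesToBuild(n, false) + " copies");
        System.out.println("2 * capacity: " + copiesToBuild(n, true) + " copies");
    }
}
```

Try running it for several values of n and comparing how each count grows as n doubles.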

Question 2. Explain why the list-building test is so much faster with the modified increaseCapacity() method than with the original implementation.

Question 3. Modify the ListBuildTimer.java test so that new elements are added at index 0 instead of index size (and keep the modified increaseCapacity() method). How does the running time compare to that of the original ListBuildTimer implementation? How can you explain the discrepancy?