Will Rosenbaum | Lecture 02 Ticket Solution

\[\def\compare{ {\mathrm{compare}} } \def\swap{ {\mathrm{swap}} } \def\sort{ {\mathrm{sort}} } \def\true{ {\mathrm{true}} } \def\false{ {\mathrm{false}} }\]

Consider the problem of sorting an array $a$ of $n$ numerical values. Suppose you can access and modify $a$ through two methods:

$\compare(a, i, j)$ returns $\true$ if $a[i] > a[j]$ and $\false$ otherwise
$\swap(a, i, j)$ swaps the values $a[i]$ and $a[j]$ in the array. That is, if before calling $\swap$ we had $a[i] = x$ and $a[j] = y$, then after performing $\swap(a, i, j)$, we would have $a[i] = y$ and $a[j] = x$, and the other values in $a$ would be unaffected.

A natural strategy for sorting $a$ is the following. Scan through the array to find the index $i_1$ of the smallest value in $a$, then swap $a[i_1]$ with $a[1]$ so that the smallest value is at index $1$ in the array after the $\swap$. Then do the same for the second smallest value in the array: find the index $i_2$ storing the second smallest value, and swap it with index $2$ so that the second smallest value is stored at index $2$ after the swap. Continue in this way until the array is sorted in increasing order.

Express the sorting procedure described above as a method $\sort(a)$ in pseudocode.
If the array $a$ has size $n$, how many $\compare$ operations does your procedure use in the worst case? How many $\swap$ operations does $\sort$ use in the worst case? (Give as precise expressions as you can.)

Solution

Consider the following pseudocode:

sort(a):
  n <- size(a)
  for j = 1 to n - 1 do
    min <- j
    for i = j+1 to n do
      if compare(a, min, i) 
        min <- i
      endif
    endfor
    swap(a, j, min)
  endfor
end

Note that the inner loop in lines 5–9 finds the index of the smallest element between indices j and n. The swap in line 10 moves this element to index j. Thus, after iteration j of the outer loop, the smallest element in the range j..n appears at index j.

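The outer loop iterates over j = 1, 2,...,n-1. Thus, after the first iteration, the smallest element is at index 1. After the second iteration, the smallest remaining element (i.e., the second smallest element in the array) is stored at index 2, and so on. Once the first n-1 positions hold the n-1 smallest elements in order, the largest element must be at index n, so no further iteration is needed.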
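The pseudocode above can be sketched in Python as follows. This is a minimal translation, not the only possible one: it assumes the array is a Python list, it uses 0-based indices (so the pseudocode's range 1..n becomes 0..n-1), and the name `min_index` replaces `min` only to avoid shadowing Python's built-in function.

```python
def compare(a, i, j):
    """Return True if a[i] > a[j], mirroring the compare method above."""
    return a[i] > a[j]


def swap(a, i, j):
    """Exchange a[i] and a[j] in place, leaving other entries untouched."""
    a[i], a[j] = a[j], a[i]


def sort(a):
    """Sort a in increasing order using only compare and swap (0-based)."""
    n = len(a)
    for j in range(n - 1):          # pseudocode: for j = 1 to n - 1
        min_index = j
        for i in range(j + 1, n):   # pseudocode: for i = j+1 to n
            if compare(a, min_index, i):
                min_index = i
        swap(a, j, min_index)


values = [5, 2, 4, 1, 3]
sort(values)
print(values)  # [1, 2, 3, 4, 5]
```

As in the pseudocode, the list is only read through `compare` and only modified through `swap`.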

To compute the number of compare operations performed by sort, observe that in iteration j of the outer loop, the inner loop calls compare $n - j$ times. So for iterations $j = 1, 2,\ldots, n-1$, the inner loop performs $n - 1, n - 2, \ldots, 1$ compare operations, respectively. Thus, the total number of compare operations is

\[(n - 1) + (n - 2) + \cdots + 1 = \frac 1 2 n (n - 1) = \frac 1 2 n^2 - \frac 1 2 n.\]

(See below for a trick for computing this sum.) The total number of swap operations is $n - 1$, because each iteration of the outer loop only performs a single swap.
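These counts can be checked empirically. The sketch below runs the same procedure on a copy of the input while tallying operations; the function name `count_operations` and the two counters are bookkeeping introduced here for illustration. Because the loop bounds do not depend on the values in the array, the counts are the same for every input of size $n$, so the worst case coincides with every other case.

```python
def count_operations(values):
    """Run the sorting procedure on a copy of values and count operations.

    Returns the pair (compares, swaps). Indices are 0-based.
    """
    a = list(values)
    n = len(a)
    compares = 0
    swaps = 0
    for j in range(n - 1):
        min_index = j
        for i in range(j + 1, n):
            compares += 1                         # one compare(a, min, i) call
            if a[min_index] > a[i]:
                min_index = i
        a[j], a[min_index] = a[min_index], a[j]   # one swap(a, j, min) call
        swaps += 1
    return compares, swaps


for n in [1, 2, 5, 10]:
    compares, swaps = count_operations(range(n, 0, -1))  # reversed input
    assert compares == n * (n - 1) // 2
    assert swaps == n - 1
    print(n, compares, swaps)
```

For $n = 5$, for example, this reports $10$ compare operations and $4$ swap operations, in agreement with $\frac 1 2 n (n - 1)$ and $n - 1$.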

Computing the Sum

The sum $T_n = 1 + 2 + \cdots + n$ is known as the $n$-th triangular number. A neat trick to evaluate $T_n$, often attributed to the mathematician Carl Friedrich Gauss, is to observe that if we write the sum defining $T_n$ backwards and add it to $T_n$ again, we obtain

\[\begin{array}{rccccccc} 2 T_n = & 1 & + & 2 & + \cdots + & (n-1) & + & n\\ + & n & + & (n-1) & + \cdots + & 2 & + & 1 \end{array}\]

Now each column of the expression above sums to $n + 1$ and there are $n$ columns, so we get $2 T_n = (n + 1) n$. Finally, dividing by $2$ gives

\[T_n = \frac 1 2 (n+1) n.\]
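As a quick sanity check (not a proof), the closed form can be compared against the sum computed term by term. The function name `triangular` is chosen here for illustration.

```python
def triangular(n):
    """Closed form T_n = n (n + 1) / 2, computed with exact integers."""
    return n * (n + 1) // 2


# Compare the closed form with the explicit sum 1 + 2 + ... + n.
for n in range(0, 101):
    assert triangular(n) == sum(range(1, n + 1))

print(triangular(100))  # 5050
```

Note that the number of compare operations computed earlier is $T_{n-1} = \frac 1 2 n (n - 1)$, i.e. `triangular(n - 1)`.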

While this expression is correct (and I find its derivation reasonably convincing), a mathematician might reasonably object that our justification is not rigorous. Later, we will see how to give a mathematically rigorous justification of the formula for $T_n$ using induction.