Lecture 02: Sorting and Induction

COSC 311 Algorithms, Fall 2022

$ \def\compare{ {\mathrm{compare}} } \def\swap{ {\mathrm{swap}} } \def\sort{ {\mathrm{sort}} } \def\insert{ {\mathrm{insert}} } \def\true{ {\mathrm{true}} } \def\false{ {\mathrm{false}} } \def\BubbleSort{ {\mathrm{BubbleSort}} } $

Announcements

  1. Accountability groups (message today)
  2. Office hours
    • Evening TA sessions Sunday, Wednesday (TBD)
    • My drop-in: Thursday 11-12, 2-3 (?)
    • By appointment: TBD
  3. Emails: subject includes [COSC 311]
  4. Section enrollment
  5. Lecture ticket reminder (read solutions!)

Today

  1. Sorting Task
  2. Insertion Sort
  3. Induction

Task: Sorting

Input:

  • Sequence $a$ of $n$ numbers
  • e.g., $a = 17, 7, 5, 2, 3, 19, 5, 13$

Output:

  • A sorted sequence $s$ of the same elements as $a$
    • $s$ contains the same elements with the same multiplicities as $a$
    • $s_1 \leq s_2 \leq \cdots \leq s_n$
  • e.g., $s = 2, 3, 5, 5, 7, 13, 17, 19$

So Far

The sorting task is underspecified!

  • Why? The task statement does not fix:
  1. the representation of the sequence
  2. the supported operations

Examples:

  • stack of exams
  • array of numbers
  • tasks by deadline

Each may support different operations and require different techniques to sort efficiently.

Going Forward

Spend ~2 weeks on sorting

  • Elementary algorithms
    • argue correctness
      • mathematical induction
    • argue running time
      • big O notation
  • Divide-and-conquer algorithms
    • algorithms: MergeSort, QuickSort, RadixSort
    • argue running time
      • “master method”

Sorting Arrays

Representation:

  • $a$ an array of size $n$
  • $a[1], a[2],\ldots, a[n]$

Supported Operations

  • $\compare(a, i, j)$
    • return $\true$ if $a[i] > a[j]$ and $\false$ otherwise
  • $\swap(a, i, j)$
    • before $a[i] = x$ and $a[j] = y$
    • after $a[i] = y$ and $a[j] = x$

Example

$a = [17, 7, 5, 2, 3, 19, 5, 13]$

  • $\compare(a, 2, 6)$?
  • $\swap(a, 2, 5)$?
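
A minimal Python sketch of the two operations may help when working through this example. It assumes the array is stored as a Python list and keeps the lecture's 1-based indices; the function bodies are an illustration, not part of the lecture's specification.

```python
def compare(a, i, j):
    """Return True if a[i] > a[j], using the lecture's 1-based indices."""
    return a[i - 1] > a[j - 1]


def swap(a, i, j):
    """Exchange the values at 1-based positions i and j, in place."""
    a[i - 1], a[j - 1] = a[j - 1], a[i - 1]


a = [17, 7, 5, 2, 3, 19, 5, 13]
result = compare(a, 2, 6)  # a[2] = 7 and a[6] = 19, so 7 > 19 is False
swap(a, 2, 5)              # exchanges the 7 (index 2) and the 3 (index 5)
```

After these two calls, `result` is `False` and `a` is `[17, 3, 5, 2, 7, 19, 5, 13]`.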

Central Tenet

Break a large task into smaller subtasks.

Lecture Ticket

Express “selection sort” in pseudocode

  • find smallest element and put it at index 1
  • find second smallest element and put it at index 2
  • find third smallest element and put it at index 3

Example

  • Sorting a small array:

    \[\begin{align*} a &= [5, 2, 1, 3, 4]\\ &{}\\ &\to [1, 2, 5, 3, 4]\\ &{}\\ &\to [1, 2, 5, 3, 4]\\ &{}\\ &\to [1, 2, 3, 5, 4]\\ &{}\\ &\to [1, 2, 3, 4, 5] \end{align*}\]

SelectionSort in Pseudocode

01  SelectionSort(a):
02    n <- size(a)
03    for j = 1 to n - 1 do
04      min <- j
05      for i = j+1 to n do
06        if compare(a, min, i)
07          min <- i
08        endif
09      endfor
10      swap(a, j, min)
11    endfor
12  end
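
For readers who want to run the algorithm, here is one possible Python rendering of the pseudocode. It is a sketch under two assumptions: indices are 0-based, as is usual in Python, and the function records a copy of the array after each outer iteration (the `trace` list is a hypothetical addition for illustration only).

```python
def selection_sort(a):
    """Sort the list a in place and return the array state after each outer iteration."""
    n = len(a)
    trace = []
    for j in range(n - 1):           # pseudocode: for j = 1 to n - 1 (0-based here)
        m = j                        # pseudocode: min <- j
        for i in range(j + 1, n):    # pseudocode: for i = j+1 to n
            if a[m] > a[i]:          # pseudocode: compare(a, min, i)
                m = i
        a[j], a[m] = a[m], a[j]      # pseudocode: swap(a, j, min)
        trace.append(list(a))        # snapshot for illustration only
    return trace


a = [5, 2, 1, 3, 4]
trace = selection_sort(a)
for state in trace:
    print(state)
```

On the small array from the example, the four printed states match the four arrows in the trace above.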

Why does SelectionSort Work?

Arguing Correctness

Goal. Logically deduce that the algorithm succeeds on all inputs.

To do:

  • specify task
  • specify allowed operations and effects
  • specify algorithm
  • demonstrate that on all possible inputs, the algorithm's output satisfies the task specification

A Remark

It may be “obvious” to you that SelectionSort works. We argue correctness formally anyway, in order to:

  • give a formal analysis of the algorithm
  • introduce tools that will help when things become less obvious

Specifying the Sorting Task

Input. Array $a$ of numbers

Output. Sorted array $s$:

  1. $s$ contains the same elements as $a$, with the same multiplicities
  2. $s$ is sorted: $s[1] \leq s[2] \leq \cdots \leq s[n]$
    • for every index $i < n$, $s[i] \leq s[i+1]$
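
The two output conditions can be written as an executable check. The sketch below assumes arrays are Python lists; the name `satisfies_sorting_spec` is hypothetical, and comparing `Counter` objects is one way to express "same elements with the same multiplicities".

```python
from collections import Counter


def satisfies_sorting_spec(a, s):
    """Return True if s is a valid output of the sorting task on input a."""
    # Condition 1: same elements, counted with multiplicity.
    same_elements = Counter(a) == Counter(s)
    # Condition 2: every adjacent pair is in order (0-based indices).
    in_order = all(s[i] <= s[i + 1] for i in range(len(s) - 1))
    return same_elements and in_order


a = [17, 7, 5, 2, 3, 19, 5, 13]
ok = satisfies_sorting_spec(a, [2, 3, 5, 5, 7, 13, 17, 19])
dropped_duplicate = satisfies_sorting_spec(a, [2, 3, 5, 7, 13, 17, 19])
out_of_order = satisfies_sorting_spec(a, [2, 3, 5, 5, 7, 13, 19, 17])
```

Here `ok` is `True`, while the other two candidates fail: one drops a copy of 5 and the other ends with 19 before 17.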

Allowed Operations

  • $\compare(a, i, j)$: return $\true$ if $a[i] > a[j]$ and $\false$ otherwise
  • $\swap(a, i, j)$:
    • before $\swap$ have $a[i] = x$ and $a[j] = y$
    • after $\swap$ have $a[i] = y$ and $a[j] = x$

Observation. If $s$ is an array formed from $a$ by any sequence of $\swap$ operations, then $s$ and $a$ contain the same elements.

  • Item (1) of the sorting task is satisfied by any procedure that modifies the array only with swaps
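
The observation can be checked empirically on one array. This is only a sanity check under stated assumptions, not a proof: the seed, the number of swaps, and the helper `swap` (1-based, as in the lecture) are choices made for this sketch.

```python
import random
from collections import Counter


def swap(a, i, j):
    """Exchange the values at 1-based positions i and j, in place."""
    a[i - 1], a[j - 1] = a[j - 1], a[i - 1]


a = [17, 7, 5, 2, 3, 19, 5, 13]
s = list(a)                      # work on a copy so a stays available for comparison
rng = random.Random(0)           # fixed seed so the run is repeatable
for _ in range(100):
    i = rng.randint(1, len(s))   # randint includes both endpoints
    j = rng.randint(1, len(s))
    swap(s, i, j)

same_elements = Counter(s) == Counter(a)
```

Each swap only exchanges two positions, so no value is created or destroyed and `same_elements` is `True` for any sequence of swaps.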

Next Step

Claim. The output of SelectionSort(a) is sorted.

01  SelectionSort(a):
02    n <- size(a)
03    for j = 1 to n - 1 do
04      min <- j
05      for i = j+1 to n do
06        if compare(a, min, i) 
07          min <- i
08        endif
09      endfor
10      swap(a, j, min)
11    endfor
12  end

Question. Why does iteration $j$ select the $j$th smallest element in the array?

Inductive Reasoning

Question. Why does iteration $j$ select the $j$th smallest element in the array?

04      min <- j
05      for i = j+1 to n do
06        if compare(a, min, i) 
07          min <- i
08        endif
09      endfor
10      swap(a, j, min)

Reason. (informal)

  1. The loop in lines 5-9 selects the smallest value in a[j..n]
  2. The previous steps moved all smaller values to a[1..j-1]

Moral. Step j succeeds because steps 1, 2,...,j-1 succeeded

  • inductive reasoning

Induction

A Bit of Formalism

A logical predicate $P$ is a statement that can be true or false

Examples.

  • $P = $ “it is raining today”
  • $P = $ “$57$ is a prime number”
  • $P = $ “the output of SelectionSort([5,2,7,1]) is sorted”

Sequences of Predicates

Induction is a logical principle for establishing the truth of every predicate in a sequence of predicates.

Example. $a$ an array of numbers of size $n$:

  • $P(1)$: $a[1] \leq a[2]$
  • $P(2)$: $a[2] \leq a[3]$
  • $P(3)$: $a[3] \leq a[4]$
  • $\vdots$
  • $P(n-1)$: $a[n-1] \leq a[n]$

Question. How does “sortedness” relate to the predicates $P(1), P(2), \ldots, P(n-1)$?
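
One way to explore this question is to encode the predicates directly. The sketch below assumes the array is a Python list and keeps 1-based predicate indices; the function names `P` and `is_sorted` are chosen for illustration.

```python
def P(a, i):
    """Predicate P(i): a[i] <= a[i+1], with 1-based index i."""
    return a[i - 1] <= a[i]


def is_sorted(a):
    """The array is sorted exactly when P(1), P(2), ..., P(n-1) all hold."""
    return all(P(a, i) for i in range(1, len(a)))


sorted_example = is_sorted([2, 3, 5, 5, 7, 13, 17, 19])
unsorted_example = is_sorted([17, 7, 5, 2, 3, 19, 5, 13])
```

In this encoding, a single false predicate, such as $P(1)$ for the second array, is enough to make the array unsorted.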

Principle of Induction

Setup. $P(1), P(2), P(3), \ldots$ a sequence of predicates

Hypotheses. Suppose:

  1. $P(1)$ is true (base case)
  2. For every $i \geq 1$, if $P(i)$ is true, then $P(i+1)$ is true (inductive step)

Conclusion. All of $P(1), P(2),\ldots$ are true

  • “For all $i$, $P(i)$”
  • logical notation: $\forall i, P(i)$
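
As a standard illustration of the principle (an example chosen here, not taken from the lecture), let $P(i)$ be the statement $1 + 2 + \cdots + i = i(i+1)/2$. The base case $P(1)$ reads $1 = 1 \cdot 2 / 2$, which is true. For the inductive step, assume $P(i)$ holds; then

    \[\begin{align*} 1 + 2 + \cdots + i + (i+1) &= \frac{i(i+1)}{2} + (i+1)\\ &= \frac{(i+1)(i+2)}{2}, \end{align*}\]

which is exactly $P(i+1)$. Both hypotheses hold, so the principle gives $P(i)$ for all $i \geq 1$.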

Ice Cream Example

Selection Sort Pseudocode

Claim. The output of SelectionSort(a) is sorted.

01  SelectionSort(a):
02    n <- size(a)
03    for j = 1 to n - 1 do
04      min <- j
05      for i = j+1 to n do
06        if compare(a, min, i) 
07          min <- i
08        endif
09      endfor
10      swap(a, j, min)
11    endfor
12  end

Induction and Selection Sort

Main Claim. After iteration $j$:

  1. $a[1..j]$ is sorted, and
  2. for every $k > j$, $a[k] \geq a[j]$

Proof by Induction. Must show

  1. Base case ($j = 1$)
  2. Inductive step (if claim holds for $j$, then claim holds for $j+1$)

A Smaller Claim

Sub-claim. Consider the inner loop:

04      min <- j
05      for i = j+1 to n do
06        if compare(a, min, i) 
07          min <- i
08        endif
09      endfor

When the loop terminates, min is the index of the minimum value of a[j..n].

Why?
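
A possible way to see why is to state what the loop maintains after each pass and check it with an assertion. The sketch below assumes 0-based Python lists; the function name `index_of_minimum` and the invariant check are additions for illustration, not part of the lecture's pseudocode.

```python
def index_of_minimum(a, j):
    """Return the 0-based index of a minimum value of a[j:], mirroring lines 04-09."""
    m = j
    for i in range(j + 1, len(a)):
        if a[m] > a[i]:              # compare(a, min, i)
            m = i
        # Loop invariant: a[m] is a minimum value of a[j..i].
        assert a[m] == min(a[j:i + 1])
    return m


first = index_of_minimum([5, 2, 1, 3, 4], 0)
third = index_of_minimum([1, 2, 5, 3, 4], 2)
```

The invariant holds trivially before the loop (the range `a[j..j]` has one element), each pass preserves it, and at termination the range is all of `a[j..n]`. This is itself an induction, on `i`.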

Main Claim, Base Case

After iteration $j$, $a[1..j]$ is sorted, and for every $k > j$, $a[k] \geq a[j]$.

04      min <- j
05      for i = j+1 to n do
06        if compare(a, min, i) 
07          min <- i
08        endif
09      endfor
10      swap(a, j, min)

Main Claim, Inductive Step

After iteration $j$, $a[1..j]$ is sorted, and for every $k > j$, $a[k] \geq a[j]$.

04      min <- j
05      for i = j+1 to n do
06        if compare(a, min, i) 
07          min <- i
08        endif
09      endfor
10      swap(a, j, min)

Induction Continued

Main Claim. After iteration $j$:

  1. $a[1..j]$ is sorted, and
  2. for every $k > j$, $a[k] \geq a[j]$
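  • Inductive Step: suppose the claim holds after iteration $j$. By the sub-claim, iteration $j+1$ swaps a minimum value of $a[j+1..n]$ into $a[j+1]$, so $a[k] \geq a[j+1]$ for every $k > j+1$. By part 2 of the hypothesis, that value is at least $a[j]$, and the swap leaves $a[1..j]$ unchanged, so $a[1..j+1]$ is sorted.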

Conclusion

The claim holds for all $j$: after iteration $j$

  1. $a[1..j]$ is sorted, and
  2. for $k > j$, $a[k] \geq a[j]$

In particular, when $j = n-1$, $a$ is sorted. (Why?)
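
The main claim can also be checked at run time by asserting both parts after every outer iteration. This is a sketch under assumptions (0-based Python lists, a hypothetical function name `selection_sort_checked`); passing assertions on some inputs is evidence, not a substitute for the induction proof.

```python
def selection_sort_checked(a):
    """Selection sort that asserts the main claim after each outer iteration."""
    n = len(a)
    for j in range(n - 1):
        m = j
        for i in range(j + 1, n):
            if a[m] > a[i]:
                m = i
        a[j], a[m] = a[m], a[j]
        # Part 1 of the claim: the prefix a[0..j] is sorted.
        assert all(a[k] <= a[k + 1] for k in range(j))
        # Part 2 of the claim: every later entry is at least a[j].
        assert all(a[k] >= a[j] for k in range(j + 1, n))
    return a


result = selection_sort_checked([17, 7, 5, 2, 3, 19, 5, 13])
```

After the last iteration the sorted prefix covers all but the final entry, and part 2 says that entry is at least as large as the one before it, so the whole array is in order.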

Why So Pedantic?

Next Time

  1. Running time analysis & big O notation
  2. Divide and Conquer Strategy