# Lecture 06: Sorting by Divide and Conquer II

### COSC 311 Algorithms, Fall 2022

$\def\compare{ {\mathrm{compare}} } \def\swap{ {\mathrm{swap}} } \def\sort{ {\mathrm{sort}} } \def\insert{ {\mathrm{insert}} } \def\true{ {\mathrm{true}} } \def\false{ {\mathrm{false}} } \def\BubbleSort{ {\mathrm{BubbleSort}} } \def\SelectionSort{ {\mathrm{SelectionSort}} } \def\Merge{ {\mathrm{Merge}} } \def\MergeSort{ {\mathrm{MergeSort}} }$

## Announcement

Accountability Groups

## Overview

1. MergeSort
2. Running Time of MergeSort
3. QuickSort

## Previously

• Sorting meets Divide & Conquer

• MergeSort: Divide by Index

1. divide $a$ into halves $m = (n+1) / 2$
2. sort $a[1..m-1]$ recursively
3. sort $a[m..n]$ recursively
4. merge $a[1..m-1]$ and $a[m..n]$ to form sorted array

## Pseudocode

    # sort values of a between indices i and j-1
    MergeSort(a, i, j):
      if j - i = 1 then
        return
      endif
      m <- (i + j) / 2
      MergeSort(a,i,m)
      MergeSort(a,m,j)
      Merge(a,i,m,j)
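
The pseudocode above translates almost line for line into Python. The sketch below is one possible rendering, assuming 0-based indexing and the same half-open convention (a call sorts `a[i..j-1]`). The body of `merge` is not given in these notes, so the version here, which copies both halves into temporary lists, is an assumption rather than the lecture's definition. The base case is written as `j - i <= 1` so that an empty range also returns immediately.

```python
def merge(a, i, m, j):
    """Merge the sorted runs a[i..m-1] and a[m..j-1] so that a[i..j-1] is sorted."""
    # Assumed implementation: copy both runs, using O(j - i) extra space.
    left, right = a[i:m], a[m:j]
    l = r = 0
    for k in range(i, j):
        # Take from left if right is exhausted, or if left's head is no larger.
        if r == len(right) or (l < len(left) and left[l] <= right[r]):
            a[k] = left[l]
            l += 1
        else:
            a[k] = right[r]
            r += 1


def merge_sort(a, i, j):
    """Sort the values of a between indices i and j-1, in place."""
    if j - i <= 1:  # size 0 or 1: nothing to do
        return
    m = (i + j) // 2
    merge_sort(a, i, m)
    merge_sort(a, m, j)
    merge(a, i, m, j)


xs = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(xs, 0, len(xs))
print(xs)  # [1, 2, 2, 3, 4, 5, 6, 7]
```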


## Correctness of MergeSort

Establish two claims:

Claim 1 (merge). If $a[i..m-1]$ and $a[m..j-1]$ are sorted, then after $\Merge(a, i, m, j)$, $a[i..j-1]$ is sorted.

• Argued on lecture ticket!

Claim 2. For any indices $i < j$, after calling $\MergeSort(a, i, j)$, $a[i..j-1]$ is sorted.

• Argue by Induction!

## Pseudocode Again

    # sort values of a between indices i and j-1
    MergeSort(a, i, j):
      if j - i = 1 then
        return
      endif
      m <- (i + j) / 2
      MergeSort(a,i,m)
      MergeSort(a,m,j)
      Merge(a,i,m,j)


## Inductive Claim

Consider a call $\MergeSort(a, i, j)$ and define its size to be $k = j - i$.

$P(k)$: for every $k' \leq k$, every call $\MergeSort(a, i, j)$ of size $k'$ succeeds, i.e., leaves $a[i..j-1]$ sorted

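Base case $k = 1$: the range $a[i..j-1]$ contains a single value, so it is already sorted, and $\MergeSort$ returns immediately.

Inductive step $P(k) \implies P(k+1)$: consider a call of size $k + 1 \geq 2$. Both recursive calls have sizes between $1$ and $k$, so by $P(k)$ they leave $a[i..m-1]$ and $a[m..j-1]$ sorted. By Claim 1, $\Merge(a, i, m, j)$ then leaves $a[i..j-1]$ sorted.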

## Question

How efficient is MergeSort?

    # sort values of a between indices i and j-1
    MergeSort(a, i, j):
      if j - i = 1 then
        return
      endif
      m <- (i + j) / 2
      MergeSort(a,i,m)
      MergeSort(a,m,j)
      Merge(a,i,m,j)


## Analyzing Running Time

    00  # sort values of a between indices i and j-1
    01  MergeSort(a, i, j):
    02    if j - i = 1 then
    03      return
    04    endif
    05    m <- (i + j) / 2
    06    MergeSort(a,i,m)
    07    MergeSort(a,m,j)
    08    Merge(a,i,m,j)


Observation 1. Let $k = j - i$ be the size of the method call $\MergeSort(a, i, j)$. Then the running time of the call is $O(k)$ (for the merge on line 8 and the constant-time steps) plus the running time of the recursive calls on lines 6-7.

Observation 2. Each recursive call has size $k / 2$.

• Assume the size $k$ is a power of 2

## Recall Logarithms (base 2)

Define $\log$ by

• $\log a = b \iff 2^b = a$

Another way

• $\log a$ is the number of times $a$ can be divided by $2$ to get a value of (at most) $1$.
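
The "repeated halving" view of the logarithm can be checked directly. The short sketch below uses a hypothetical helper `halvings` (not from the lecture) that counts divisions by 2 until the value is at most 1, and prints the count next to `math.log2`.

```python
import math


def halvings(a):
    """Number of times a can be divided by 2 before the result is at most 1."""
    count = 0
    while a > 1:
        a /= 2
        count += 1
    return count


for a in [1, 2, 8, 1024]:
    print(a, halvings(a), math.log2(a))
```

For powers of 2 the two values agree exactly. For other inputs the count is $\log a$ rounded up, which is why the definition says "at most" $1$.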

Facts.

1. For every constant $c > 0$, $\log n = O(n^c)$.
2. $\log n \neq O(1)$.

## A Final Calculation

• Running time $T(n)$ of $\MergeSort(a, 1, n+1)$:

\begin{align*}
T(n) &= 2 T(n/2) + O(n)\\
&= 4 T(\frac n 4) + 2 O(\frac n 2) + O(n)\\
&= 8 T(\frac n 8) + 4 O(\frac n 4) + 2 O(\frac n 2) + O(n)\\
&\vdots\\
&= n T(1) + \frac n 2 O(2) + \cdots + 8 O(\frac n 8) + 4 O(\frac n 4) + 2 O(\frac n 2) + O(n)\\
&= \underbrace{O(n) + O(n) + \cdots + O(n) + O(n) + O(n)}_{\log n + 1 \text{ terms}}\\
&= O(n \log n)
\end{align*}
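
As a sanity check on the unrolling, one can evaluate a concrete version of the recurrence. The sketch below assumes the $O(n)$ term is exactly $n$ and that $T(1) = 1$; both constants are illustrative choices, not values from the lecture. Under those assumptions, $T(n) = n \log n + n$ whenever $n$ is a power of 2, which matches $\log n + 1$ levels each contributing $n$.

```python
def T(n):
    """T(n) = 2 T(n/2) + n with T(1) = 1, for n a power of 2 (assumed constants)."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n


for e in range(11):
    n = 2 ** e
    # e = log n, so the closed form is n log n + n.
    assert T(n) == n * e + n
    print(n, T(n))
```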

## Picture so Far:

SelectionSort. $O(n^2)$ operations

• $O(n^2)$ comparisons
• $O(n)$ swaps

BubbleSort and InsertionSort. $O(n^2)$ operations

• $O(n^2)$ comparisons
• $O(n^2)$ swaps

MergeSort. $O(n \log n)$ operations

• $O(n \log n)$ comparisons
• $O(n \log n)$ modifications
• uses $O(n)$ space overhead
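
The gap between $O(n^2)$ and $O(n \log n)$ comparisons is easy to observe empirically. The sketch below counts comparisons for SelectionSort and for a MergeSort variant; for brevity the MergeSort here returns new lists instead of sorting in place, so it is an illustrative variant and not the index-based pseudocode from these notes. All function names are hypothetical.

```python
import math
import random


def selection_sort_comparisons(a):
    """Sort a copy of a with SelectionSort; return the number of comparisons."""
    a = list(a)
    comparisons = 0
    for i in range(len(a)):
        smallest = i
        for j in range(i + 1, len(a)):
            comparisons += 1
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]  # one swap per outer iteration
    return comparisons


def merge_sort_comparisons(a):
    """Return (sorted copy of a, number of comparisons used)."""
    if len(a) <= 1:
        return list(a), 0
    m = len(a) // 2
    left, c_left = merge_sort_comparisons(a[:m])
    right, c_right = merge_sort_comparisons(a[m:])
    merged = []
    comparisons = c_left + c_right
    l = r = 0
    while l < len(left) and r < len(right):
        comparisons += 1
        if left[l] <= right[r]:
            merged.append(left[l])
            l += 1
        else:
            merged.append(right[r])
            r += 1
    merged.extend(left[l:])   # at most one of these two is non-empty
    merged.extend(right[r:])
    return merged, comparisons


n = 1024
values = [random.random() for _ in range(n)]
print(selection_sort_comparisons(values))  # always n(n-1)/2 = 523776
_, c = merge_sort_comparisons(values)
print(c, "<=", n * math.log2(n))           # bound is n log n = 10240
```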

## QuickSort: Another D&C Sort

Idea. Divide array $a$ by value

• choose a value $p$ from $a$, the pivot
• arrange values of $a$ such that:
  • $p$ is at index $k$
  • values $\leq p$ are at indices $i \leq k$
  • values $> p$ are at indices $j > k$
• recursively sort indices $i < k$
• recursively sort indices $j > k$

## QuickSort Pseudocode

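    # sort values of a between indices i and j-1
    QuickSort(a, i, j):
      if j - i <= 1 then
        return
      endif
      p <- GetPivot(a, i, j)  # select a pivot
      k <- Split(a, i, j, p)  # rearrange around p; k is the final index of p
      QuickSort(a, i, k)      # sorts indices i..k-1
      QuickSort(a, k+1, j)    # sorts indices k+1..j-1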


## Random Pivot Selection

    GetPivot(a, i, j):
      k <- RandomInt(i, j)  # random index within the range being sorted
      return a[k]
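
A runnable sketch of QuickSort with random pivot selection follows, assuming 0-based indexing and the half-open convention that a call sorts `a[i..j-1]`. The notes do not define `Split`, so the `split` below (a Lomuto-style partition) is a hypothetical stand-in chosen for brevity; it only needs to guarantee the three properties listed in the idea above.

```python
import random


def get_pivot(a, i, j):
    """Return a uniformly random value from a[i..j-1]."""
    return a[random.randrange(i, j)]


def split(a, i, j, p):
    """Rearrange a[i..j-1] around pivot value p and return p's final index k.

    Afterwards a[k] == p, values <= p are at indices <= k, values > p at indices > k.
    Assumed implementation: the lecture does not specify Split.
    """
    # Move one copy of the pivot to the last slot of the range.
    pos = a.index(p, i, j)
    a[pos], a[j - 1] = a[j - 1], a[pos]
    k = i
    for t in range(i, j - 1):
        if a[t] <= p:
            a[t], a[k] = a[k], a[t]
            k += 1
    # Put the pivot between the two groups.
    a[k], a[j - 1] = a[j - 1], a[k]
    return k


def quick_sort(a, i, j):
    """Sort the values of a between indices i and j-1, in place."""
    if j - i <= 1:
        return
    p = get_pivot(a, i, j)
    k = split(a, i, j, p)
    quick_sort(a, i, k)      # indices i..k-1
    quick_sort(a, k + 1, j)  # indices k+1..j-1


xs = [5, 2, 4, 7, 1, 3, 2, 6]
quick_sort(xs, 0, len(xs))
print(xs)  # [1, 2, 2, 3, 4, 5, 6, 7]
```

Because the pivot's index `k` is excluded from both recursive calls, each call works on a strictly smaller range, so the recursion terminates even when the array contains duplicates.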


## A Heuristic

What is a “good” pivot choice?

How likely is a good pivot to be chosen?

## Careful Analysis

Can show. If the pivot is chosen uniformly at random, then QuickSort uses $O(n \log n)$ operations on average (in expectation over the pivot choices).

## Next Time

• Lower bounds for sorting