## List of Topics

Below is a list of topics that Midterm I will cover. The exam will include the following cheat sheet:

In particular, you do not need to memorize anything on the cheat sheet, although you should understand the material on the cheat sheet.

##### Pseudocode
• Write precise pseudocode to describe algorithms.
• Simulate (given) pseudocode by hand to determine behavior/output of a procedure.

##### Induction
• Understand statement of induction.
• Recall structure of a proof by induction (base case and inductive step).
• Apply induction to establish correctness of iterative procedures (loop invariants).
• Apply induction to establish correctness of recursively defined procedures.

##### Asymptotic Analysis
• Understand the meaning of $O$, $\Theta$, and $\Omega$ notation.
• Analyze pseudocode to determine $O$ running time of (iterative) procedures.
• Write a recurrence relation for worst-case running time of recursively defined procedures.

• Asymptotic Analysis notes
• AI Chapter 2 (Asymptotic Notation)
• KT Sections 2.2 and 2.4 (Asymptotic Order of Growth and A Survey of Common Running Times)
• AA Section 1.1 (Runtime Complexity)
##### Divide and Conquer
• Understand the divide and conquer paradigm/strategy and how it can be applied to solve algorithm problems.
• Devise recursive methods to apply the divide and conquer strategy to a given problem.
• Apply the Master Theorem to solve recurrence relation for a divide and conquer solution.

• AI Chapter 3 (Divide-and-Conquer Algorithms)
• KT Chapter 5 (Divide and Conquer)
• AA Chapter 5 (Divide-and-Conquer Algorithms)
##### Specific Algorithms and Problems
• Sorting
  • elementary procedures: InsertionSort, SelectionSort, BubbleSort
  • divide and conquer: MergeSort, QuickSort, RadixSort
• Divide and Conquer
  • binary search
  • Karatsuba multiplication
  • profit maximization algorithm

## Example Questions

Question 1 (invariant for InsertionSort). Consider the following InsertionSort method:

    1  InsertionSort(a):
    2    for i = 2 to n do
    3      j <- i
    4      while j > 1 and a[j-1] > a[j] do
    5        swap(a, j-1, j)
    6        j <- j-1
    7      endwhile
    8    endfor

Note that after the $$k$$th iteration of the inner while loop, $$j = i - k$$. Use induction (on $$k$$) to argue that after the $$k$$th iteration of the inner while loop, $$a[j] = a[i-k]$$ is the smallest value in $$a[j..i]$$.

Question 2 (asymptotic analysis). Consider the following functions:

\begin{align*} f_1(n) &= 5 n \log n + 16 n\\ f_2(n) &= (\log n)^2\\ f_3(n) &= \begin{cases}2n &n \text{ is odd}\\ \sqrt{n} &n \text{ is even} \end{cases}\\ f_4(n) &\text{ satisfies the recurrence relation } f_4(n) = 7 f_4(n / 2) + 3n^2 \end{align*}

Which of the functions above is…

1. …$$O(n)$$?
2. …$$O(n^2)$$?
3. …$$O(\log n)$$?
4. …$$\Theta(n \log n)$$?
5. …$$\Omega(n)$$?
6. …$$\Omega(n^2)$$?
7. …$$\Omega(\log n)$$?

(List all functions that apply to each condition.)

Question 3 (computing powers). For a number $$x$$ and non-negative integer $$n$$, we define the $$n$$th power of $$x$$, $$x^n$$, to be the $$n$$-fold product of $$x$$:

$x^n = \underset{n \text{ times}}{\underbrace{x \cdot x \cdots x}},$

with the convention that $$x^0 = 1$$. Below are two methods for computing $$x^n$$. The first method, Exp(x, n), uses simple recursion to compute $$x^n$$, while the second method, DCExp(x, n), applies a divide and conquer strategy.

    1  Exp(x, n):
    2    if n = 0 then
    3      return 1
    4    endif
    5
    6    return x * Exp(x, n - 1)

     1  DCExp(x, n):
     2    if n = 0 then
     3      return 1
     4    endif
     5
     6    val <- DCExp(x, n / 2)
     7
     8    if n % 2 = 0 then
     9      return val * val
    10    else
    11      return x * val * val

For the methods above, you may assume that “elementary” arithmetic operations (+, -, *, /, %) are performed in $$O(1)$$ time. Note that the division n/2 in DCExp is integer division, which always returns an integer value.

1. Use induction on $$n$$ to argue that the value returned by Exp(x, n) is $$x^n$$.

2. Use big O notation to describe the running time of Exp(x, n) as a function of $$n$$.

3. Simulate the code for DCExp by hand to compute DCExp(3, 5). Show your work!

4. Write a recurrence relation of the form $$T(n) = a T(n / b) + f(n)$$ for the running time of DCExp(x, n).

5. Apply the Master Theorem to your solution to part 4 to derive a bound on the running time of DCExp as a function of $$n$$.

Question 4. Suppose you are given access to a database of $$n$$ values $$v_1, v_2, \ldots, v_n$$, where each value is a number from the range $$1, 2,\ldots, N$$. To maintain privacy, you may not access the values $$v_\ell$$ directly. Instead, you may only access the database via queries of the form InRange(i, j), which returns the number of values $$v_\ell$$ that satisfy $$i \leq v_\ell \leq j$$. Using this limited access to the database, you wish to find the median of the values. (Recall that the median of $$n$$ values is a number $$m$$ such that at least half of the values are $$\leq m$$ and at least half are $$\geq m$$.)

1. Suppose $$m$$ is a median for the database. Given a guess $$k$$ of the median value, how could you use $$O(1)$$ queries of the form InRange(i, j) to determine if (1) $$k$$ is a median, (2) $$k < m$$, or (3) $$k > m$$?

2. Using your solution to 1, devise a divide and conquer algorithm that uses $$O(\log N)$$ InRange queries to find a median of the values in the database.

## Solutions

Question 1 (invariant for InsertionSort). Consider the following InsertionSort method:

    1  InsertionSort(a):
    2    for i = 2 to n do
    3      j <- i
    4      while j > 1 and a[j-1] > a[j] do
    5        swap(a, j-1, j)
    6        j <- j-1
    7      endwhile
    8    endfor

Note that after the $$k$$th iteration of the inner while loop, $$j = i - k$$. Use induction (on $$k$$) to argue that after the $$k$$th iteration of the inner while loop, $$a[j] = a[i-k]$$ is the smallest value in $$a[j..i]$$.

Solution. As suggested, we argue by induction on $$k$$, the number of iterations of the inner while loop. Specifically, we must show the base case (that the claim holds for the smallest value of $$k$$), and the inductive step (if the claim holds for any value of $$k$$, then the claim also holds for $$k + 1$$).

Base case $$k = 1$$. Consider the first iteration of the while loop. In particular, since we entered the loop, the condition $$a[j-1] > a[j]$$ is satisfied. At line 5, we swap the values, so after the swap we have $$a[j-1] < a[j]$$. After decrementing $$j$$ in line 6, we get $$a[j] < a[j+1] (= a[i])$$. Thus, $$a[j]$$ is the smallest element in $$a[j..i]$$.

Inductive step. Suppose that after iteration $$k$$ (i.e., when $$j = i - k$$), $$a[j]$$ is the smallest value in $$a[j..i]$$. The next iteration (iteration $$k + 1$$) of the while loop is executed only if $$a[j-1] > a[j]$$. In this case, the swap at line 5 is executed, so that after the swap $$a[j-1] < a[j]$$. By the inductive hypothesis, the value now stored in $$a[j-1]$$ is no larger than any value in $$a[j+1..i]$$, and no other entries of $$a$$ are modified, so after the swap $$a[j-1]$$ is the smallest element in $$a[j-1..i]$$. When $$j$$ is decremented in line 6, this means that $$a[j]$$ is the smallest element in $$a[j..i]$$.

Since the base case and the inductive step hold, the claim follows from induction.
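The invariant can also be checked empirically. The following is a minimal Python sketch, not part of the original exam material: it translates the pseudocode to 0-indexed lists and asserts the claimed invariant after every iteration of the inner loop. The function name `insertion_sort_checked` is an assumption introduced for illustration.

```python
def insertion_sort_checked(a):
    """Sort the list a in place, asserting the inner-loop invariant."""
    n = len(a)
    # The pseudocode is 1-indexed (i = 2..n); Python lists are 0-indexed,
    # so the outer loop runs over i = 1..n-1 and the guard becomes j > 0.
    for i in range(1, n):
        j = i
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]  # swap(a, j-1, j)
            j -= 1
            # Invariant from Question 1: a[j] is the smallest value in a[j..i].
            assert a[j] == min(a[j:i + 1])
    return a


print(insertion_sort_checked([5, 2, 4, 6, 1, 3]))
```

A passing run does not replace the inductive proof; it only confirms that the invariant holds on the inputs tried.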

Question 2 (asymptotic analysis). Consider the following functions:

\begin{align*} f_1(n) &= 5 n \log n + 16 n\\ f_2(n) &= (\log n)^2\\ f_3(n) &= \begin{cases}2n &n \text{ is odd}\\ \sqrt{n} &n \text{ is even} \end{cases}\\ f_4(n) &\text{ satisfies the recurrence relation } f_4(n) = 7 f_4(n / 2) + 3n^2 \end{align*}

Which of the functions above is…

1. …$$O(n)$$?
2. …$$O(n^2)$$?
3. …$$O(\log n)$$?
4. …$$\Theta(n \log n)$$?
5. …$$\Omega(n)$$?
6. …$$\Omega(n^2)$$?
7. …$$\Omega(\log n)$$?

Solution.

First we apply the Master Theorem to $$f_4(n)$$. In this case, we have $$a = 7$$, $$b = 2$$, and $$f(n) = 3n^2$$. We compute the constant $$c = \log_b a = \log_2 7$$, which is strictly between 2 and 3. Since $$f(n) = O(n^2)$$ and $$2 < \log_2 7$$, we are in case 1 of the Master Theorem. This allows us to conclude that $$f_4(n) = \Theta(n^c)$$; in particular, $$f_4$$ grows strictly faster than $$n^2$$.

Which of the functions above is…

1. …$$O(n)$$?: $$f_2, f_3$$
2. …$$O(n^2)$$?: $$f_1, f_2, f_3$$
3. …$$O(\log n)$$?: none
4. …$$\Theta(n \log n)$$?: $$f_1$$
5. …$$\Omega(n)$$?: $$f_1, f_4$$
6. …$$\Omega(n^2)$$?: $$f_4$$
7. …$$\Omega(\log n)$$?: $$f_1, f_2, f_3, f_4$$
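As a sanity check on the classification of $$f_4$$, the recurrence can be evaluated numerically. The sketch below is illustrative only: the base case $$f_4(1) = 1$$ is an assumption (the question does not specify one), and $$n$$ is restricted to powers of 2 so that $$n / 2$$ is always an integer.

```python
import math


def f4(n):
    """Evaluate f4(n) = 7*f4(n/2) + 3*n^2 for n a power of 2."""
    if n == 1:
        return 1  # assumed base case; not given in the question
    return 7 * f4(n // 2) + 3 * n * n


c = math.log2(7)  # the Master Theorem exponent, roughly 2.807
for k in (4, 8, 12, 16):
    n = 2 ** k
    # First ratio compares against n^2, second against n^c.
    print(n, f4(n) / n ** 2, f4(n) / n ** c)
```

Under this assumed base case, the ratio against $$n^2$$ keeps growing while the ratio against $$n^c$$ levels off, which is consistent with $$f_4$$ being $$\Omega(n^2)$$ but not $$O(n^2)$$.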

Question 3 (computing powers). For a number $$x$$ and non-negative integer $$n$$, we define the $$n$$th power of $$x$$, $$x^n$$, to be the $$n$$-fold product of $$x$$:

$x^n = \underset{n \text{ times}}{\underbrace{x \cdot x \cdots x}},$

with the convention that $$x^0 = 1$$. Below are two methods for computing $$x^n$$. The first method, Exp(x, n), uses simple recursion to compute $$x^n$$, while the second method, DCExp(x, n), applies a divide and conquer strategy.

    1  Exp(x, n):
    2    if n = 0 then
    3      return 1
    4    endif
    5
    6    return x * Exp(x, n - 1)

     1  DCExp(x, n):
     2    if n = 0 then
     3      return 1
     4    endif
     5
     6    val <- DCExp(x, n / 2)
     7
     8    if n % 2 = 0 then
     9      return val * val
    10    else
    11      return x * val * val

For the methods above, you may assume that “elementary” arithmetic operations (+, -, *, /, %) are performed in $$O(1)$$ time. Note that the division n/2 in DCExp is integer division, which always returns an integer value.

1. Use induction on $$n$$ to argue that the value returned by Exp(x, n) is $$x^n$$.

2. Use big O notation to describe the running time of Exp(x, n) as a function of $$n$$.

3. Simulate the code for DCExp by hand to compute DCExp(3, 5). Show your work!

4. Write a recurrence relation of the form $$T(n) = a T(n / b) + f(n)$$ for the running time of DCExp(x, n).

5. Apply the Master Theorem to your solution to part 4 to derive a bound on the running time of DCExp as a function of $$n$$.

Solution.

1. Use induction on $$n$$ to argue that the value returned by Exp(x, n) is $$x^n$$.

Base case. In the base case $$n = 0$$, the method returns $$1 = x^0$$ at line 3, as desired.

Inductive step. Suppose that for some value of $$n$$, Exp(x, n) returns $$x^n$$. Then Exp(x, n+1) returns $$x * \mathrm{Exp}(x, n) = x * x^n = x^{n+1}$$, as claimed. The first equality holds by the inductive hypothesis.

Since the base case and inductive step hold, the claim follows.

2. Use big O notation to describe the running time of Exp(x, n) as a function of $$n$$.

The running time is $$O(n)$$ (assuming arithmetic is performed in $$O(1)$$ time).
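To see where the linear bound comes from, note that each call does $$O(1)$$ work and makes one recursive call on $$n - 1$$, so there are $$n + 1$$ calls in total. The following Python sketch mirrors Exp and counts the calls; the name `exp_rec` and the returned call counter are assumptions added for illustration.

```python
def exp_rec(x, n):
    """Return (x**n, number of calls made), mirroring Exp(x, n)."""
    if n == 0:
        return 1, 1  # base case: one call, no further recursion
    value, calls = exp_rec(x, n - 1)
    return x * value, calls + 1


print(exp_rec(3, 5))  # (243, 6)
```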

3. Simulate the code for DCExp by hand to compute DCExp(3, 5). Show your work!

• DCExp(3, 5) sets val <- DCExp(3, 2) at line 6
• DCExp(3, 2) sets val <- DCExp(3, 1) at line 6
• DCExp(3, 1) sets val <- DCExp(3, 0) at line 6
• DCExp(3, 0) returns 1 at line 3
• val <- 1 at line 6
• DCExp(3, 1) returns 3 * 1 * 1 = 3 at line 11
• val <- 3 at line 6
• DCExp(3, 2) returns 3 * 3 = 9 at line 9
• val <- 9 at line 6
• DCExp(3, 5) returns 3 * 9 * 9 = 243 at line 11
4. Write a recurrence relation of the form $$T(n) = a T(n / b) + f(n)$$ for the running time of DCExp(x, n).

$T(n) = 1 \cdot T(n / 2) + O(1)$
5. Apply the Master Theorem to your solution to part 4 to derive a bound on the running time of DCExp as a function of $$n$$.

Since $$c = \log_b a = \log_2 1 = 0$$, we have $$f(n) = \Theta(1) = \Theta(n^c)$$, so we are in case 2 of the Master Theorem. Thus, the theorem gives a running time of $$\Theta(n^c \log n) = \Theta(\log n)$$, and in particular $$O(\log n)$$.
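The logarithmic bound can be observed directly by counting recursive calls. This is a minimal Python sketch that mirrors DCExp; the name `dc_exp` and the call counter are assumptions added for illustration, and `//` plays the role of the integer division in the pseudocode.

```python
def dc_exp(x, n):
    """Return (x**n, number of calls made), mirroring DCExp(x, n)."""
    if n == 0:
        return 1, 1
    val, calls = dc_exp(x, n // 2)  # integer division, as in the pseudocode
    if n % 2 == 0:
        return val * val, calls + 1
    return x * val * val, calls + 1


print(dc_exp(3, 5))  # (243, 4): calls on n = 5, 2, 1, 0
for n in (1000, 10 ** 6):
    print(n, dc_exp(2, n)[1])
```

The call count grows by one each time $$n$$ doubles, in contrast to the $$n + 1$$ calls made by Exp.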

Question 4. Suppose you are given access to a database of $$n$$ values $$v_1, v_2, \ldots, v_n$$, where each value is a number from the range $$1, 2,\ldots, N$$. To maintain privacy, you may not access the values $$v_\ell$$ directly. Instead, you may only access the database via queries of the form InRange(i, j), which returns the number of values $$v_\ell$$ that satisfy $$i \leq v_\ell \leq j$$. Using this limited access to the database, you wish to find the median of the values. (Recall that the median of $$n$$ values is a number $$m$$ such that at least half of the values are $$\leq m$$ and at least half are $$\geq m$$.)

1. Suppose $$m$$ is a median for the database. Given a guess $$k$$ of the median value, how could you use $$O(1)$$ queries of the form InRange(i, j) to determine if (1) $$k$$ is a median, (2) $$k < m$$, or (3) $$k > m$$?

Solution. Observe that InRange(k, N) returns the number of values in the database that are at least $$k$$, and InRange(0, k) returns the number of values that are at most $$k$$. Thus, if $$\mathrm{InRange}(k, N) \geq n / 2$$ and $$\mathrm{InRange}(0, k) \geq n / 2$$, then $$k$$ is a median. Otherwise, if $$\mathrm{InRange}(0, k) < n / 2$$, then $$k < m$$ for every median $$m$$, and if $$\mathrm{InRange}(k, N) < n / 2$$, then $$k > m$$ for every median $$m$$. This uses two queries (plus one more, InRange(0, N), to learn $$n$$ if it is not already known), which is $$O(1)$$.
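The three-way test can be sketched in Python as follows. Everything here is illustrative: `make_in_range` is a hypothetical stand-in for the database oracle, `classify` is a name introduced for this example, and the comparison `2 * count >= n` is used in place of `count >= n / 2` to avoid fractions.

```python
def make_in_range(values):
    """Build a hypothetical InRange oracle over a hidden list of values."""
    def in_range(i, j):
        return sum(1 for v in values if i <= v <= j)
    return in_range


def classify(k, n, N, in_range):
    """Compare the guess k with the medians using two InRange queries."""
    at_most = in_range(0, k)   # number of values <= k
    at_least = in_range(k, N)  # number of values >= k
    if 2 * at_most >= n and 2 * at_least >= n:
        return "median"
    if 2 * at_most < n:
        return "too small"  # k < m for every median m
    return "too large"      # k > m for every median m


in_range = make_in_range([2, 7, 7, 9, 4])
print([classify(k, 5, 10, in_range) for k in (4, 7, 9)])
# ['too small', 'median', 'too large']
```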

2. Using your solution to 1, devise a divide and conquer algorithm that uses $$O(\log N)$$ InRange queries to find a median of the values in the database.

Solution.

    FindMedian():
      return FindMedian(0, N)

    # Find a median value between i and j using binary search
    FindMedian(i, j):
      n <- InRange(0, N) # total number of elements in database
      k <- (i + j) / 2
      left <- InRange(0, k)
      right <- InRange(k, N)
      if left >= n / 2 and right >= n / 2 then
        return k
      else if right < n / 2 then
        return FindMedian(i, k - 1) # k is too large, so exclude it
      else
        return FindMedian(k + 1, j) # k is too small, so exclude it
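For reference, here is an iterative Python sketch of the same binary search, written under stated assumptions: `make_in_range` is a hypothetical oracle standing in for the database, the search range is taken to be $$1, \ldots, N$$, and the guess $$k$$ is excluded from the next range so that the range shrinks on every step. It also counts queries to illustrate the $$O(\log N)$$ bound.

```python
def make_in_range(values):
    """Build a hypothetical InRange oracle over a hidden list of values."""
    def in_range(i, j):
        return sum(1 for v in values if i <= v <= j)
    return in_range


def find_median(N, in_range):
    """Binary search over 1..N; returns (a median, number of queries used)."""
    n = in_range(1, N)  # total number of values, queried once
    queries = 1
    lo, hi = 1, N
    while lo <= hi:
        k = (lo + hi) // 2
        at_most = in_range(1, k)   # number of values <= k
        at_least = in_range(k, N)  # number of values >= k
        queries += 2
        if 2 * at_most >= n and 2 * at_least >= n:
            return k, queries
        if 2 * at_least < n:
            hi = k - 1  # k is larger than every median
        else:
            lo = k + 1  # k is smaller than every median
    raise ValueError("no median found in 1..N")


print(find_median(10, make_in_range([2, 7, 7, 9, 4])))
```

Each loop iteration halves the remaining range and uses two queries, so the total is at most about $$2 \log_2 N$$ queries plus the one used to learn $$n$$.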