Will Rosenbaum | Asymptotic Analysis and Big O Notation

Motivation

In this note, we describe a method of measuring the efficiency of a procedure using asymptotic analysis. We formally define “big O” notation for qualitatively comparing the growth of functions, as well as the related notations \(\Omega\) and \(\Theta\). We will often apply big O notation to represent the running time or the number of elementary operations performed by an algorithm. While it is sometimes possible to compute the precise number of operations performed by an algorithm, knowing this value may not be particularly useful in practice. This is because the actual running time (or any other measure of efficiency) of an implementation will generally depend on details of the implementation and of the hardware executing the algorithm. Thus, precisely counting operations may not give the most meaningful measure of the efficiency of a procedure.

Asymptotic analysis abstracts away many details, such as constant factors and details of the hardware executing the program, and focuses on how the complexity (e.g., running time) scales with the size of the input. The approach disregards details of executions, essentially ignoring constant factors that affect the running time, but allows us to reason about efficiency in a way that is (almost) independent of the actual hardware executing a procedure.

Big O Notation

Throughout this section, we will use \(f, g\) and \(h\) to refer to functions from the natural numbers \(\mathbf{N} = \{0, 1, 2, \ldots\}\) to the positive real numbers \(\mathbf{R}^+\). The interpretation is that \(f(n)\) might be the running time of some method with an input of size \(n\).

Formal Definition

Definition. Let \(f, g : \mathbf{N} \to \mathbf{R}^+\) be functions from the natural numbers to the positive real numbers. Then we write \(f = O(g)\) (pronounced “\(f\) is (big) oh of \(g\)”) if there exist a natural number \(N\) and a positive constant \(C > 0\) such that for all \(n \geq N\), we have

\[f(n) \leq C \cdot g(n).\]

Informally, this definition says that \(f = O(g)\) means that for sufficiently large values of \(n\) (i.e., \(n \geq N\)), \(f(n)\) is no more than a constant factor (\(C\)) larger than \(g(n)\).

Arguing directly from the definition, in order to show that functions \(f\) and \(g\) satisfy \(f = O(g)\), we must find values \(N\) and \(C\) such that \(f(n) \leq C \cdot g(n)\) whenever \(n \geq N\). In the following section, we will derive properties of \(O\) that allow us to rigorously justify our calculations without needing to refer directly to the definition above.
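A finite numerical check cannot prove \(f = O(g)\), since the definition quantifies over all \(n \geq N\), but it is a quick way to test candidate values of \(N\) and \(C\) before attempting a proof. The following is a minimal Python sketch; the helper name `holds_on_range`, the cutoff parameter `limit`, and the two example functions are illustrative assumptions, not part of the definition.

```python
def holds_on_range(f, g, N, C, limit):
    """Return True if f(n) <= C * g(n) for every integer n with N <= n < limit.

    Only finitely many values of n are tested, so a True result is evidence
    for f = O(g) with witnesses (N, C), not a proof.
    """
    return all(f(n) <= C * g(n) for n in range(N, limit))


# Hypothetical example functions, chosen only for illustration.
f = lambda n: 3 * n + 5
g = lambda n: n + 1

# 3n + 5 <= 4(n + 1) exactly when n >= 1.
print(holds_on_range(f, g, N=1, C=4, limit=10_000))  # True
print(holds_on_range(f, g, N=0, C=4, limit=10_000))  # False: fails at n = 0
```

A `False` result is conclusive for that particular pair \((N, C)\), but it says nothing about other choices of \(N\) and \(C\).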

Example. Consider the functions \(f(n) = 10 n^3 + 7\) and \(g(n) = n^3\). Notice that \(f(n) \geq g(n)\) for all (non-negative) values of \(n\). Thus from the definition of \(O\), we have \(g = O(f)\). Indeed, taking \(N = 0\) and \(C = 1\), the definition is satisfied.

On the other hand, we claim that \(f = O(g)\) as well. To prove this directly from the definition, we must find suitable values \(N\) and \(C\) satisfying the definition above. Since \(f(n) > 10 n^3 = 10 \cdot g(n)\), we must use some \(C > 10\) (otherwise \(f(n) \leq C \cdot g(n)\) will not be satisfied). Consider taking \(C = 11\). We can write

\[\begin{align*} 11 g(n) &= 11 n^3\\ &= 10 n^3 + n^3\\ &\geq 10 n^3 + 2^3 (\text{ if } n \geq 2)\\ &= 10 n^3 + 8\\ &> 10 n^3 + 7\\ &= f(n). \end{align*}\]

Thus, so long as \(n \geq 2\), we have \(f(n) \leq 11 \cdot g(n)\). Therefore, \(f = O(g)\), where the definition is satisfied for \(N = 2\) and \(C = 11\).

The computations above are rather ad hoc. In general, there will be many possible values of \(N\) and \(C\) for which \(f\) and \(g\) can be shown to satisfy the definition. Below, we will describe general properties from which \(O\)-relationships can be determined rigorously without devolving to ad hoc algebra.
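To illustrate how the choice of \(C\) affects the smallest workable \(N\) for the example \(f(n) = 10 n^3 + 7\) and \(g(n) = n^3\), here is a small Python sketch. The helper `smallest_N` and its `limit` parameter are hypothetical names introduced for this example, and the search covers only a finite range, so it suggests witnesses rather than proving them.

```python
def smallest_N(f, g, C, limit):
    """Smallest N < limit such that f(n) <= C * g(n) for all N <= n < limit.

    Returns None if the bound already fails at n = limit - 1.
    """
    N = None
    # Scan downward from the top of the range and stop at the first failure.
    for n in range(limit - 1, -1, -1):
        if f(n) <= C * g(n):
            N = n
        else:
            break
    return N


f = lambda n: 10 * n**3 + 7
g = lambda n: n**3

for C in (10, 11, 17):
    print(C, smallest_N(f, g, C, limit=1000))
# prints:
# 10 None
# 11 2
# 17 1
```

The output agrees with the algebra: \(C = 10\) never works, \(C = 11\) works from \(N = 2\), and the larger constant \(C = 17\) already works from \(N = 1\).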

Abuse of Notation. We often use the notation \(O(g)\) to refer to “some function \(h\) that satisfies \(h = O(g)\).” In the example above, we showed that \(f(n) = 10 n^3 + 7\) satisfies \(f = O(n^3)\). If we were considering the function \(h(n) = 2 n^4 + 3 n^3 + 7\), we may just as well write \(h(n) = 2 n^4 + O(n^3)\), since \(3 n^3 + 7 = O(n^3)\) by the same kind of argument. This shorthand will prove convenient when we don’t want to write out explicit terms of functions whose precise values are unknown or will be subsumed by an application of \(O\) later on.

Properties

Here, we prove some useful properties of big O notation.

Proposition 1. Suppose \(f, g, f_1, f_2, g_1, g_2, h\) are all functions from \(\mathbf{N}\) to \(\mathbf{R}^+\), and that \(a > 0\) is a constant in \(\mathbf{R}^+\). Then the following hold:

1. If \(f(n) \leq a\) for all \(n\), then \(f = O(1)\).
2. If \(f(n) \leq g(n)\) for all \(n\), then \(f = O(g)\).
3. If \(f = O(g)\), then \(a \cdot f = O(g)\).
4. If \(f = O(g)\) and \(g = O(h)\), then \(f = O(h)\).
5. If \(f = O(g)\), then \(f + O(g) = O(g)\) and \(g + O(f) = O(g)\); in particular, \(f + g = O(g)\).
6. If \(f_1 = O(g_1)\) and \(f_2 = O(g_2)\), then \(f_1 \cdot f_2 = O(g_1 \cdot g_2)\).

Proof. We prove each assertion above in turn.

1. Since \(f(n) \leq a\) for all \(n\), taking \(g(n) = 1\), we have \(f(n) \leq a \cdot g(n)\) for all \(n\). Thus, the definition of \(f = O(g)\) is satisfied with \(N = 0\) and \(C = a\), so that \(f = O(1)\).
2. If \(f(n) \leq g(n)\) for all \(n\), then the definition of \(f = O(g)\) is satisfied with \(N = 0\) and \(C = 1\).
3. Suppose \(f = O(g)\), and suppose \(N'\) and \(C'\) are the values for which the definition of \(O\) is satisfied. That is, for all \(n \geq N'\), we have \(f(n) \leq C' \cdot g(n)\). Then for \(n \geq N'\), we also have \(a \cdot f(n) \leq a C' \cdot g(n)\). Therefore, the definition of \(a \cdot f = O(g)\) is satisfied for \(N = N'\) and \(C = a \cdot C'\).
4. Suppose the definition of \(f = O(g)\) is satisfied with values \(N_f\) and \(C_f\). That is, for \(n \geq N_f\), we have \(f(n) \leq C_f \cdot g(n)\). Similarly, suppose \(g = O(h)\) with \(N_g\) and \(C_g\): for \(n \geq N_g\) we have \(g(n) \leq C_g \cdot h(n)\). Then, for \(n \geq \max(N_f, N_g)\), we have both \(f(n) \leq C_f \cdot g(n)\) and \(g(n) \leq C_g \cdot h(n)\). Combining the last two inequalities, we obtain \(f(n) \leq C_f \cdot (C_g \cdot h(n)) = C_f C_g \cdot h(n)\). Therefore, the definition of \(f = O(h)\) is satisfied for \(N = \max(N_f, N_g)\) and \(C = C_f \cdot C_g\).
5. Suppose \(f = O(g)\) and \(h\) is any function satisfying \(h = O(g)\). Suppose the definition of \(f = O(g)\) is satisfied for \(N_f\) and \(C_f\), i.e., \(f(n) \leq C_f \cdot g(n)\) for all \(n \geq N_f\). Similarly suppose the definition of \(h = O(g)\) is satisfied for values \(N_h\) and \(C_h\): \(h(n) \leq C_h \cdot g(n)\) for all \(n \geq N_h\). Observe that taking \(N = \max(N_f, N_h)\), we have that for all \(n \geq N\), both \(f(n) \leq C_f \cdot g(n)\) and \(h(n) \leq C_h \cdot g(n)\). Thus, for \(n \geq N\), we have \(f(n) + h(n) \leq C_f \cdot g(n) + C_h \cdot g(n) = (C_f + C_h) g(n)\). Therefore, the definition of \(f + h = O(g)\) is satisfied for \(N = \max(N_f, N_h)\) and \(C = C_f + C_h\).

Now suppose \(f = O(g)\), and \(h = O(f)\). Then by property 4 above, we have \(h = O(g)\). Therefore, \(g + h = g + O(g)\). Since \(g = O(g)\) by property 2, applying the first assertion of 5 (proven in the paragraph above) with \(g\) in place of \(f\), we get \(g + O(g) = O(g)\), so that \(g + O(f) = g + O(g) = O(g)\), as claimed.
6. Suppose \(f_1 = O(g_1)\) is satisfied for \(N_1\) and \(C_1\), and that \(f_2 = O(g_2)\) is satisfied for \(N_2\) and \(C_2\). Then for \(N = \max(N_1, N_2)\), and all \(n \geq N\), we have \(f_1(n) \leq C_1 \cdot g_1(n)\), and \(f_2(n) \leq C_2 \cdot g_2(n)\). Therefore, \((f_1 \cdot f_2)(n) = f_1(n) \cdot f_2(n) \leq (C_1 g_1(n)) (C_2 g_2(n)) = (C_1 C_2) \cdot (g_1 \cdot g_2)(n)\). Therefore, \(f_1 \cdot f_2 = O(g_1 \cdot g_2)\) is satisfied for \(N = \max(N_1, N_2)\) and \(C = C_1 \cdot C_2\).

So all of the properties hold, as desired. \(\Box\)
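The proofs of properties 4, 5 and 6 are constructive: each builds new witnesses \((N, C)\) out of existing ones. The Python sketch below mirrors those constructions. The function names and the example functions are assumptions made for illustration, and `check` tests only a finite range, so it is a sanity check rather than a proof.

```python
def transitive(wf, wg):
    """Property 4: witnesses for f = O(g) and g = O(h) give one for f = O(h)."""
    (Nf, Cf), (Ng, Cg) = wf, wg
    return max(Nf, Ng), Cf * Cg


def combine_sum(wf, wh):
    """Property 5: witnesses for f = O(g) and h = O(g) give one for f + h = O(g)."""
    (Nf, Cf), (Nh, Ch) = wf, wh
    return max(Nf, Nh), Cf + Ch


def combine_product(w1, w2):
    """Property 6: witnesses for f1 = O(g1) and f2 = O(g2) give one for f1*f2 = O(g1*g2)."""
    (N1, C1), (N2, C2) = w1, w2
    return max(N1, N2), C1 * C2


def check(f, g, witness, limit=2000):
    """Finite sanity check that f(n) <= C * g(n) for N <= n < limit."""
    N, C = witness
    return all(f(n) <= C * g(n) for n in range(N, limit))


# Illustrative functions and witnesses (assumptions for this example only).
f1, g1, w1 = (lambda n: 10 * n**3 + 7), (lambda n: n**3), (2, 11)
f2, g2, w2 = (lambda n: 3 * n + 5), (lambda n: n + 1), (1, 4)

w = combine_product(w1, w2)
print(w)  # (2, 44)
print(check(lambda n: f1(n) * f2(n), lambda n: g1(n) * g2(n), w))  # True
```

The witnesses produced this way are valid but usually not the smallest possible, which is fine because the definition only asks that some \(N\) and \(C\) exist.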

Using the properties above, we can more simply (yet just as rigorously) argue about big O notation. For example, Property 2 above also gives the following useful consequence:

Corollary. Suppose \(a, b\) are constants with \(a \leq b\). Then \(n^a = O(n^b)\).

Example. Suppose \(f\) is a second degree polynomial. That is, \(f(n) = a n^2 + b n + c\) for some constants \(a, b\) and \(c\), where \(a > 0\). Then \(f = O(n^2)\). To see this, we compute:

\[\begin{align*} f(n) &= a n^2 + b n + c\\ &= a n^2 + b n + O(1)\\ &= a n^2 + O(n)\\ &= O(n^2). \end{align*}\]

Each manipulation above is justified as follows:

The first equality is the definition of \(f\).
The second equality holds by property 1.
The third equality holds by property 3 (which implies that \(b n = O(n)\)), and property 5.
The fourth equality holds by property 3 (which implies that \(a n^2 = O(n^2)\)), and property 5.

More generally, we can argue that any degree \(k\) polynomial, i.e., a function \(f\) of the form \(f(n) = a_k n^k + a_{k-1} n^{k-1} + \cdots + a_1 n + a_0\), satisfies \(f = O(n^k)\). Proving this fact in a mathematically rigorous way requires applying mathematical induction, but you can freely use this fact going forward.
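One direct way to see where suitable witnesses for this claim come from, offered here as a sketch rather than as the inductive proof mentioned above: for \(n \geq 1\) and \(0 \leq j \leq k\) we have \(n^j \leq n^k\), so

\[\begin{align*} f(n) &= a_k n^k + a_{k-1} n^{k-1} + \cdots + a_1 n + a_0\\ &\leq |a_k| n^k + |a_{k-1}| n^{k-1} + \cdots + |a_1| n + |a_0|\\ &\leq \left(|a_k| + |a_{k-1}| + \cdots + |a_1| + |a_0|\right) n^k. \end{align*}\]

Thus the definition of \(f = O(n^k)\) is satisfied with \(N = 1\) and \(C = |a_k| + |a_{k-1}| + \cdots + |a_0|\), assuming (as for the quadratic above) that \(a_k > 0\), so that \(C > 0\).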

When is \(f \neq O(g)\)?

The notation \(f = O(g)\) is in some sense a weak condition on \(f\) and \(g\). That is, \(f\) could be much, much larger than \(g\), yet we still have \(f = O(g)\). For example, take \(f(n) = 10^{100} n^2\) and \(g(n) = 10^{-100} n^2\). You should convince yourself that we have \(f = O(g)\). But \(g\) is always much smaller than \(f\): \(g(n) / f(n) = 10^{-200}\) for all \(n \geq 1\), which is a very tiny number indeed! So you might (rightfully) be concerned that \(f = O(g)\) is too weak a condition to be useful. For example, is it ever the case that \(f\) is not \(O(g)\)?

Proposition 2. Suppose \(a\) and \(b\) are real values satisfying \(a < b\). Then \(n^b \neq O(n^a)\). That is, \(n^b\) is not \(O(n^a)\).

Given a proposition such as Proposition 2—a claim that a particular definition is not satisfied—it is common to apply a technique called “proof by contradiction”. That is, in order to prove that \(n^b \neq O(n^a)\), we assume that the opposite is true—namely \(n^b = O(n^a)\)—and derive a contradiction from this assumption.

Proof. Suppose for the sake of contradiction that \(n^b = O(n^a)\). Then, from the definition of \(O\), there exist a natural number \(N\) and a constant \(C > 0\) such that for all \(n \geq N\), we have \(n^b \leq C \cdot n^a\). Dividing both sides of this expression by \(n^a\) (for \(n \geq 1\)), we get the equivalent expression \(n^b / n^a = n^{b - a} \leq C\). Since \(b - a > 0\), we can raise both sides of this expression to the power \(1 / (b - a)\) to get

\[n = (n^{b - a})^{1 / (b - a)} \leq C^{1 / (b - a)}.\]

Thus, for all \(n > C^{1 / (b - a)}\), the inequality \(n^b \leq C \cdot n^a\) fails to hold. In particular, it fails for some \(n \geq N\) (for example, any integer \(n\) larger than both \(N\) and \(C^{1 / (b - a)}\)), which contradicts our assumption. Therefore, \(n^b \neq O(n^a)\), as desired. \(\Box\)
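The proof shows that whatever constants \(N\) and \(C\) are proposed, the bound \(n^b \leq C \cdot n^a\) must eventually break. The following Python sketch locates the first place it breaks; the function name `first_violation` and the sample exponents are assumptions made for illustration.

```python
import math


def first_violation(a, b, C, N):
    """Smallest integer n >= max(N, 1) with n**b > C * n**a, assuming a < b."""
    if not a < b:
        raise ValueError("requires a < b")
    # By the proof, the bound fails exactly when n > C**(1/(b-a)). Start just
    # below that threshold and walk upward, so that floating-point rounding
    # in the root cannot cause us to skip the first violation.
    start = math.floor(C ** (1.0 / (b - a))) - 1
    n = max(N, 1, start)
    while n ** b <= C * n ** a:
        n += 1
    return n


print(first_violation(a=2, b=3, C=1000, N=0))     # 1001
print(first_violation(a=2, b=3, C=1000, N=5000))  # 5000
print(first_violation(a=1, b=3, C=100, N=0))      # 11
```

Increasing \(C\) or \(N\) only pushes the first violation further out; it never removes it, which is exactly the content of Proposition 2.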

O, Omega, Theta

The notation \(f = O(g)\) can be thought of as saying that “\(f\) doesn’t grow more than a constant factor faster than \(g\).” On the other hand, we might want to indicate that \(f\) grows at least as quickly as (a constant times) \(g\). In this case we use the notation \(f = \Omega(g)\) (where \(\Omega\) is the capital Greek letter “omega”). More precisely, we write \(f = \Omega(g)\) (read: “\(f\) is ‘big Omega’ of \(g\)”) if \(g = O(f)\). Unwrapping the definition of \(O\), we can express the condition \(f = \Omega(g)\) as follows.

Definition. We say that \(f = \Omega(g)\) if there exists \(C > 0\) and a natural number \(N\) such that for all \(n \geq N\), we have \(f(n) \geq C \cdot g(n)\).

Informally, \(f = \Omega(g)\) means that \(f\) grows at least as quickly as (some constant times) \(g\).

Finally, we might wish to indicate that \(f\) and \(g\) both have the same asymptotic growth. This was the case in our original example with \(f(n) = 10 n^3 + 7\) and \(g(n) = n^3\). In this case we write \(f = \Theta(g)\) (read: “\(f\) is Theta of \(g\)”).

Definition. We say that \(f = \Theta(g)\) if \(f = O(g)\) and \(f = \Omega(g)\).
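Combining the two definitions, \(f = \Theta(g)\) means that for large \(n\) the ratio \(f(n) / g(n)\) stays between two positive constants. The Python sketch below computes that ratio for the earlier example \(f(n) = 10 n^3 + 7\) and \(g(n) = n^3\). The helper name `ratio_bounds` is an assumption for this example, and only a finite range of \(n\) is examined.

```python
from fractions import Fraction


def ratio_bounds(f, g, N, limit):
    """Min and max of f(n) / g(n) over N <= n < limit, as exact fractions."""
    ratios = [Fraction(f(n), g(n)) for n in range(N, limit)]
    return min(ratios), max(ratios)


f = lambda n: 10 * n**3 + 7
g = lambda n: n**3

lo, hi = ratio_bounds(f, g, N=1, limit=1000)
print(lo > 10, hi)  # True 17
```

Here \(f(n) / g(n) = 10 + 7 / n^3\) for \(n \geq 1\), so \(10 \cdot g(n) \leq f(n) \leq 17 \cdot g(n)\) for all \(n \geq 1\), giving both \(f = \Omega(g)\) and \(f = O(g)\).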

Exercise. Show the following:

If \(f = \Theta(g)\), then \(g = \Theta(f)\).
If \(f = O(g)\) and \(g = O(f)\), then \(f = \Theta(g)\).
If \(f = \Omega(g)\) and \(g = \Omega(f)\), then \(f = \Theta(g)\).

Hint. Use the (equivalent) definition of \(\Omega\) that \(f = \Omega(g)\) if \(g = O(f)\).