Recursion, Stacks, and Induction

How to reason about recursively defined methods

In this note, we consider recursively defined methods. Our goals are twofold. First, we will describe at a conceptual level how a computer executes method calls using the call stack. By understanding how the call stack works, one can (in principle) simulate an execution of a program by hand, thereby gaining intuition about how a program performs a potentially complicated task. Second, we will introduce mathematical induction—a logical tool that allows us to reason more formally about recursively defined methods. Specifically, by using induction, we can often prove that a given method produces its desired output.

Throughout this note, we will refer to the following method:

1  int f(int n) {
2      if (n <= 1) {
3          return 1;
4      }
5      int val = f(n - 1);
6      return (2 * n - 1) + val;
7  }

While the method’s definition is concise, it is not readily apparent what the method does. Specifically, the method f is recursive: what f(n) returns depends on the value returned by f(n-1). It is clear that if n <= 1, then f(n) will return 1. But what is f(7)? Or f(279)?

As a first step towards understanding f, we could write a program that prints f(n) for a few small values of n. Here is the sample output of such a program:

f(1) = 1
f(2) = 4
f(3) = 9
f(4) = 16
f(5) = 25

Perhaps by now we’ve noticed a pattern. For example, f(2) = 4 = 2 * 2, f(3) = 9 = 3 * 3, f(4) = 16 = 4 * 4, and f(5) = 25 = 5 * 5. So far, it seems that if the input n is a positive int, then f(n) returns n * n. We can easily ask our program to print a few more values to see if they continue the pattern:

f(6) = 36
f(7) = 49
f(8) = 64
f(9) = 81
f(10) = 100

Indeed, the pattern continues! But are we convinced? After seeing the first 10 values of f(n), how certain should we be that we know what f does?
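For reference, the listings above could be produced by a small driver like the following. This is a minimal sketch: the class name Demo and the loop bound are illustrative, and f is assumed to be declared static so that main can call it directly.

public class Demo {
    // The method f from above, declared static so main can call it.
    static int f(int n) {
        if (n <= 1) {
            return 1;
        }
        int val = f(n - 1);
        return (2 * n - 1) + val;
    }

    public static void main(String[] args) {
        // Print f(n) for the first few positive values of n.
        for (int n = 1; n <= 10; n++) {
            System.out.println("f(" + n + ") = " + f(n));
        }
    }
}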

The sections below address two fundamental questions concerning (recursive) methods.

  1. How does your computer execute (recursive) method calls?
  2. How can we be certain that a (recursively defined) method always produces a proposed output?

The Call Stack

Whenever a Java program is executed, the execution maintains a data structure known as a call stack (sometimes referred to simply as “the stack”). As its name implies, the call stack is a realization of the stack ADT. The elements stored in the call stack are referred to as frames. At any point in the execution, each frame in the call stack corresponds to a method call that has been made, but has not yet completed (i.e., returned). The frame at the top of the call stack corresponds to the currently active method—i.e., the most recently made method call that has not yet returned.

For our purposes, we can think of each frame as storing the following information:

  1. The name and parameter values of the method being called.
  2. Local variables used in the method’s definition, as well as their values.
  3. The address of where to write the return value of the method (if any).
  4. The address of the caller’s frame (i.e., the frame of the method that made the current method call, thereby creating its frame).

Note that in accordance with item 4, we can view the call stack as a linked list of frames, where each frame stores a reference to the frame below it in the call stack (i.e., the frame of the method from which the current method was called). Each new method call pushes a new frame onto the call stack.
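To make this picture concrete, here is a toy model of a frame as a node in such a linked list. The sketch is purely illustrative: the class and field names are made up, and this is not how the JVM actually represents frames.

// A simplified, illustrative model of a call-stack frame.
// (A real frame also records where to write the method's return value.)
class Frame {
    String call;                             // method name and parameter values, e.g., "foo(10)"
    java.util.Map<String, Object> locals;    // local variables and their current values
    Frame caller;                            // the frame below this one (the caller's frame)

    Frame(String call, Frame caller) {
        this.call = call;
        this.caller = caller;
        this.locals = new java.util.HashMap<>();
    }
}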

Example. Consider the following program:

 1  public static void main(String[] args) {
 2      int n = foo(10);
 3  }
 4
 5  public static int foo(int n) {
 6      int m = bar(2 * n);
 7      return m;
 8  }
 9
10  public static int bar(int n) {
11      int m = n + 1;
12      return m;
13  }

The program defines three methods: main(String[]), foo(int), and bar(int). Here is a line-by-line description of an execution and the corresponding state of the call stack:

  1. When we execute the program without command line parameters, we effectively call main(null). The execution begins with the main method, whose definition begins at line 1. At this point, a frame corresponding to main(null) is pushed onto the call stack, so its contents are:
    • call_stack = [main(null)]
  2. Following the instructions specified in the main method, we execute line 2, which says, “call foo(10) and write the returned value to the variable n.” The call to foo(10) creates a new frame corresponding to the call, which is pushed to the top of the stack:
    • call_stack = [main(null), foo(10)]
  3. Next, we begin execution of foo(10). The first substantive line, line 6, creates a new local variable m, and assigns it the value bar(2 * n), where n is the parameter passed to foo. Thus, in order to complete the execution of line 6, another method call is made, this time to bar(20). This method call creates another frame which is then pushed to the call stack:
    • call_stack = [main(null), foo(10), bar(20)]
  4. The call to bar(20) brings us to line 11. Here, a local variable m is created, and its value is set to 21 (= n + 1). With this statement executed, the execution continues to line 12. Here, the value of m (21) is returned. Specifically, the value 21 is written to the address specified by the calling method (the variable m for foo(10)). Once the value has been written, the call to bar(20) is complete, so it can be popped off the call stack. The execution then picks up where it left off in the previous method call: line 6 in foo(10):
    • call_stack = [main(null), foo(10)]
  5. Since the call to bar(20) returned the value 21, the execution finishes line 6, and m gets the value 21. The execution proceeds to line 7. This line returns the value of m (21) by writing it to the location specified by the caller, main(null)—the variable n declared in line 2. Once this value is written, the call to foo(10) is complete, so its corresponding frame is popped off the call stack. The execution returns to line 2.
    • call_stack = [main(null)]
  6. Finally, with the value returned by foo(10) (21) written to the local variable n, the execution of line 2 is complete, and the program terminates. \(\Box\)

As demonstrated in the example above, the call stack keeps track of the currently active method calls. The frame associated with each method call maintains a reference to the caller’s frame, giving rise to a linked list of frames representing a stack. Thus, the state of the call stack represents the history of active method calls. Importantly, each individual method call creates a new frame with its own set of local variables. In particular, multiple calls to the same method each maintain their own local variables. Since these method calls correspond to different frames, the computer is able to distinguish the values of variables for multiple calls to the same method.
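In fact, a Java program can inspect its own call stack at run time. The sketch below (the class name StackDemo is illustrative) prints the chain of active method calls from inside bar; its output roughly mirrors the frames described above, although the printed trace also includes some internal frames.

public class StackDemo {
    public static void main(String[] args) {
        System.out.println("foo returned " + foo(10));
    }

    static int foo(int n) {
        return bar(2 * n);
    }

    static int bar(int n) {
        // Print the currently active method calls, most recent first.
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            System.out.println(frame);
        }
        return n + 1;
    }
}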

Now let us return to the method f(int) we introduced before:

1  int f(int n) {
2      if (n <= 1) {
3          return 1;
4      }
5      int val = f(n - 1);
6      return (2 * n - 1) + val;
7  }

Even if we don’t have a computer handy, we can now simulate an execution of a call to, say, f(3). Suppose main(null) contains the statement int m = f(3). We can imitate the execution as follows:

  1. main(null) calls f(3). f(3) gets pushed onto the call stack.
    • call_stack = [main(null), f(3)]
  2. In the call to f(3), the local variable n gets the value 3. The condition at line 2 is not satisfied, so the execution jumps to line 5. This statement makes a call to f(n - 1) = f(2), so f(2) gets pushed to the call stack.
    • call_stack = [main(null), f(3), f(2)]
  3. In the call to f(2), the local variable n gets the value 2. Again, the condition at line 2 is not satisfied, so the execution jumps to line 5. This statement calls f(n - 1) = f(1), so f(1) gets pushed to the call stack.
    • call_stack = [main(null), f(3), f(2), f(1)]
  4. In the call to f(1), the local variable n gets the value 1. The condition at line 2 is now satisfied, so this method call returns the value 1. The frame corresponding to f(1) is popped off the call stack and the value 1 is returned. That is, the value 1 is written to the variable val in line 5 of the caller’s frame, which is now at the top of the call stack (f(2)).
    • call_stack = [main(null), f(3), f(2)]
  5. The execution continues in the frame for f(2). After line 5 completes, val stores the value 1 (returned by f(1)). The execution proceeds to line 6, which returns the value (2 * n - 1) + val. Since, in the frame for f(2), n has the value 2 and val has the value 1, the returned value is (2 * 2 - 1) + 1 = 4. As before, the return value is written to the variable val in the calling frame, f(3), and the frame for f(2) is popped off the stack.
    • call_stack = [main(null), f(3)]
  6. The execution continues in the frame for f(3). After line 5 completes, val stores the value 4 (returned by f(2)). In this frame, n has the value 3, so the return value is (2 * n - 1) + val = (2 * 3 - 1) + 4 = 9. This value is written to the variable m in main(null), and the f(3) frame is popped off the call stack.
    • call_stack = [main(null)]

While performing the steps above explicitly is somewhat tedious, it can be helpful to do a few such computations by hand in order to gain intuition about precisely how a piece of code is producing its output.
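Rather than tracing by hand, we can also instrument f with print statements and watch the calls begin and return. The following sketch is illustrative: the extra depth parameter and the indentation are additions for readability, not part of the original method, and String.repeat requires Java 11 or later.

static int fTraced(int n, int depth) {
    String indent = "  ".repeat(depth);               // indent by recursion depth
    System.out.println(indent + "f(" + n + ") called");
    if (n <= 1) {
        System.out.println(indent + "f(" + n + ") returns 1");
        return 1;
    }
    int val = fTraced(n - 1, depth + 1);
    int result = (2 * n - 1) + val;
    System.out.println(indent + "f(" + n + ") returns " + result);
    return result;
}

Calling fTraced(3, 0) prints a trace whose calls and returns correspond to the pushes and pops in the six steps above.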

Induction

The discussion in the previous section gives an indication of how a computer executes a program. Once we observe a pattern in the execution, however, we would like to have some way of knowing that the pattern continues. In the case of our running example, f, we observed that the first few values f returned were perfect squares. This observation might lead us to conjecture that f(n) always returns n * n—at least as long as n * n is smaller than the maximum value that can be stored in an int. But how can we know that f(100) will return 100 * 100 = 10_000 without actually computing f(100) (which, in turn, requires that we compute f(99), f(98), f(97),…)?

What we would like is a proof that “for all values of n, f(n) returns n * n.” This statement is potentially problematic to prove, as it seems to entail actually proving many statements:

  • f(1) returns 1
  • f(2) returns 4
  • f(3) returns 9

We verified the first few of these statements empirically, but we cannot hope to verify all of them directly in a reasonable amount of time!

The logical tool employed to prove a (possibly infinite) sequence of statements such as those above is called mathematical induction. A little more formally, suppose we have a sequence of statements—referred to as predicates—that we would like to prove: \(P(1), P(2), P(3), \ldots\). In our example, the predicate \(P(n)\) stands for the statement “f(n) returns n * n”. Mathematical induction is a technique that allows us to prove that all of the predicates \(P(n)\) are true for every value of \(n\).

A proof by induction consists of two parts: a base case, and an inductive step:

  1. The base case is to establish that the predicate \(P(1)\) (or sometimes \(P(k)\) for some other small value of \(k\)) is true.

  2. The inductive step is to establish that whenever a predicate \(P(n-1)\) is true, then \(P(n)\) is also true.
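In symbols, the principle of mathematical induction can be stated as follows:

\[\Bigl(P(1) \;\wedge\; \forall n \ge 2,\ P(n-1) \Rightarrow P(n)\Bigr) \;\Longrightarrow\; \forall n \ge 1,\ P(n).\]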

In the inductive step, the supposition that “\(P(n-1)\) is true” is known as the inductive hypothesis. To understand why establishing (1) and (2) above is enough to prove that \(P(n)\) is true for all \(n\), consider again our running example. If we are able to prove that (1) f(1) returns 1, and (2) if f(n-1) returns (n-1) * (n-1), then f(n) returns n * n, then we can reason as follows:

  • By (1), f(1) returns 1.
  • Since f(1) = 1, applying (2) with n = 2 gives f(2) = 4
  • Since f(2) = 4, applying (2) with n = 3 gives f(3) = 9
  • Since f(3) = 9, applying (2) with n = 4 gives f(4) = 16, and so on.

Thus, we only have to prove (1) and (2) above in order to establish that the method f always obeys the pattern we saw. So let’s prove it!

Proposition. Consider the method f defined here:

1  int f(int n) {
2      if (n <= 1) {
3          return 1;
4      }
5      int val = f(n - 1);
6      return (2 * n - 1) + val;
7  }

Then for all values n >= 1, f(n) returns n * n.

Proof. We argue by induction on n.

  1. Base case. For the base case, we must establish that f(1) returns 1 * 1 = 1. This follows immediately from lines 2 and 3 of the method definition.

  2. Inductive step. Here, we must establish that if f(n-1) returns (n-1) * (n-1), then f(n) returns n * n. To this end, suppose f(n-1) returns (n-1) * (n-1), and consider a call to f(n) with n >= 2. Since n >= 2, the condition in line 2 is not satisfied, so the execution reaches line 5, where val gets the value (n-1) * (n-1) by the inductive hypothesis. The value returned in line 6 is then (2 * n - 1) + val = (2 * n - 1) + (n-1) * (n-1). Performing some algebra:

    \[\begin{align*} (2 n - 1) + (n - 1)(n - 1) &= 2 n - 1 + (n^2 - 2n + 1) = n^2. \end{align*}\]

    Thus, assuming that f(n-1) returns (n-1) * (n-1), the return value of f(n) is n * n.

Since we established the base case and the inductive step, the proposition holds by induction. \(\Box\)

The proof above follows a standard pattern of inductive argument used to establish the correctness of procedures and methods in computer science. By appealing to mathematical induction, we can reason about every possible execution of a program without having to actually perform every possible execution.

Mathematical induction lends itself naturally to reasoning about recursively defined methods. Typically, recursively defined methods have the following structure:

public T someMethod(...) {
    // check some conditions;
    // if the conditions are met, return (a value)

    // if the conditions are not met,
    // make a recursive method call,
    // then return (a value)
}

Indeed, our method f was precisely of this form: under some condition (n <= 1), return a value (lines 2–4). Otherwise, make a recursive method call and return some value (lines 5 and 6). This pattern of code mirrors the pattern of a proof by induction: the first part of the method definition is used to establish the base case, while the second part is used to establish the inductive step. In this way, a well-organized method definition often gives a clue as to how to establish the correctness of the method.
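As a further illustration of the same pattern (this example is not part of the discussion above), the following method computes 1 + 2 + ... + n for n >= 1, and its correctness can be proved by induction in exactly the same way as for f:

// Returns 1 + 2 + ... + n for n >= 1.
int sum(int n) {
    // base case: the condition is met, so return a value
    if (n <= 1) {
        return 1;
    }
    // recursive case: make a recursive call, then return a value
    return n + sum(n - 1);
}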

Proof and Correctness: A Warning

Whenever we use pure logic to analyze a computer program, we must be careful that we are not making unfounded assumptions about how the program is executed. While computers typically perform basic arithmetic and logical operations faithfully, computer operations are limited by their implementations in hardware. For example, we think of the datatype int as representing an integer value. However, ints are bounded: there are maximum and minimum values that can be stored as ints. Basic arithmetic operations will not give the “expected” results if they would cause a value to exceed these bounds. For example, consider the following code:

1  int a = Integer.MAX_VALUE;
2  int b = a + 1;

Line 1 sets a to be the largest value that can be stored as an int (2147483647). Mathematically, we should expect that after setting int b = a + 1, the value of b is one more than the value of a. But this is not the case! After line 2 is executed, b stores the value -2147483648: the result wraps around to the smallest value representable as an int. This example illustrates that there is a disconnect between the mathematical abstraction of integers—which comprise an infinite set of elements—and the concrete realization of integers as ints in our computer, which are necessarily bounded.
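If silent wrap-around is undesirable, the standard library provides checked alternatives; for instance, Math.addExact (available since Java 8) throws an ArithmeticException on overflow instead of wrapping. A minimal sketch:

int a = Integer.MAX_VALUE;
int b = a + 1;                 // wraps around silently; b is Integer.MIN_VALUE
int c = Math.addExact(a, 1);   // throws ArithmeticException at run time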

Let us now return to our proof that the method f(n) returns n * n whenever n satisfies n >= 1. Our argument is logically sound, assuming that the computer faithfully performs all logical and arithmetic operations. Yet this assumption is not always valid. For example, if n * n would represent a value larger than Integer.MAX_VALUE, then we cannot expect that the value returned by f(n) is actually n * n. Instead, our proof establishes the following: so long as the logical and arithmetic operations in our program are faithfully performed in accordance with their mathematical definitions, the value returned by f(n) will be n * n.

This discussion is not meant to invalidate our analysis of the method f. Rather, the intent is to illustrate that we must be careful in considering underlying assumptions implicit in our purely logical arguments. Mathematical arguments apply to computer programs only insofar as the computer program faithfully performs arithmetic/logical operations in accordance with their formal specifications. Yet without formal arguments about correctness and performance, we cannot convincingly reason about programs that have not been exhaustively tested. Thus, computer scientists walk a tightrope between mathematical abstraction and concrete implementation.