# Pseudocode

a brief guide to pseudocode and conventions

Pseudocode is a way of expressing algorithms at a higher level of abstraction than code in a concrete programming language. Pseudocode is meant to be read and interpreted by humans rather than computers. Nonetheless, a pseudocode description of a procedure should be precise enough that it can be translated to a program in essentially any programming language by any programmer that is comfortable with that language.

“Pseudocode” is itself a somewhat vague term, as there is no single agreed-upon set of conventions. In *Algorithms* we will use an imperative style of pseudocode whose basic structure should be familiar to anyone who has programmed in an imperative-style programming language, such as Java, Python, C, or C++. While the structure of our pseudocode should be familiar to you, pseudocode will allow you to express procedures in a manner that is more concise and easier to read than programs written in these languages. Below, we describe the basic ingredients of imperative-style pseudocode.

##### Variables, assignment, and arrays

In pseudocode, you can store and manipulate values as variables. Any string of characters that is not a keyword can be interpreted as a variable. For example, `x` and `length` can be variable names. We will always start variable names with lower-case letters. To assign values to variables, we use the assignment operator \(\gets\), or in plain text `<-`. For example, the following code snippet assigns the string value `"hello"` to `x` and the numerical value `5` to `length`:

    x <- "hello"
    length <- 5

Note that we do not need to specify the datatype of the variable, as this can be inferred from the type of the literal value assigned to the variable.

We can also define arrays of values. Literal arrays will use square bracket notation `[...]`, and strings can be interpreted as arrays of characters. Unlike many standard programming languages, we will use the convention that **array indices start at 1**. That is, if `a` is an array, `a[1]` refers to the first element in the array, `a[2]` to the second, and so on.

    x <- "hello"
    a <- [2, 3, 5, 7, 11]
    first <- x[1]
    second <- a[2]

In this example, `first` stores the character value `'h'` and `second` stores the numerical value `3`.

We can also refer to a sub-array using square brackets with double dots (`..`).

    a <- [2, 3, 5, 7, 11]
    b <- a[2..4]

In this example, `b` stores the sub-array of `a` from indices `2` through `4`, namely `[3, 5, 7]`.
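
When these conventions are translated into a language with 0-based indexing, every index shifts by one. The following is a minimal Python sketch of that translation, assuming Python lists and strings stand in for the pseudocode arrays. The helper names `get` and `sub_array` are hypothetical, introduced only for illustration, and returning a copy from `sub_array` is an assumption, since this guide does not say whether a sub-array shares storage with the original.

```python
def get(a, i):
    """Return the element at 1-based index i, like the pseudocode a[i]."""
    return a[i - 1]


def sub_array(a, i, j):
    """Return the 1-based, inclusive sub-array a[i..j] as a new list (assumed copy)."""
    return list(a[i - 1:j])


x = "hello"
a = [2, 3, 5, 7, 11]

first = get(x, 1)         # 'h'
second = get(a, 2)        # 3
b = sub_array(a, 2, 4)    # [3, 5, 7]
```

Note that the upper bound `j` needs no adjustment: Python slices exclude their end position, which cancels the shift from 1-based to 0-based indexing.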

##### Arithmetic operators

In our pseudocode, we will use the standard arithmetic operators for numerical values:

- `+` addition
- `-` subtraction and the unary “minus” operator
- `*` multiplication
- `/` division
- `%` modulus
- `^` exponentiation

The operators `+`, `-`, `*`, and `^` have their usual interpretations for all numerical values. When the operands are integers, `/` is interpreted as integer division (e.g., `7 / 2` returns the value `3`). For fractional and decimal values, `/` denotes fractional division (e.g., `7.0 / 2` returns the value `3.5`). The modulus operator is only defined for integer values, and returns the remainder upon division (e.g., `7 % 2` returns `1`).

    # assume a, b are integer values
    q <- a / b # quotient
    r <- a % b # remainder
    c <- q * b + r # c stores value a

In the code above, we use `#` to indicate a comment (you may also use C/Java-style `//`). Note that, provided `b` is nonzero, it will always be the case that `c` stores the same value as `a`.

Expressions can also be parenthesized to specify the order of operations. Otherwise, the standard order of operations applies.
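
The operator conventions above do not map one-to-one onto every programming language. As a hedged illustration, here is how the same computations might look in Python, where integer division is written `//`, `/` is always fractional division, and exponentiation is written `**` (in Python, `^` is bitwise XOR). The variable names are chosen only for this sketch.

```python
a, b = 7, 2

q = a // b          # integer division (pseudocode: a / b on integers) -> 3
r = a % b           # remainder -> 1
c = q * b + r       # reconstructs a -> 7

frac = 7.0 / 2      # fractional division -> 3.5
power = 2 ** 10     # exponentiation (pseudocode: 2 ^ 10) -> 1024
grouped = (a + b) * 3   # parentheses override the usual precedence -> 27
```

One caveat: for negative operands, Python's `//` rounds toward negative infinity, whereas C and Java truncate toward zero. This guide does not say which convention the pseudocode follows for negative values, so the sketch sticks to non-negative operands.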

##### Logical operators

In addition to arithmetic operators, our pseudocode supports logical operators that return values that are `true` or `false`:

- `=` returns `true` if the values are equal (semantically equivalent)
- order operators `>`, `<`, `>=` (or \(\geq\)), `<=` (or \(\leq\))
- logical connectives `and`, `or`, `not`
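
For comparison, here is a small Python sketch of the same operators. The main difference to watch for is that equality is written `==` in Python (and in Java, C, and C++), because `=` is reserved for assignment. The values and variable names below are chosen only for illustration.

```python
a = 45
b = 3
c = 7

equal = (a == b * 15)                            # True: 3 * 15 is 45
divisible_by_both = (a % b == 0 and a % c == 0)  # False: 45 % 7 is 3
divisible_by_either = (a % b == 0 or a % c == 0) # True: 45 % 3 is 0
divisible_by_neither = not divisible_by_either   # False
ordered = (b <= c and c < a)                     # True: 3 <= 7 < 45
```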

**Exercise.** What is the value of `val` after the following code executes?

    a <- 48
    b <- 2
    c <- 3
    val <- (a % b = 0 and a % c = 0)

##### Control flow: branching and iteration

Now we describe the syntax and semantics for conditional execution and iteration (looping). For these structures, we specify code blocks both by indentation and “end” syntax. Conditional statements can be specified using the standard if/else if/else construction:

    x <- 100
    if x % 3 = 0 then
      x <- x / 3
    else if x % 3 = 1 then
      x <- x - 1
    else
      x <- x + 1
    endif

Note that we used both indentation and the `endif` statement to indicate a block of code. When writing code by hand, it is sometimes difficult to maintain consistent indentation (though graph paper can help with this). It is sometimes helpful to use vertical lines to indicate indentation as well, especially if nested statements are used.

    if some-condition then
    | do something
    | if another condition then
    | | do something else
    | | and another thing
    | else if yet another condition then
    | | do something wild
    | else
    | | whoa now, something went wrong
    | endif
    else if something completely different then
    | really, do this?
    endif

There are four different loop structures we use for iteration:

- `for`
- `foreach`
- `while`
- `do`-`while`

The syntax for these structures is a bit more flexible than in programming languages, and you should use whichever structure makes your pseudocode clearest. Here are four equivalent ways you could add the values of an array (we assume that there is a method `size(a)` that returns the size of the array):

- using a `for` loop

      # a is an array of numerical values
      sum <- 0
      for n = 1, 2,...,size(a) do
        sum <- sum + a[n]
      endfor

- using a `foreach` loop

      # a is an array of numerical values
      sum <- 0
      foreach x in a do
        sum <- sum + x
      endfor

- using a `while` loop

      # a is an array of numerical values
      sum <- 0
      n <- 1
      while n <= size(a) do
        sum <- sum + a[n]
        n <- n + 1
      endwhile

- using a `do`-`while` loop

      # a is a non-empty array of numerical values (the body runs at least once)
      sum <- 0
      n <- 1
      do
        sum <- sum + a[n]
        n <- n + 1
      while n <= size(a)
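
To see how the four loop structures carry over to a real language, here is a sketch in Python. Two points are assumptions of the translation rather than part of the pseudocode: positions are shifted by one because Python lists are 0-based, and Python has no `do`-`while` statement, so the last version emulates it with `while True` and a `break`. The variable names are illustrative.

```python
a = [2, 3, 5, 7, 11]

# for loop over the 1-based positions n = 1, 2, ..., size(a)
total_for = 0
for n in range(1, len(a) + 1):
    total_for += a[n - 1]

# foreach loop over the values themselves
total_foreach = 0
for x in a:
    total_foreach += x

# while loop with an explicit counter
total_while = 0
n = 1
while n <= len(a):
    total_while += a[n - 1]
    n += 1

# do-while emulation: the body runs once before the condition is tested,
# so this version requires a non-empty list
total_do = 0
n = 1
while True:
    total_do += a[n - 1]
    n += 1
    if not (n <= len(a)):
        break
```

All four totals equal 28 for this input. On an empty list the first three produce 0, while the `do`-`while` emulation raises an `IndexError`, which is why that version assumes at least one element.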

##### Methods and subroutines

Finally, we describe the syntax for method calls and subroutines. We will typically name methods and subroutines using `CamelCase`, i.e., the first letters of words in method names are capitalized. To declare a method, you can simply define its name followed by any input parameters in parentheses, followed by a colon. The body of the method should be indented. If a method returns a value, use a `return` statement. (As in most popular programming languages, a `return` statement halts the execution of the method and immediately returns the corresponding value.) For example, here is a method that sums the contents of an array:

    # Input: a, an array of numerical values
    Sum(a):
      sum <- 0
      foreach x in a do
        sum <- sum + x
      endfor
      return sum

Putting everything together, we can describe an implementation of an algorithm called `BubbleSort` that sorts an array of numerical values.

    # input: a, an array of numerical values
    BubbleSort(a):
      for i = 1, 2,...,size(a)-1 do
        for j = 1, 2,...,size(a)-i do
          if a[j] > a[j+1] then
            x <- a[j+1]
            a[j+1] <- a[j]
            a[j] <- x
          endif
        endfor
      endfor

Notice that the method does not return a value. Instead, `BubbleSort` modifies the array `a` passed into it. This is because we assume that *references* to arrays and data structures are passed in as arguments. Thus, for example, after executing

    a <- [5, 2, 7, 3]
    BubbleSort(a)

the variable `a` would store the value `[2, 3, 5, 7]`.
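
Python lists behave the same way when passed to a function: the function receives a reference to the caller's list, so in-place changes are visible afterwards. The sketch below is one possible line-by-line translation of `BubbleSort`, assuming a Python list as the array and shifting every index by one. The names `bubble_sort` and `alias` are illustrative.

```python
def bubble_sort(a):
    """Sort the list a in place in ascending order; nothing is returned."""
    size = len(a)
    for i in range(1, size):              # i = 1, 2, ..., size(a)-1
        for j in range(1, size - i + 1):  # j = 1, 2, ..., size(a)-i
            if a[j - 1] > a[j]:           # pseudocode: a[j] > a[j+1]
                x = a[j]                  # pseudocode: x <- a[j+1]
                a[j] = a[j - 1]
                a[j - 1] = x


a = [5, 2, 7, 3]
alias = a            # a second name for the same list object
bubble_sort(a)       # a is now [2, 3, 5, 7], and so is alias
```

Because `alias` and `a` name the same list, both show the sorted contents after the call, even though `bubble_sort` returns nothing.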

**Exercise.** Execute `BubbleSort(a)` by hand on the array `a = [3, 2, 5, 1, 4]`. Can you explain why the algorithm successfully sorts this (or any other) array?

##### Conveniences

The original version of `BubbleSort` is fine as written, but we might want to simplify our presentation a bit. In particular, lines 6–8 are simple enough, but they are not especially descriptive. It may be more readable to simply replace these three lines with a single instruction, `swap(a, j, j+1)`, since the effect of the block is to swap the values of `a` stored at indices `j` and `j+1`. This gives:

    # input: a, an array of numerical values
    BubbleSort(a):
      for i = 1, 2,...,size(a)-1 do
        for j = 1, 2,...,size(a)-i do
          if a[j] > a[j+1] then
            swap(a, j, j+1) # swap values at indices j and j+1
          endif
        endfor
      endfor

If it is not “obvious” how to swap two values, we could explicitly define the subroutine `swap` with pseudocode:

    # input: a, an array; i, j indices of a
    swap(a, i, j):
      x <- a[j]
      a[j] <- a[i]
      a[i] <- x
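
Some languages make the temporary variable unnecessary. As an illustration, a Python version of `swap` might use tuple assignment, which evaluates the right-hand side in full before storing anything. The sketch assumes 1-based indices `i` and `j`, as in the pseudocode, and converts them to Python's 0-based positions.

```python
def swap(a, i, j):
    """Swap the values of a stored at 1-based indices i and j, in place."""
    a[i - 1], a[j - 1] = a[j - 1], a[i - 1]


a = [2, 3, 5, 7, 11]
swap(a, 2, 4)    # a is now [2, 7, 5, 3, 11]
```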

Finally, in analyzing the `BubbleSort` algorithm, it may be helpful to separate out the inner loop in the pseudocode. For example, we might define

    # input: a, an array of numerical values
    BubbleSort(a):
      for i = 1, 2,...,size(a)-1 do
        Bubble(a, size(a)+1-i)
      endfor

    Bubble(a, i):
      for j = 1, 2,...,i-1 do
        if a[j] > a[j+1] then
          swap(a, j, j+1)
        endif
      endfor

As a *program*, I prefer the implementation of `BubbleSort` without the `Bubble` subroutine. However, for the purposes of *analyzing* `BubbleSort`, being able to refer to the `Bubble` subroutine is helpful. Of course, the two implementations are equivalent, and preferences between the two may be a matter of context and/or taste.
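
One way to convince yourself that the two implementations agree is to run the decomposed version. The Python sketch below mirrors the `BubbleSort`/`Bubble` split, using the ascending comparison `a[j] > a[j+1]` from the first version of `BubbleSort`. It assumes a Python list as the array, and the function names are illustrative.

```python
def bubble(a, i):
    """One pass over 1-based positions 1..i, swapping adjacent out-of-order pairs."""
    for j in range(1, i):                      # j = 1, 2, ..., i-1
        if a[j - 1] > a[j]:                    # pseudocode: a[j] > a[j+1]
            a[j - 1], a[j] = a[j], a[j - 1]    # pseudocode: swap(a, j, j+1)


def bubble_sort(a):
    """Sort a in place by calling bubble on progressively shorter prefixes."""
    size = len(a)
    for i in range(1, size):                   # i = 1, 2, ..., size(a)-1
        bubble(a, size + 1 - i)


a = [3, 2, 5, 1, 4]
bubble_sort(a)       # a is now [1, 2, 3, 4, 5]
```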

**Exercise.** What can you say about the value `a[i]` after a call to `Bubble(a, i)`? How does this help explain why `BubbleSort` successfully sorts an array?

Finally, note that none of our pseudocode has (or even supports syntax for) exception and error handling. This is by design. One of the great conveniences of pseudocode is that we don’t need to worry about erroneous inputs: we always assume (or ensure) that inputs are well formed, e.g., that indices are in bounds. When our procedure does not need to account for improper inputs, we can write much more succinct descriptions. Of course, an implementation in an actual programming language should include such safeguards to avoid undesirable behavior!
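
To make the contrast concrete, here is a sketch of what such safeguards might look like for the `swap` subroutine in Python. The name `checked_swap` and the choice of exceptions are assumptions made for this illustration, not part of the pseudocode conventions.

```python
def checked_swap(a, i, j):
    """Swap a at 1-based indices i and j, rejecting invalid indices."""
    n = len(a)
    for index in (i, j):
        if not isinstance(index, int):
            raise TypeError(f"index must be an integer, got {index!r}")
        if not (1 <= index <= n):
            raise IndexError(f"index {index} is outside the range 1..{n}")
    a[i - 1], a[j - 1] = a[j - 1], a[i - 1]


a = [2, 3, 5]
checked_swap(a, 1, 3)    # a is now [5, 3, 2]
```

The explicit range check matters here because Python accepts negative positions: without it, a call with index `0` would be translated to position `-1` and silently swap with the last element instead of failing.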