# Simple Set ADTs

Introducing Set ADTs and Interfaces

## Mathematical Sets and Notation

A *set* represents a collection of distinct items, referred to as *elements*. We can explicitly define a set by listing its elements surrounded by curly braces. For example,

\[S = \{\texttt{"red"}, \texttt{"blue"}, \texttt{"yellow"}\}\]

is the set of (names of) primary colors. Sets do not depend on the order in which their elements are written. For example,

\[\{\texttt{"red"}, \texttt{"blue"}, \texttt{"yellow"}\} = \{\texttt{"yellow"}, \texttt{"blue"}, \texttt{"red"}\} = \cdots.\]

Further, sets do not contain duplicate elements; each element in a set is unique.

We denote the relation “is an element of” (or “contains”) using the symbol \(\in\). The negation, “is not an element of” (or “does not contain”), is denoted \(\notin\). Thus, using \(S = \{\texttt{"red"}, \texttt{"blue"}, \texttt{"yellow"}\}\) as before, we have \(\texttt{"blue"} \in S\) (read “*blue* is in \(S\)”) and \(\texttt{"green"} \notin S\) (read “*green* is not in \(S\)”).

##### Semantic Equivalence and Containment

In Java (as in many programming languages), there is a distinction between the *strict* equality of objects (specified by `==`) and *semantic equivalence* (specified by the `equals` method; see the documentation for `Object.equals`). Given two variables `var1` and `var2` referring to instances of some class `T`, the expression `var1 == var2` evaluates to `true` if and only if `var1` and `var2` refer to the same object instance. On the other hand, `var1.equals(var2)` returns `true` when the two instances are “semantically equivalent” in that they represent the same value in a sense defined for the class `T`.

To make things more concrete, consider the following declarations:

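```java
// Each use of `new` creates a separate Integer object representing the value 2.
// (The Integer(int) constructor is deprecated since Java 9; it appears here
// only to force two distinct instances.)
Integer var1 = new Integer(2);
Integer var2 = new Integer(2);
```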

The variables `var1` and `var2` refer to different `Integer` instances, as the code above creates two `Integer`s using the keyword `new`. Thus, the expression `var1 == var2` evaluates to `false`: the two variables do not refer to the same object instance. On the other hand, `var1` and `var2` are semantically equivalent in the sense that the values represented by the two `Integer` instances are both \(2\). Therefore, we would (correctly) expect `var1.equals(var2)` to return `true`.

Going forward, whenever we discuss containment of sets, we will interpret \(x \in S\) to mean “\(S\) contains an element semantically equivalent to \(x\).” Continuing the `Integer` example above, if we added `var1` to a set \(S\), then asked if `var2` is contained in \(S\), we would expect the result to be “yes”: \(S\) does indeed contain an `Integer` whose value is \(2\).
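To see this convention in running code, the sketch below uses Java's built-in `java.util.HashSet` rather than the ADTs defined in these notes (that substitution is an assumption made purely for illustration). The class and method names `ContainmentDemo` and `containsEquivalent` are hypothetical. It uses `String` values from the colors example: two separately constructed strings are distinct objects, yet a set holding one reports that it contains the other.

```java
import java.util.HashSet;
import java.util.Set;

class ContainmentDemo {
    // Builds a set holding only `stored`, then asks whether it contains `query`.
    static boolean containsEquivalent(String stored, String query) {
        Set<String> s = new HashSet<>();
        s.add(stored);
        return s.contains(query);   // decided by equals (and hashCode), not by ==
    }

    public static void main(String[] args) {
        String a = new String("red");
        String b = new String("red");   // a second, distinct object

        System.out.println(a == b);                    // false: different instances
        System.out.println(a.equals(b));               // true: same characters
        System.out.println(containsEquivalent(a, b));  // true: containment uses equals
    }
}
```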

**Implementation note.** When defining a new datatype (i.e., `class`) in Java, the new class inherits the `equals()` method from the `Object` class. The default behavior of the `equals()` method is equivalent to `==`. In order to define an appropriate notion of semantic equivalence for a new class, you can override the default `equals` method as follows:

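```java
class MyClass {
    // Must be declared public: an override cannot be less accessible
    // than Object.equals, which is public.
    @Override
    public boolean equals(Object o) {
        ...
    }
}
```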

Note that the argument to the `equals` method is an `Object`, not a `MyClass`. You can use the `instanceof` operator to first check that the argument is a `MyClass` and not an instance of some other class:

```java
// this is not equal to o if o is not an instance of MyClass
if (!(o instanceof MyClass)) {
    return false;
}
```
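Putting the pieces together, here is a minimal sketch of a complete `equals` override. The class `Point` and its fields are hypothetical, chosen only for illustration. The sketch also overrides `hashCode`, because Java's general contract requires that objects which are equal under `equals` report the same hash code.

```java
import java.util.Objects;

// Hypothetical example class: a point in the plane with integer coordinates.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;             // same instance, so trivially equivalent
        }
        if (!(o instanceof Point)) {
            return false;            // covers null and instances of other classes
        }
        Point other = (Point) o;     // the cast is safe after the instanceof check
        return x == other.x && y == other.y;
    }

    // Equal points must produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

With this definition, `new Point(1, 2).equals(new Point(1, 2))` returns `true` even though the two operands are different instances.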

## Simple Set ADTs

In order to define objects representing sets, we describe two simple abstract data types (ADTs) specifying some desired functionality. It is possible to define many more operations on sets, but for now, we consider only very basic operations. The first ADT, `SimpleUSet` (simple unordered set), makes no assumptions about the types of elements being stored in a set. Basic functionality is provided to (1) test whether an element is contained in a set, (2) add an element to the set (if not already present), and (3) remove an element from the set (if present). Additionally, a `SimpleUSet` can report its size (i.e., the number of elements it contains), and whether or not it is empty.

The second ADT, `SimpleSSet` (simple sorted set), assumes that elements stored in the set can be ordered. That is, for any pair of distinct elements \(x, y \in S\), we have \(x < y\) or \(y < x\). We will see that this additional assumption on the nature of elements stored in a `SimpleSSet` can make some implementations of the ADT more efficient. Further, `SimpleSSet` provides additional functionality allowing one to access the smallest and largest elements (according to \(<\)).

##### Unsorted Sets

Here we formally describe the operations and effects of the `SimpleUSet` ADT. The state \(S\) of a `SimpleUSet` is the set of elements it contains: \(S = \{x_1, x_2, \ldots, x_n\}\). Below is a specification of the `SimpleUSet` operations:

- \(\mathrm{size}()\):
  - Return the number of elements (\(n\)) contained in the set.
- \(\mathrm{isEmpty}()\):
  - Return `true` if \(\mathrm{size}()\) is \(0\) and `false` otherwise.
- \(\mathrm{find}(y)\):
  - If \(y \in S\), then return \(x_i \in S\) satisfying \(x_i = y\) (where we use \(=\) to denote semantic equivalence as discussed above); otherwise return `null`.
- \(\mathrm{add}(y)\):
  - If \(y \in S\) (i.e., there is an element \(x_i\) in \(S\) that is semantically equivalent to \(y\)), then return `false`. Otherwise, update the state of \(S\) to \(\{x_1, x_2, \ldots, x_n, y\}\) and return `true`.
- \(\mathrm{remove}(y)\):
  - If \(y \in S\), say \(y = x_i\), then return \(x_i\) and update the state to \(\{x_1, x_2, \ldots, x_{i-1}, x_{i+1}, \ldots, x_{n}\}\). If \(y \notin S\), then return `null`.
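The specification above could be written as a Java interface roughly as follows. The method signatures are one plausible reading of the ADT, not necessarily the interface used elsewhere in these notes. To make the sketch runnable, it includes a demonstration-only implementation, `MapBackedUSet` (a hypothetical name), that delegates to `java.util.HashMap`. It is deliberately not list-based, so it does not answer the exercise that follows, and it assumes elements are non-null with `hashCode` consistent with `equals`.

```java
import java.util.HashMap;
import java.util.Map;

// One possible Java rendering of the SimpleUSet operations (signatures assumed).
interface SimpleUSet<E> {
    int size();
    boolean isEmpty();
    E find(E y);        // the stored element equivalent to y, or null
    boolean add(E y);   // false if an equivalent element is already present
    E remove(E y);      // the removed element, or null if none was present
}

// Demonstration only: each element is stored as both key and value, so find
// and remove can return the stored element x_i rather than the query y.
class MapBackedUSet<E> implements SimpleUSet<E> {
    private final Map<E, E> elements = new HashMap<>();

    @Override
    public int size() {
        return elements.size();
    }

    @Override
    public boolean isEmpty() {
        return size() == 0;
    }

    @Override
    public E find(E y) {
        return elements.get(y);       // null when no equivalent key exists
    }

    @Override
    public boolean add(E y) {
        if (elements.containsKey(y)) {
            return false;             // y is already in S; state unchanged
        }
        elements.put(y, y);
        return true;
    }

    @Override
    public E remove(E y) {
        return elements.remove(y);    // null when no equivalent key exists
    }
}
```

Note that `find` and `remove` return the element that was stored, which may be a different object from the argument even though the two are semantically equivalent.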

**Exercise 1.** While sets are distinct from lists, it is possible to implement the functionality of a set using a list. How could you represent a `SimpleUSet` as a `SimpleList`? How would you implement the operations \(\mathrm{find}\), \(\mathrm{add}\), and \(\mathrm{remove}\) using the list operations?

##### Sorted Sets

We now consider sets of elements that can be compared according to a “natural order” on the elements. That is, there is a notion of “less than” (formally a *binary relation*), denoted \(<\), such that for any two elements \(x, y\), precisely one of the following holds:

- \(x < y\),
- \(y < x\),
- \(x = y\) (semantic equivalence).

We also assume transitivity of \(<\): if \(x < y\) and \(y < z\), then \(x < z\). Given a set \(S\) of comparable elements, we can write \(S = \{x_1, x_2, \ldots, x_n\}\) where \(x_1 < x_2 < \cdots < x_n\).

The `SimpleSSet` ADT (simple sorted set) extends the `SimpleUSet` ADT, under the assumption that all elements in a `SimpleSSet` can be compared according to some ordering \(<\). The \(\mathrm{size}()\), \(\mathrm{isEmpty}()\), and \(\mathrm{remove}()\) methods are precisely the same as in `SimpleUSet`. The \(\mathrm{add}(y)\) method has the same effect in `SimpleSSet` as in `SimpleUSet` as well, except the element \(y\) is added “in sorted order” (if \(y \notin S\) before the operation). The \(\mathrm{find}(y)\) method in `SimpleSSet` differs from the specification for `SimpleUSet`. `SimpleSSet` also has two additional methods, \(\mathrm{findMin}()\) and \(\mathrm{findMax}()\), specified below. Again, the state of a `SimpleSSet` is an ordered set \(S = \{x_1, x_2, \ldots, x_n\}\) where we have \(x_1 < x_2 < \cdots < x_n\).

- \(\mathrm{find}(y)\):
  - If \(S\) is empty or \(x_n < y\), then return `null`. Otherwise return the smallest \(x_i\) satisfying \(y \leq x_i\) (i.e., \(y = x_i\) or \(y < x_i\)).
- \(\mathrm{findMin}()\):
  - Return \(x_1\), or `null` if the set is empty.
- \(\mathrm{findMax}()\):
  - Return \(x_n\), or `null` if the set is empty.
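A small worked example may help with the sorted \(\mathrm{find}\). With \(S = \{2, 5, 9\}\), the specification gives \(\mathrm{find}(5) = 5\), \(\mathrm{find}(6) = 9\), \(\mathrm{find}(1) = 2\), and \(\mathrm{find}(10)\) returns `null`. The sketch below reproduces this behavior with Java's built-in `java.util.TreeSet`, whose `ceiling` method returns the least element greater than or equal to its argument, or `null` if there is none. Using `TreeSet` is an assumption for illustration only (it is not the implementation developed in these notes), and the wrapper class `SortedFindDemo` is hypothetical.

```java
import java.util.TreeSet;

class SortedFindDemo {
    // Mirrors find(y): the smallest stored x with y <= x, or null if none exists.
    static Integer find(TreeSet<Integer> s, int y) {
        return s.ceiling(y);
    }

    // Mirrors findMin(): TreeSet.first() throws on an empty set, so check first.
    static Integer findMin(TreeSet<Integer> s) {
        return s.isEmpty() ? null : s.first();
    }

    // Mirrors findMax(): same guard, using last().
    static Integer findMax(TreeSet<Integer> s) {
        return s.isEmpty() ? null : s.last();
    }

    public static void main(String[] args) {
        TreeSet<Integer> s = new TreeSet<>();
        s.add(5);
        s.add(2);
        s.add(9);   // insertion order is irrelevant; the state is {2, 5, 9}

        System.out.println(find(s, 5));    // 5
        System.out.println(find(s, 6));    // 9
        System.out.println(find(s, 1));    // 2
        System.out.println(find(s, 10));   // null
        System.out.println(findMin(s));    // 2
        System.out.println(findMax(s));    // 9
    }
}
```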

**Implementation note.** In Java, we specify an ordering on elements of a class by implementing the `Comparable` interface. See the `Comparable` documentation for complete details. The `Comparable<T>` interface requires that we implement a single method, `int compareTo(T o)`. The interpretation is as follows:

- `x.compareTo(y)` returning a negative number is interpreted as \(x < y\).
- `x.compareTo(y)` returning a positive number is interpreted as \(y < x\).
- `x.compareTo(y)` returning `0` is interpreted as \(x = y\) (semantic equivalence).
  - The `compareTo` method should always be implemented in such a way that `x.compareTo(y)` returns `0` if and only if `x.equals(y)`.
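As a minimal sketch of these rules, consider a hypothetical class `Version` (not part of these notes) that orders version numbers first by major number, then by minor number. Its `equals` method is defined in terms of `compareTo`, so that `x.compareTo(y)` returns `0` exactly when `x.equals(y)`.

```java
import java.util.Objects;

// Hypothetical example: versions such as 1.2 and 1.10, ordered numerically.
final class Version implements Comparable<Version> {
    private final int major;
    private final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public int compareTo(Version other) {
        if (major != other.major) {
            return Integer.compare(major, other.major);  // major number decides
        }
        return Integer.compare(minor, other.minor);      // tie broken by minor
    }

    // Consistent with compareTo: equal exactly when compareTo returns 0.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Version)) {
            return false;
        }
        return compareTo((Version) o) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(major, minor);
    }
}
```

Here `new Version(1, 2).compareTo(new Version(1, 10))` is negative, so version 1.2 is ordered before version 1.10, which a plain string comparison of `"1.2"` and `"1.10"` would get wrong.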

The `Comparable` documentation shows that many built-in classes in Java already implement `Comparable`. Notably, all numerical classes (`Integer`, `Double`, etc.) and `String` already implement the interface.

In writing a Java interface that specifies the `SimpleSSet` ADT, we would like to use generic types, such as

```java
public interface SimpleSSet<E> { ... }
```

However, it must be the case that type `E` supports comparisons using the `compareTo` method. That is, `E` must implement the `Comparable` interface. In order to enforce this requirement, we declare:

```java
public interface SimpleSSet<E extends Comparable<E>> { ... }
```

With this declaration, we can use our `SimpleSSet` to store a set of elements of any type `E`, so long as `E` implements `Comparable<E>`.
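The same bound can be placed on a generic method, which gives a compact way to see what it buys us. In the sketch below, the class `BoundedTypeDemo` and the method `smallest` are hypothetical names introduced for illustration. The call `item.compareTo(best)` compiles only because the bound `E extends Comparable<E>` guarantees that every `E` has a `compareTo(E)` method.

```java
import java.util.Arrays;
import java.util.List;

class BoundedTypeDemo {
    // Returns the smallest item according to compareTo, or null for an empty list.
    static <E extends Comparable<E>> E smallest(List<E> items) {
        E best = null;
        for (E item : items) {
            // Without the bound on E, this call to compareTo would not compile.
            if (best == null || item.compareTo(best) < 0) {
                best = item;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // String and Integer both implement Comparable, so both calls are allowed.
        System.out.println(smallest(Arrays.asList("red", "blue", "yellow")));  // blue
        System.out.println(smallest(Arrays.asList(3, 1, 2)));                  // 1
    }
}
```

A call such as `smallest(Arrays.asList(new Object()))` is rejected at compile time, since `Object` does not implement `Comparable`. The same check applies to any type argument supplied for `SimpleSSet<E extends Comparable<E>>`.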