Assignment 08: Hash Tables and Maps
Hash sets and maps
Overview
In class, we considered a hash table data structure (cf. ODS Chapter 5). For this assignment, you will complete an implementation of a hash table in order to implement the SimpleUSet
interface. Additionally, you will use your hash table implementation to implement another ADT, the map ADT, as specified by the SimpleMap
interface.
Background
Hash tables
A hash table stores a collection of elements in an array, where the index of an element x
is determined according to a hash function applied to x
. More formally, given an object instance x
and an array arr
of size n
, a hash function associates a number in the range [0, 1,..., n-1]
to the object x
. The hash function must satisfy the property that if x
and y
are semantically equivalent (i.e., x.equals(y)
evalues to true
), then x
and y
hash to the same value. Moreover, this value is the same throughout the execution of the program. Good hash functions will also have the property that they “appear random:” the hash values of objects should be distributed uniformly over the range of possible values, and if x
and y
are not semantically equivalent, they should be unlikely to hash to the same value.
Even though any two particular objects x
and y
will be unlikely to collide (i.e., hash to the same value), collisions will occur. There are many ways in which one could deal with these collisions. One common method, called chaining, is to store elements associated with a given index in a linked list. That is, each index entry in the array arr
stores a reference to a Node
that is the head of a linked list. In order to find
, add
, or remove
an element x
, the corresponding operation is performed on the linked list whose head is at index getIndex(x)
in arr
.
The Map ADT
The map ADT is an abstract data type representing a mathematical function—a relation that assigns a single value to each key in a set. For example, a phone book can be thought of a map that associates a phone number (value) to each person (key) in a town. Formally, a map can be represented as a set of key-value pairs, \(S = \{(k_1, v_1), (k_2, v_2),\ldots,(k_n, v_n)\}\). Here each \(k_i\) is a key, and \(v_i\) its associated value. We require that the keys \(k_i\) are pairwise distict: for each \(i \neq j\), we have \(k_i \neq k_j\).
A map supports the following operations:
-
\(\mathrm{size}()\) returns the number of key-value pairs stored in the map.
-
\(\mathrm{isEmpty}()\) returns
true
if and only if the size of the map is \(0\). -
\(\mathrm{get}(k)\) returns the value associated with key \(k\), or
null
if the map doesn’t contain \(k\). -
\(\mathrm{put}(k, v)\) adds the value key-value pair \((k, v)\) to the map. If the map already contained a pair \((k, v')\) (i.e., with the same key), then then \((k, v')\) is removed from the map and the value \(v'\) is returned. Otherwise (if \(k\) was not previously in the map), the method returns
null
. If \(k\) is already present in the map with associated value \(v'\), this method can be thought of as reassigning \(k\)’s associated value from \(v'\) to \(v\). -
\(\mathrm{remove}(k)\) removes the (unique) key-value pair, \((k, v)\), with key \(k\) from the map (if present), and returns the associated value \(v\). If the key \(k\) was not contained in the map,
null
is returned. -
\(\mathrm{contains}(k)\) returns
true
if an only if there is pair \((k, v)\) contained in the map—i.e., \(k\) has an associated value in the map.
The map ADT is specified as the interface SimpleMap
, linked to below.
Your Task
For this assignment, you will write two programs.
-
Complete
HashUSet.java
, an implementation of theSimpleUSet
interface that represents a set as a hash table with chaining. The fileHashUSet.java
provides a partial implementation, including a constructor, agetIndex
method that defines an appropriate hash function, and a methodincreaseCapacity
that increases the capacity of the table. The program additionally defines aNode
class to be used for the linked list implementation of chaining. -
Write a class
HashSimpleMap.java
that implements theSimpleMap
interface.
Below, you are provided with tester programs for the SimpleUSet
and SimpleMap
interfaces so you can test your code.
Suggestions
-
For
HashUSet
, note that each entry in the arraytable
is initialized to be a sentinel node (head) for its associated linked list. Using sentinel nodes in this way should simplify the logic of youradd
,remove
, andfind
methods somewhat. -
For the
HashSimpleMap
, you should use aHashUSet
to store key-value pairs. To this end, you will probably want to define an auxiliary class,Pair<K, V>
that represents a pair consisting of a key (of typeK
) and a value (of typeV
). Note that if theHashUSet
containsPair
s, you will need to figure out how to search the set for a key and return the corresponding value, if any. In particular, this will require yourPair
class to have the property that if twoPair
s have the same (semantically equivalent) keys, then they should be hashed to the same index. Similarly, theHashUSet
should treat twoPair
s as equal if they have the same key. With an appropriately definedPair
class, the implementations of the required methods forHashSimpleMap
can be very short!
Code
Here is a zip archive of all the code below.
What to Submit
- Your completed
HashUSet.java
. - Your implementation
HashSimpleMap.java
Evaluation
Your submission will receive a score out of 10 points, broken down as follows:
- 5 points for
HashUSet.java
- Code compiles, runs, and passes tests in
SimpleUSetTester.java
. - Code is sensibly organized with helpful comments.
- Code compiles, runs, and passes tests in
- 5 points for
HashSimpleMap.java
- Code compiles, runs, and passes tests in
SimpleMapTester.java
. - Code is sensibly organized with helpful comments.
- Code compiles, runs, and passes tests in