Lecture 18: Heaps and Unordered Sets
Overview
 Announcements
 Finishing Heaps
 Unordered Sets
Next Week
 No coding assignment
 Take home “midterm”
 similar structure to last time
 more computational
Last Time: Heaps
Heaps…
 …are complete binary trees
 …elements satisfy heap property
Adding to Heaps
 Add new node at unique location where element can be added
 Bubble up procedure
Removing Min Element
 Copy value from unique removable leaf node to root and remove leaf
 Trickle down procedure
Representing Complete Binary Trees
Previously:
 trees represented as linked nodes
Complete binary trees have much more predictable structure
 can use an array to store complete binary trees efficiently!
Question 1
For an index $i$, what is the index of $i$’s left child? Right child?
Question 2
For an index $i$, what is the index of $i$’s parent?
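A sketch of the index arithmetic behind Questions 1 and 2, assuming the common convention that the root sits at index 0 and the tree is stored level by level. The slides do not fix a convention; with the root at index 1 the formulas become $2i$, $2i + 1$, and $\lfloor i/2 \rfloor$. The class name HeapIndex is hypothetical.

```java
// Index arithmetic for a complete binary tree stored level by level
// in an array. Assumption: the root lives at index 0.
final class HeapIndex {
    static int left(int i)  { return 2 * i + 1; }
    static int right(int i) { return 2 * i + 2; }

    // Only meaningful for i > 0; the root has no parent.
    static int parent(int i) { return (i - 1) / 2; }

    public static void main(String[] args) {
        // The children of index 1 are 3 and 4, and both have parent 1.
        System.out.println(left(1) + " " + right(1) + " "
                + parent(3) + " " + parent(4));
    }
}
```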
Question 3
Why didn’t we use arrays to represent (non-complete) binary trees?
For Homework 07
Use an array to represent heap
 implement insert
 add element to unique location (may need to resize)
 bubble up
 implement removeMin
 remove unique “node” and copy value to root
 trickle down
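One way the two operations could be sketched, assuming int elements and a root at index 0. The class name ArrayMinHeap and its private helpers are hypothetical, and this is not the homework's required interface, which may use generic Comparable elements instead.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// A minimal array-backed min-heap of ints (assumption: root at index 0).
final class ArrayMinHeap {
    private int[] data = new int[8];
    private int size = 0;

    int size() { return size; }

    void insert(int x) {
        if (size == data.length) {
            data = Arrays.copyOf(data, 2 * data.length); // resize when full
        }
        data[size] = x;   // the unique next free location in the last level
        bubbleUp(size);
        size++;
    }

    int removeMin() {
        if (size == 0) throw new NoSuchElementException("heap is empty");
        int min = data[0];
        size--;
        data[0] = data[size]; // copy the last leaf's value to the root
        trickleDown(0);
        return min;
    }

    // Swap the element at i with its parent while it is smaller than the parent.
    private void bubbleUp(int i) {
        while (i > 0) {
            int p = (i - 1) / 2;
            if (data[p] <= data[i]) break;
            swap(i, p);
            i = p;
        }
    }

    // Swap the element at i with its smaller child while the heap property fails.
    private void trickleDown(int i) {
        while (true) {
            int l = 2 * i + 1, r = 2 * i + 2, smallest = i;
            if (l < size && data[l] < data[smallest]) smallest = l;
            if (r < size && data[r] < data[smallest]) smallest = r;
            if (smallest == i) return;
            swap(i, smallest);
            i = smallest;
        }
    }

    private void swap(int a, int b) {
        int t = data[a];
        data[a] = data[b];
        data[b] = t;
    }
}
```

Inserting values in any order and then calling removeMin repeatedly returns them in increasing order, which gives a quick sanity check for both procedures.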
Big Picture so Far
Data Structures:
 Arrays
 usual arrays
 circular arrays
 Linked Lists
 singly linked lists
 doubly linked lists
 Trees
 binary (search) trees
 AVL trees
 complete binary trees
 heaps
ADTs
 Stack, Queue, Deque
 List
 Sets
 unordered sets
 sorted sets
 Priority Queue
Implementation Efficiency I
 Stack, Queue, Deque:
 (doubly) linked list: all ops in $O(1)$ time
 (circular) array: all ops in $O(1)$ time (amortized)
 List:
 all ops in $O(n)$ time (array & linked list)
 exception: get in $O(1)$ time for array
Implementation Efficiency II
 Set:
 Unordered sets: all ops in $O(n)$ time (array & linked list)
 Sorted Sets
 find in $O(\log n)$ time for array (binary search), others in $O(n)$
 all ops in $O(\log n)$ time for AVL tree
 Priority queue
 Binary heap: min in $O(1)$, insert & removeMin in $O(\log n)$
Sortedness
Note. For sets, efficient implementations so far have crucially relied on comparability
 comparability allows us to prescribe where a given element should be
 do not need to examine all elements to determine if a given item is (not) present
Question
How might we find/add/remove when elements are not Comparable?
One Idea
Associate a numerical value to every possible element
 numbers are comparable, so just do comparison by number
Two Issues
 How do we compute the numerical value consistently?
 What do we do about collisions?
Hashing
Idea. Given an object instance obj, compute a numerical value from data stored in obj
 value is called a hash value or hash code
Application.
 use hash value of obj to determine where in data structure obj should be stored
Goals.
 Different objects should be unlikely to have same hash value
 Should be able to specify range of possible values
 Semantically equivalent objects should have same hash value
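A minimal sketch of the three goals, assuming Java's built-in hashCode as the source of the numerical value. The Point class and the helper h are hypothetical names introduced for illustration only.

```java
import java.util.Objects;

final class HashDemo {
    // Hypothetical value type: two Points with equal coordinates are
    // semantically equivalent, so they must produce the same hash code.
    static final class Point {
        final int x, y;

        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        // Computed only from the data that equals() compares.
        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    // Restrict an arbitrary int hash code to the range 0, 1, ..., n - 1.
    // Math.floorMod never returns a negative result for positive n,
    // unlike the % operator.
    static int h(Object obj, int n) {
        return Math.floorMod(obj.hashCode(), n);
    }
}
```

Two distinct Point instances with the same coordinates map to the same index under h, while the parameter n sets the range of possible values.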
Application: Hash Sets I
Goal. Implement unordered set ADT
 add, find, remove methods
Assume. We have access to a hash function $h$
 for any object $x$, $h(x)$ is the hash code of $x$
 the range of values for $h$ can be specified
Application: Hash Sets II
Idea. Store elements in an array
 choose range of hash values to be $0, 1, \ldots, n - 1$
 to add, find, remove $x$, look at index $h(x)$
Example: Hashing Colors
$n = 6$
Uh Oh!
What do we do about collisions???
Chaining
Idea. Each entry of the array refers to the head of a linked list
 linked list at arr[i] stores all elements $x$ with hash value $h(x) = i$
Hash Set with Chaining
 store an array arr of heads of linked lists (the hash table)
 assume hash function h has range $0, 1, \ldots, n - 1$ with n = arr.length
Running Time of operations?
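A minimal sketch of a hash set with chaining as described above, assuming a fixed table size and using Java's hashCode reduced with Math.floorMod as the hash function h. The class name ChainedHashSet is hypothetical, and java.util.LinkedList stands in for a hand-written linked list.

```java
import java.util.LinkedList;

// A minimal hash set with chaining. The table size is fixed at construction.
final class ChainedHashSet<E> {
    private final LinkedList<E>[] arr; // arr[i] holds all x with h(x) == i
    private int size = 0;

    @SuppressWarnings({"unchecked", "rawtypes"})
    ChainedHashSet(int n) {
        arr = (LinkedList<E>[]) new LinkedList[n];
        for (int i = 0; i < n; i++) arr[i] = new LinkedList<>();
    }

    // Hash function with range 0, 1, ..., arr.length - 1.
    private int h(Object x) {
        return Math.floorMod(x.hashCode(), arr.length);
    }

    boolean find(E x) {
        return arr[h(x)].contains(x);
    }

    // Returns false if x was already present.
    boolean add(E x) {
        LinkedList<E> chain = arr[h(x)];
        if (chain.contains(x)) return false;
        chain.addFirst(x);
        size++;
        return true;
    }

    // Returns false if x was not present.
    boolean remove(E x) {
        if (arr[h(x)].remove((Object) x)) {
            size--;
            return true;
        }
        return false;
    }

    int size() { return size; }
}
```

Each operation scans only the chain at index $h(x)$, so its cost is proportional to the length of that one chain rather than to the total number of stored elements.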
Bad Hash Functions
Extreme example: h(x) = 0 always!
Too Many Elements!
Array size is fixed, but we keep adding elements
 What is the running time?
Resize Challenge
If we resize to a larger array, say of size $2n$:
 must update hash function $h$ to have range $0, 1, \ldots, 2n - 1$
 this could change hash values of elements already in hash table
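A small sketch of why resizing forces elements to move, assuming the hash function is Java's hashCode reduced modulo the table size. Re-inserting every element into a fresh table is one straightforward option, not necessarily the one this course adopts. ResizeDemo, index, and rehash are hypothetical names, and lists stand in for the array of chains to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

final class ResizeDemo {
    // Index of x in a table with n slots.
    static int index(Object x, int n) {
        return Math.floorMod(x.hashCode(), n);
    }

    // Build a fresh table with newSize chains and re-insert every element,
    // since an element's index depends on the table size.
    static <E> List<List<E>> rehash(List<List<E>> table, int newSize) {
        List<List<E>> bigger = new ArrayList<>();
        for (int i = 0; i < newSize; i++) bigger.add(new ArrayList<>());
        for (List<E> chain : table) {
            for (E x : chain) {
                bigger.get(index(x, newSize)).add(x);
            }
        }
        return bigger;
    }
}
```

For example, an Integer's hash code is its own value, so 7 and 13 both land at index 1 when $n = 6$, but after resizing to 12 slots 7 moves to index 7 while 13 stays at index 1.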
Next Time
 Randomness and the Art of Hashing
 Empirical Investigation