Binary Search Trees vs. Hash Tables

Goals


Interface java.util.Map:


Interface java.util.SortedMap extends java.util.Map:

The SortedMap interface provides additional methods that permits access to subsets of the current map, defined by intervals in the key-space.

Additional functions:


Binary Search Tree based Map: java.util.TreeMap

Binary search tree (BST) implementation of the Map interface, specifically using Red-Black trees. Also implements the SortedMap interface (See above). Note that in the TreeMap class the methods keySet() and values() return ordered sets of the keys and values as opposed to random ordering (As in HashMap).

Complexity of common functions


Hash Table based Map: java.util.HashMap

Hash table implementation of Map interface. Collisions are handled by separate chaining using a simply linked list. The hash function is defined by the key class (Supplemented with some bit juggling). The compression function is mod N with N the capacity of the table. This implementation doubles the size of the array whenever the load factor reaches a given threshold (Default 0.75).

Complexity of common functions

Where d = max number of collisions and N = capacity of hash table.


Class java.util.LinkedHashMap:

A LinkedHashMap is the same as a regular HashMap with the addition of a doubly-linked list between all the entires. This doubly-linked list orders the entries in the hash table by insertion or access order. These can be maintained in constant time so add little overhead to the HashMap implementation.

Insertion order: Signifies that the collections returned by the keySet() and values() method iterates over elements in the same order as they were inserted into the map. Note that insertion order is not affected by an update (i.e. calling put() on an existing key).
Access order: Signifies that the collections returned by keySet() and values() method iterates over elements in the order they were accessed from least recent to most recent. An access is defined by either a call to get() or a call to put().

Note that the use of a doubly linked list to connect all entries changes the running time of the following methods:


Interfaces java.util.Set and java.util.SortedSet:

A Set is a simplified Map. Recall that a Map is a collection of key, value pairs. A Set on the other hand is a collection of keys only. Like a Map, the keys in a Set are unique and cannot be repeated. The two interfaces are very similar, so much so that the primary Java implementations of the Set and SortedSet interfaces, HashSet and TreeSet, are actually adapters of HashMap and TreeMap!


Programming Exercise:

We'll use SortedSet (or SortedMap) to calculate some basic statistics from a list of Grade objects. Implement the following methods in class called GradeStats:

What are the expected running time of the above query methods?

The file GradeStats.java is a stub file you can use. Also included is GradeGen which can generate random lists for you to test your code on.

The full API is available here.