Binary Search Trees vs. Hash Tables

Goals


Interface java.util.Map:


Interface java.util.SortedMap extends java.util.Map:

The SortedMap interface provides additional methods that permits access to subsets of the current map, defined by intervals in the key-space.

Additional functions:


Binary Search Tree based Map: java.util.TreeMap

Binary search tree (BST) implementation of the Map interface, specifically using Red-Black trees. Also implements the SortedMap interface (See above). Note that in the TreeMap class the methods keySet() and values() return ordered sets of the keys and values as opposed to random ordering (As in HashMap).

Complexity of common functions


Hash Table based Map: java.util.HashMap

Hash table implementation of Map interface. Collisions are handled by separate chaining using a simply linked list. The hash function is defined by the key class (Supplemented with some bit juggling). The compression function is mod N with N the capacity of the table. This implementation doubles the size of the array whenever the load factor reaches a given threshold (Default 0.75).

Complexity of common functions

Where d = max number of collisions and N = capacity of hash table.


Interfaces java.util.Set and java.util.SortedSet:

A Set is a simplified Map. Recall that a Map is a collection of key, value pairs. A Set on the other hand is a collection of keys only. Like a Map, the keys in a Set are unique and cannot be repeated. The two interfaces are very similar, so much so that the primary Java implementations of the Set and SortedSet interfaces, HashSet and TreeSet, are actually adapters of HashMap and TreeMap!


Programming Exercise:

In this lab we will work on problem 1 of Homework 5, implementing a WebpageIndexEntryI. At the end of the lab make sure to submit WebpageIndexEntry.java, your implementation does not need to be finished when you turn it in. If you have already finished this problem feel free to use the lab time to keep working on Homework 5.

Begin by creating WebpageIndexEntry.java, the class that implements WebpageIndexEntryI. Make sure to include a default constructor.

Internally, a WebpageIndexEntryI keeps a String with the address of the webpage, a Set of links the page links to and a List of keywords. Create fields in your class for these, initialize them in the constructor and implement the following methods (i.e. all but getFrequencyMap()):

The method getFrequencyMap() converts the List of keywords to a Map where the keys are the number of times a word appears in the keyword list and for each key K, the value is the set of words that appear K number of times in the list of keywords.

For example:
if the list of keywords is The map will be
  • Cat
  • Dog
  • Fish
  • Cat
  • Bird
  • Cat
  • Dog
  • Fish
Key Value
3 Cat
2 Dog, Fish
1 Bird

There are many ways to achieve this, one way involves using a temporary map from keywords to the frequency of the keywords. In our example the map would be:

Key Value
Cat 3
Dog 2
Fish 2
Bird 1

Once this temporary map is created we can scan each key in the map and insert the keyword into the right place on the final map.

Implement getFrequencyMap(). You may use the temporary map implementation described here or any other implementation you would like. When choosing which Map to instantiate keep in mind the running time of each operation and how you intend to use the Map.

When you hand in your homework make sure to justify why you chose the type of Maps that you did.