**CIS 1920 Homework 2: Binary Search Trees 🌳**
**Due February 10, 2023 11:59 pm EST**
(#) Learning objectives - Familiarization with implementing *Pythonic* classes - Familiarization with magic/dunder functions: `__iter__`, `__contains__`, `__len__` - Work with generator `yield` statements - Work with basic `Exception` handling (#) Starter files - [`hw2.py`](hw2.py) (#) Binary Search Trees Binary search trees (BSTs) are a popular data structure you may have seen for storing items when fast search is desired functionality. As the name suggests, each node in a BST can have at most two children (hence binary), with the central property of the tree being sorted: in order to be a valid BST, *the key in each node must be greater than or equal to any key stored in the left sub-tree, and less than or equal to any key stored in the right subtree*. This property allows for finding keys quickly: on average, finding a given key takes $O(log(n))$ time, where $n$ is the total number of items in the BST. However, the worst case for finding a key is $O(n)$ if the tree is imbalanced. The `search()` runtime depends on the order in which the keys were inserted. Pictured below are two valid BSTs with the same keys: ******************************************* * .-. .-. * | 3 | | 2 | * +-+ +-+ * / \ \ * .+. .+. .+. *| 2 | | 5 | | 3 | * '-' +-+ '-+ * / \ * .+. .+. * | 4 | | 4 | * '-' '-+ * balanced imbalanced \ * .+. * | 5 | * '-' ******************************************* (#) BSTs in Python We'll implement BSTs in Python by writing two classes, `Node` and `BST`. To simplify things, we will only use `int` values as keys with no duplicate keys allowed. In fact, we will have `Node.insert()` raise an exception if we try to add a duplicate key to the BST. (##) Insert and search The `Node` class will do most of the heavy lifting, implementing the two key functions `insert()` and `search()`. The `BST` class you will implement holds a reference to the root node and wraps the `Node` functions. In words, `Node.search()` checks whether the current node contains the key we're searching for. If it does, then we return the current node. Otherwise, if the key we're searching for is less than the key at the current node, we'll search the left subtree, and if it is greater than the current key, we'll search the right subtree. ************************************************************************ * .-. .-. .-. * | 3 | | 3 | | 3 | * +-+ +-+ +-+ * / \ / \ / \ * .+. .+. .+. .+. .+. .+. * | L | | R | | L | | R | | L | | R | * +-+ +-+ +-+ +-+ +-+ +-+ * / \ / \ / \ / \ / \ / \ * * key == 3? We're done! key < 3? Search L key > 3? Search R * subtree subtree ************************************************************************ Similarly, `Node.insert()` checks whether the current node contains the key we want to insert, and raises an Exception if it does since we're assuming no duplicates. Otherwise, if the key is less than the current node, we either create a new `Node` instance and insert the key to the left of the current node (if it doesn't have a left child) or we recursively try inserting again in the left subtree. The same applies for the right subtree if the key is greater than the current node. (##) Magic functions and traversal We'll also implement some magic functions for our BSTs. Like what we saw in lecture, with functions like `__iter__` and `__contains__` implemented we can use the nice syntactic sugar Python provides: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Python linenumbers bst = BST() bst.insert(3) bst.insert(2) bst.insert(5) bst.insert(4) # uses __contains__ magic if 4 in bst: print("We found 4!") # uses __iter__ magic for node in bst: print(node) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ For iteration, we will use **in-order traversal.** This means that, beginning at the root node, we will first recursively traverse the left subtree, return the root node, then recursively traverse the right subtree. Because of the properties of BSTs, this will result in printing all the keys in ascending order. Pictured below is an example, yielding a traversal of $2,3,5$. ************************************************************************ * .-. .-. .-. * | 3 | | 3 | | 3 | * +-+ +-+ +-+ * / \ ---> / \ ---> / \ * .+. .+. .+. .+. .+. .+. * | 2 | | 5 | | 2 | | 5 | | 2 | | 5 | * '-' '-' '-' '-' '-' '-' * Recursively traverse Traverse node 3 Recursively traverse * subtree rooted at 2 subtree rooted at 5 ************************************************************************ (##) Implementation details You will implement the following functions: - `Node.__init__()`: initializes the attributes of the Node class. - `BST.__init__()`: initializes the attributes of the BST class. - `Node.insert()`: recursively or iteratively inserts a key, raises an exception for duplicates. - `Node.search()`: recursively or iteratively searches for a key and returns the corresponding Node, if it exists. !!! Tip For code style this assignment, we will be performing a close read of `Node.search()` - `BST.insert()`: wraps the `Node.insert()` function. - `BST.search()`: wraps the `Node.search()` function. - `BST.__len()__`: magic function that returns the number of elements in the BST. - `BST.__contains__()`: magic function to check whether a key is present within the BST. - `BST.__iter__()`: magic function that iterates over the BST with **in-order traversal**, described above. Function signatures and additional details can be found in the documentation of the starter file. (##) Other Requirements !!! WARNING No libraries may be imported for this homework. (##) Common Gotchas - Remember that when you are making an attribute assignment to use `self.foo`, `foo` by itself will be a local variable! (#) Rubric | Section | Points | |---------|--------| Name, PennKey, and hours filled in | 0.5 `Node.__init__()` and `BST.__init__()` | 0.5 `Node.insert()`| 2 `Node.search()`| 2 `BST.__len__()` (see note) | 1 `BST.__contains__()` (see note) | 1 `BST.__iter__()` | 2 `Node.search()` code style | 1 **Total** | 10 **Note**: Our test cases for `BST.__len__` will implicitly test your `BST.insert` function, while test cases for `BST.__contains__` implicitly test both `BST.search` and `BST.insert`, so make sure those are implemented before testing! (#) Testing your code and submission You can test your code using `__main__` to check the behavior of your implementation on your local machine. You will upload your `hw2.py` code to [**Gradescope**](https://www.gradescope.com/courses/477992) for submission. We encourage you to work iteratively, implementing functions one at a time to verify their correctness before moving on to the next function. To facilitate this, you are welcome to submit to Gradescope to verify your code against the autograder as many times as you would like before the submission due date without penalty. Please keep in mind that any submission made **after the due date** will be considered late and will either be counted towards your alloted late days or penalized accordingly.