`{-# LANGUAGE InstanceSigs #-}`

# Binary Search Trees

One very simple way to implement sets is with binary search trees.

`module BST where`

`import Persistent`

```
import Test.QuickCheck hiding (elements)
import System.Random (Random)
import Control.Monad (liftM, liftM2, liftM3)
import Data.List hiding (insert,delete)
import Data.Monoid
```

A binary search tree is a binary tree that stores data values at nonempty nodes.

```
data BST a = E                    -- empty
           | N (BST a) a (BST a)  -- nonempty
  deriving (Eq, Show)
```

Not only does our implementation have to satisfy the invariants of the set interface, but it must also satisfy the *binary search tree invariant*.

We will state that invariant as a QuickCheck property. Essentially, for every nonempty node, the maximum value of the left subtree must be less than the value at that node, and the minimum value of the right subtree must be greater than it. Equivalently, an in-order traversal of the tree yields a sorted list with no duplicates, which is what the property checks.

```
prop_BST :: BST Int -> Bool
prop_BST t = isSortedNoDups (elts t)
```

```
isSortedNoDups :: Ord a => [a] -> Bool
isSortedNoDups x = nub (sort x) == x
```

```
elts :: BST a -> [a]
elts t = aux t [] where
  -- in-order traversal with an accumulator, so no list appends are needed
  aux E         acc = acc
  aux (N l v r) acc = aux l (v : aux r acc)
```
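The invariant as worded above (everything on the left is smaller, everything on the right is larger) can also be checked directly by recursion. The following is only a sketch, not part of the course code: `isBST` is a hypothetical name, and the data declaration is repeated so that the block compiles on its own.

```haskell
-- Repeated from the notes so that this sketch stands alone.
data BST a = E | N (BST a) a (BST a)
  deriving (Eq, Show)

-- | Hypothetical helper: check the invariant by recursion, passing optional
-- exclusive lower and upper bounds down the tree.
isBST :: Ord a => BST a -> Bool
isBST = go Nothing Nothing
  where
    go _  _  E         = True
    go lo hi (N l v r) =
      maybe True (< v) lo         -- v must exceed the lower bound, if any
        && maybe True (v <) hi    -- v must stay below the upper bound, if any
        && go lo (Just v) l       -- everything to the left is below v
        && go (Just v) hi r       -- everything to the right is above v
```

The bounds matter: in `N (N E 1 (N E 5 E)) 3 E` every node is correctly ordered with respect to its own children, yet 5 sits in the left subtree of 3, so `isBST` rejects the tree. A check that only compared each node with its immediate children would accept it.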

We can implement the binary search tree operations in a fairly straightforward way.

`instance Set BST where`

The empty tree is just an empty node.

```
  empty :: BST a
  empty = E
```

The binary search tree invariant means that we do not need to examine the entire tree when checking whether an element is contained in the set: at each node we follow only one of the two subtrees.

```
  member :: Ord a => a -> BST a -> Bool
  member _ E = False
  member x (N l v r)
    | x == v    = True
    | x < v     = member x l   -- x can only be in the left subtree
    | otherwise = member x r   -- x can only be in the right subtree
```
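To see how little of the tree a lookup touches, the sketch below records the node values that such a search compares against. `searchPath` and `example` are hypothetical names introduced for illustration, and the data declaration is repeated so that the block stands alone.

```haskell
-- Repeated from the notes so that this sketch stands alone.
data BST a = E | N (BST a) a (BST a)
  deriving (Eq, Show)

-- | Hypothetical helper: the node values a membership test for x would
-- compare against, in order, from the root down.
searchPath :: Ord a => a -> BST a -> [a]
searchPath _ E = []
searchPath x (N l v r)
  | x == v    = [v]
  | x < v     = v : searchPath x l
  | otherwise = v : searchPath x r

-- A seven-element tree with 4 at the root.
example :: BST Int
example = N (N (N E 1 E) 2 (N E 3 E)) 4 (N (N E 5 E) 6 (N E 7 E))
```

Here `searchPath 5 example` is `[4,6,5]` and `searchPath 8 example` is `[4,6,7]`: in both cases only three of the seven nodes are inspected, one per level of the tree.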

We can list all of the elements of the tree by walking over it.

```
  elements :: Ord a => BST a -> [a]
  elements = elts
```

However, when we insert an element into the tree, we must make sure that the invariant is maintained. This means finding exactly the right spot to insert the new element.

```
  insert :: Ord a => a -> BST a -> BST a
  insert x E = N E x E
  insert x t@(N l v r)
    | x == v    = t                     -- already present: nothing to do
    | x < v     = N (insert x l) v r
    | otherwise = N l v (insert x r)
```
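As a small worked example, the sketch below builds trees from lists with a standalone copy of the insertion function (the notes define it as a method of the `Set` instance). `fromList` and `height` are hypothetical helpers added for illustration.

```haskell
-- Repeated from the notes so that this sketch stands alone.
data BST a = E | N (BST a) a (BST a)
  deriving (Eq, Show)

-- Standalone copy of the insertion function described above.
insert :: Ord a => a -> BST a -> BST a
insert x E = N E x E
insert x t@(N l v r)
  | x == v    = t
  | x < v     = N (insert x l) v r
  | otherwise = N l v (insert x r)

-- | Hypothetical helper: insert every list element into the empty tree.
-- Because this is a right fold, the last list element is inserted first.
fromList :: Ord a => [a] -> BST a
fromList = foldr insert E

-- | Hypothetical helper: number of nodes on the longest root-to-leaf path.
height :: BST a -> Int
height E         = 0
height (N l _ r) = 1 + max (height l) (height r)
```

With these definitions, `fromList [1,3,2]` is `N (N E 1 E) 2 (N E 3 E)`, a tree of height 2, while `fromList [1,2,3]` is `N (N (N E 1 E) 2 E) 3 E`, a left-leaning chain of height 3. Both satisfy the invariant, but the shape depends on insertion order, and inserting a value that is already present leaves the tree unchanged.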

Once we implement binary search trees, we should verify that our implementation satisfies the appropriate properties.

```
prop_insert_preserves :: Int -> BST Int -> Bool
prop_insert_preserves x t = prop_BST (insert x t)
```

Using insert and empty, we can generate arbitrary binary search trees by generating arbitrary lists. However, that means we have to get insert correct before anything else.

```
instance (Ord a, Arbitrary a) => Arbitrary (BST a) where
  -- generate an arbitrary list and insert its elements one by one
  arbitrary = liftM (foldr insert empty) arbitrary
```

Alternatively, we can generate arbitrary binary search trees directly.

```
{-
instance (Ord a, Bounded a, Random a, Num a, Arbitrary a) => Arbitrary (BST a) where
  arbitrary = gen 0 100 where
    gen :: (Ord a, Num a, Random a) => a -> a -> Gen (BST a)
    gen min max | (max - min) <= 3 = return E
    gen min max = do
      elt <- choose (min, max)
      frequency [ (1, return E),
                  (6, liftM3 N (gen min (elt - 1))
                               (return elt) (gen (elt + 1) max)) ]
-}
```

## EXTRA: deletion

We don't have time to cover this implementation in class, but for reference, the complete code for binary search tree deletion is below.

The main loop for deletion first finds the node to delete. If at least one of its children is empty, then deletion is straightforward: the other child takes the node's place. However, if both are nonempty, then the removed element must be replaced with the largest element from the left subtree. (Or, alternatively, the smallest element of the right subtree.)

```
delete :: Ord a => a -> BST a -> BST a
delete x E = E
delete x (N a y b)
  | x < y  = N (delete x a) y b
  | x > y  = N a y (delete x b)
  | x == y =
      case (a,b) of
        (E,_)       -> b
        (_,E)       -> a
        (N _ _ _,_) -> N a' z b where
          -- z is the largest element of a; a' is a without z
          (z,a') = deleteMax a
```

This function takes a nonempty tree, removes its largest element (which is simple, because that element has no right child) and returns both the new tree and that element.

```
deleteMax :: Ord a => BST a -> (a, BST a)
deleteMax E         = error "Impossible case"
deleteMax (N a y E) = (y, a)
deleteMax (N a y b) = (x, N a y b') where
  (x,b') = deleteMax b
```
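The alternative mentioned above, replacing the removed element with the smallest element of the right subtree, can be sketched in the same style. This is not the course's implementation: `deleteMin` and `delete'` are hypothetical names, and `deleteMin` returns a `Maybe` rather than calling `error` on the empty tree, which is a design choice made here for illustration.

```haskell
-- Repeated from the notes so that this sketch stands alone.
data BST a = E | N (BST a) a (BST a)
  deriving (Eq, Show)

-- | Hypothetical counterpart of deleteMax: remove the smallest element,
-- returning Nothing when the tree is empty.
deleteMin :: BST a -> Maybe (a, BST a)
deleteMin E         = Nothing
deleteMin (N E y b) = Just (y, b)        -- no left child: y is the minimum
deleteMin (N a y b) = do
  (x, a') <- deleteMin a                 -- the minimum lives in the left subtree
  pure (x, N a' y b)

-- | Hypothetical variant of delete that promotes the successor.
delete' :: Ord a => a -> BST a -> BST a
delete' _ E = E
delete' x (N a y b)
  | x < y     = N (delete' x a) y b
  | x > y     = N a y (delete' x b)
  | otherwise = case deleteMin b of
      Nothing      -> a                  -- empty right subtree: left one takes over
      Just (z, b') -> N a z b'           -- successor z replaces the removed value
```

For example, deleting 2 from `N (N E 1 E) 2 (N (N E 3 E) 4 (N E 5 E))` with `delete'` gives `N (N E 1 E) 3 (N E 4 (N E 5 E))`: the successor 3 moves up to the root. Matching on the result of `deleteMin` also covers the cases where one or both children are empty, so no separate case analysis on the pair of subtrees is needed.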

```
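prop_delete :: Int -> [Int] -> Property
prop_delete x xs =
  -- skip empty lists: `mod` by `length xs == 0` would throw an exception
  not (null xs) ==>
    not (member y (delete y (foldr insert empty xs)))
  where
    y = xs !! z              -- pick an element that is certainly in the tree
    z = x `mod` length xs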
```

```
bst :: Proxy BST
bst = Proxy
```

```
main :: IO ()
main = do
  quickCheck prop_BST
  quickCheck $ prop_empty bst
  quickCheck $ prop_insert1 bst
  quickCheck $ prop_insert2 bst
  quickCheck prop_insert_preserves
  quickCheck prop_delete
```
