Binary Search Trees
===================
One very simple implementation for sets is in terms of binary search
trees...
> module BST where
> import Persistent
> import Test.QuickCheck hiding (elements)
> import System.Random (Random)
> import Control.Monad (liftM, liftM2, liftM3)
> import Data.List hiding (insert,delete)
> import Data.Monoid
A binary search tree is a binary tree that stores data values
at nonempty nodes.
> data BST a = E -- empty
> | N (BST a) a (BST a) -- nonempty
> deriving (Eq, Show)
Not only does our implementation have to satisfy the invariants of the
set interface, but it must also satisfy the *binary search tree
invariant*.
We will state that invariant as a quickcheck property. Essentially,
for every nonempty tree, the maximum value of the left subtree must
be less than the value, and the minumum value of the right subtree
must be greater than the value at that node.
> prop_BST :: BST Int -> Bool
> prop_BST E = True
> prop_BST (N l x r) = undefined
> tree_min :: BST Int -> Maybe Int
> tree_min E = Nothing
> tree_min (N l x r) =
> maybes min (tree_min l) (tree_min r)
> tree_max :: BST Int -> Maybe Int
> tree_max E = Nothing
> tree_max (N l x r) =
> maybes max (tree_max l) (tree_max r)
> maybes :: (a -> a -> a) -> Maybe a -> Maybe a -> Maybe a
> maybes f (Just x) (Just y) = Just (f x y)
> maybes _ Nothing m = m
> maybes _ m Nothing = m
We can implement the binary search tree operations in a fairly
straightforward way.
> instance Set BST where
The empty tree is just an empty node
> -- empty :: BST a
> empty = E
The binary search tree invariant means that we do not need to
examine the entire tree when we are looking to see if an
element in contained in the set.
> -- member :: Ord a => a -> BST a -> Bool
> member x E = undefined
We can list all of the elements of the tree by walking over it.
> -- elements :: Ord a => BST a -> [a]
> elements E = []
> elements (N a y b) = elements a ++ [y] ++ elements b
However, when we insert an element into the tree, we must
make sure that the invariant is maintained. This means
finding exactly the right spot to insert the new element.
> -- insert :: Ord a => a -> BST a -> BST a
> insert = undefined
Once we implement binary search trees, we should verify that
our implementation satisfies the appropriate properties.
> prop_insert_preserves :: Int -> BST Int -> Bool
> prop_insert_preserves x t = prop_BST (insert x t)
Using insert and empty, we can generate arbitrary binary search trees
by generating arbitrary lists. However, that means we have to get
insert correct before anything else.
> -- instance (Ord a, Arbitrary a) => Arbitrary (BST a) where
> -- arbitrary = liftM (foldr insert empty) arbitrary
Alternately, we can generate arbitrary binary search trees directly.
> instance (Ord a, Bounded a, Random a, Num a, Arbitrary a) => Arbitrary (BST a) where
> arbitrary = gen 0 100 where
> gen :: (Ord a, Num a, Random a) => a -> a -> Gen (BST a)
> gen min max | (max - min) <= 3 = return E
> gen min max = do
> elt <- choose (min, max)
> frequency [ (1, return E),
> (6, liftM3 N (gen min (elt - 1))
> (return elt) (gen (elt + 1) max)) ]
---------------------------------------------
EXTRA: deletion
---------------
We don't have time to cover this implementation in class,
but for reference, the complete code for binary search
tree deletion is below.
The main loop for deletion first finds the appropriate
node in the tree to delete. If at least one of the children are
empty, then deletion is straightforward. However, if both
are nonempty, then the removed element must be replaced with the
largest element from the left subtree. (Or, alternatively, the smallest
element of the right subtree.)
> delete :: Ord a => a -> BST a -> BST a
> delete x E = E
> delete x (N a y b) | x < y = N (delete x a) y b
> | x > y = N a y (delete x b)
> | x == y =
> case (a,b) of
> (E,_) -> b
> (_,E) -> a
> (N _ _ _,_) -> N a' z b where
> (z,a') = deleteMax a
This function takes a nonempty tree, removes its
largest element (which is simple, because that element
has no right child) and returns both the new tree and
that element.
> deleteMax :: Ord a => BST a -> (a, BST a)
> deleteMax E = error "Impossible case"
> deleteMax (N a y E) = (y, a)
> deleteMax (N a y b) = (x, N a y b') where
> (x,b') = deleteMax b
> prop_delete :: Int -> [Int]-> Bool
> prop_delete x xs =
> not (member y (delete y (foldr insert empty xs))) where
> y = xs !! z
> z = x `mod` length xs
> bst :: Proxy BST
> bst = undefined
> main :: IO ()
> main = do
> quickCheck prop_BST
> quickCheck $ prop_empty bst
> quickCheck $ prop_insert1 bst
> quickCheck $ prop_insert2 bst
> quickCheck prop_insert_preserves
> quickCheck prop_delete