Persistent Data Structures
==========================
Starring: Red-Black trees
> {-# LANGUAGE KindSignatures, ScopedTypeVariables #-}
> module Persistent where
> import Control.Monad
> import Test.QuickCheck hiding (elements)
> import Data.Maybe as Maybe
> import Data.List (sort,nub)
Persistent vs. Ephemeral
------------------------
* An *ephemeral* data structure is one for which only one version is
available at a time: after an update operation, the structure as it
existed before the update is lost.
* A *persistent* structure is one where multiple versions are
simultaneously accessible: after an update, both old and new versions
are available.
Functional programming is adept at implementing persistent data
structures. In particular, datatypes and pattern matching make the
implementation of persistent tree-like data structures remarkably
straightforward.
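To make the distinction concrete, here is a minimal sketch of a persistent insert on a plain, unbalanced binary tree. The names `Tree`, `insertT`, `toListT`, `v1`, and `v2` are hypothetical and chosen only for this illustration; they are not the definitions used later in these notes.

```haskell
-- A plain binary search tree (illustrative only, no balancing).
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Insertion builds new nodes along the search path and shares
-- every untouched subtree with the original tree.
insertT :: Ord a => a -> Tree a -> Tree a
insertT x Leaf = Node Leaf x Leaf
insertT x t@(Node l y r)
  | x < y     = Node (insertT x l) y r
  | x > y     = Node l y (insertT x r)
  | otherwise = t

-- In-order traversal: the elements in ascending order.
toListT :: Tree a -> [a]
toListT Leaf         = []
toListT (Node l x r) = toListT l ++ [x] ++ toListT r

-- Two versions of the same set; v1 is still usable after v2 is built.
v1, v2 :: Tree Int
v1 = insertT 5 (insertT 2 Leaf)
v2 = insertT 3 v1
```

After loading this in GHCi, `toListT v1` and `toListT v2` can both be evaluated: building `v2` does not change `v1`.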
A Set interface
===============
Let's think about what the interface to a persistent set should look
like. We can tell that any implementation of it is persistent just by
looking at the types of the operations: `insert` returns a new set
instead of modifying its argument.
> class Set s where
>    empty    :: s a
>    member   :: Ord a => a -> s a -> Bool
>    insert   :: Ord a => a -> s a -> s a
>    elements :: Ord a => s a -> [a]
For example, one trivial implementation of sets is in terms of lists.
> instance Set [] where
>    empty    = []
>    member   = elem
>    insert   = (:)
>    elements = sort . nub
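As a small sketch of how the list instance behaves, the block below repeats the class and instance so that it stands alone, then builds a few sets. The names `s0`, `s1`, and `s2` are hypothetical. Note that `insert` does not check for duplicates; they are only removed when `elements` is called.

```haskell
import Data.List (nub, sort)

-- The interface and list instance from the text, repeated so that
-- this example is self-contained.
class Set s where
  empty    :: s a
  member   :: Ord a => a -> s a -> Bool
  insert   :: Ord a => a -> s a -> s a
  elements :: Ord a => s a -> [a]

instance Set [] where
  empty    = []
  member   = elem
  insert   = (:)
  elements = sort . nub

-- Example sets (hypothetical names).
s0, s1, s2 :: [Int]
s0 = empty
s1 = insert 3 (insert 1 (insert 3 s0))  -- underlying list is [3,1,3]
s2 = insert 2 s1                        -- s1 itself is unchanged
```

Here `elements s1` is `[1,3]` even though the underlying list holds `3` twice, and `member 2 s1` is still `False` after `s2` has been built.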
When we define an abstract data structure like `Set` above, we should
also specify properties that *all* implementations should satisfy.
For each of these properties, we will use a `Proxy` argument to tell
QuickCheck exactly which implementation it should be testing. We could
use a type annotation instead (except for `prop_empty`), but the `Proxy`
argument is a little bit easier to use.
> data Proxy (s :: * -> *) = Proxy
For example, we can define a proxy for the list type.
> list :: Proxy []
> list = Proxy
The empty set has no elements.
> prop_empty :: forall s. (Set s) => Proxy s -> Bool
> prop_empty _ = null (elements (empty :: s Int))
The elements of the set are sorted and free of duplicates, and each of
them is a member of the set.
> prop_elements :: (Set s) => Proxy s -> s Int -> Bool
> prop_elements _ x = elements x == nub (sort (elements x)) &&
>                     all (\y -> member y x) (elements x)
When we insert an element into a set, we want to make sure that
it is contained in the result.
> prop_insert1 :: (Set s) => Proxy s -> Int -> s Int -> Bool
> prop_insert1 _ x t = member x (insert x t)
We also want the new set to contain all of the original elements.
> prop_insert2 :: (Set s) => Proxy s -> Int -> s Int -> Bool
> prop_insert2 _ x t = all (\y -> member y t') (elements t) where
>   t' = insert x t
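The properties can be run against a particular implementation by passing its proxy. The block below is a self-contained sketch for the list instance. It repeats the definitions from the text, writes the kind as `Type -> Type` (from `Data.Kind`) instead of `* -> *`, and adds a hypothetical helper `checkListSet` that is not part of the original module. It assumes the QuickCheck package is installed.

```haskell
{-# LANGUAGE KindSignatures, ScopedTypeVariables #-}
import Data.Kind (Type)
import Data.List (nub, sort)
import Test.QuickCheck hiding (elements)

class Set s where
  empty    :: s a
  member   :: Ord a => a -> s a -> Bool
  insert   :: Ord a => a -> s a -> s a
  elements :: Ord a => s a -> [a]

instance Set [] where
  empty    = []
  member   = elem
  insert   = (:)
  elements = sort . nub

-- A value-free tag that selects which implementation to test.
data Proxy (s :: Type -> Type) = Proxy

list :: Proxy []
list = Proxy

prop_empty :: forall s. Set s => Proxy s -> Bool
prop_empty _ = null (elements (empty :: s Int))

prop_elements :: Set s => Proxy s -> s Int -> Bool
prop_elements _ x = elements x == nub (sort (elements x)) &&
                    all (\y -> member y x) (elements x)

prop_insert1 :: Set s => Proxy s -> Int -> s Int -> Bool
prop_insert1 _ x t = member x (insert x t)

prop_insert2 :: Set s => Proxy s -> Int -> s Int -> Bool
prop_insert2 _ x t = all (\y -> member y t') (elements t)
  where t' = insert x t

-- Hypothetical helper: run every property for the list
-- implementation and report whether all of them passed.
checkListSet :: IO Bool
checkListSet = do
  r0 <- quickCheckResult (prop_empty list)
  r1 <- quickCheckResult (prop_elements list)
  r2 <- quickCheckResult (prop_insert1 list)
  r3 <- quickCheckResult (prop_insert2 list)
  pure (all isSuccess [r0, r1, r2, r3])
```

In GHCi, a single property can also be checked directly, for example with `quickCheck (prop_insert1 list)`. Testing another implementation only requires a `Set` instance, an `Arbitrary` instance, and a proxy for it.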
Binary Search Trees
===================
See [Binary Search Trees](BST.html) and their implementation [BST.lhs](BST.lhs).
Balanced Trees
==============
* If our sets grow large, we may find that the simple binary tree
implementation is not fast enough: in the worst case, each insert or
member operation may take O(n) time!
* We can do much better by keeping the trees balanced.
* There are many ways of doing this. Let's look at one fairly simple
(but still very fast) one that you have probably seen before in an
imperative setting: *red-black trees*.
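The worst case mentioned above is easy to reproduce. The sketch below uses a hypothetical unbalanced tree (`Tree`, `insertT`, `height`, and `fromListT` are illustrative names, not the definitions in BST.lhs) and compares the height obtained from the same seven keys inserted in two different orders.

```haskell
import Data.List (foldl')

-- A plain binary search tree with no rebalancing (illustrative only).
data Tree a = Leaf | Node (Tree a) a (Tree a)

insertT :: Ord a => a -> Tree a -> Tree a
insertT x Leaf = Node Leaf x Leaf
insertT x t@(Node l y r)
  | x < y     = Node (insertT x l) y r
  | x > y     = Node l y (insertT x r)
  | otherwise = t

-- Number of nodes on the longest path from the root to a leaf.
height :: Tree a -> Int
height Leaf         = 0
height (Node l _ r) = 1 + max (height l) (height r)

-- Insert the keys one at a time, from left to right.
fromListT :: Ord a => [a] -> Tree a
fromListT = foldl' (flip insertT) Leaf

-- Sorted input: every key goes to the right, so the tree is a chain.
sortedHeight :: Int
sortedHeight = height (fromListT [1 .. 7 :: Int])

-- The same keys in a favourable order give a perfectly balanced tree.
mixedHeight :: Int
mixedHeight = height (fromListT [4, 2, 6, 1, 3, 5, 7 :: Int])
```

With sorted input the height equals the number of keys (7 here), while the favourable order gives height 3. A balanced tree keeps the height logarithmic in the number of keys regardless of insertion order.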
Red-Black Trees
----------------
See [Red Black Trees](RedBlack.html) and their
implementation [RedBlack.lhs](RedBlack.lhs).