Red-Black Trees
================
> module RedBlack where
> import Persistent
> import Control.Monad
> import Test.QuickCheck
> import Data.Maybe as Maybe
* A red-black tree is a binary search tree where every node is
additionally marked with a color (red or black) and in which the
following invariants are maintained:
1. The empty nodes at the leaves are black
2. The root is always black
3. From each node, every path to a leaf
has the same number of black nodes
4. Red nodes have black children
The first invariant will be true by definition, the others we will
have to maintain as we implement the tree.
Invariants
----------
* Together, these invariants imply that every red-black tree is
"approximately balanced", in the sense that the longest path to an
empty node is no more than twice the length of the shortest.
* From this, it follows that all operations will run in O(log_2 n)
time.
Implementation
--------------
The data declaration is a straightforward modification of the one for
unbalanced trees:
> data Color = R | B deriving (Eq, Show)
> data RBT a = E | N Color (RBT a) a (RBT a)
> deriving Show
For any red/black tree, we can determine its color
> color :: RBT a -> Color
> color E = B
> color (N c _ _ _) = c
We then just need to implement the method of the
Set class for this data structure.
> instance Set RBT where
The empty tree is the same as before.
> -- empty :: RBT a
> empty = E
Membership testing and listing the elements
requires just a trivial change.
> -- member :: Ord a => a -> RBT a -> Bool
> member x E = False
> member x (N _ a y b)
> | x < y = member x a
> | x > y = member x b
> | otherwise = True
> -- elements :: Ord a => RBT a -> [a]
> elements t = aux t [] where
> aux E xs = xs
> aux (N _ a y b) xs = aux a (y : aux b xs)
Insertion, is, of course a bit trickier.
> insert = bst_insert
Insertion is implemented in terms of a recursive function
ins, which walks down the tree until...
> ins :: Ord a => a -> RBT a -> RBT a
... it gets to an empty leaf node, in which case
it constructs a new (red) node containing the
value being inserted...
> ins x E = undefined
... or discovers that the value being inserted is
already in the tree, in which case it returns
the input unchanged:
> ins x s@(N c a y b)
> | x < y = undefined
> | x > y = undefined
> | otherwise = s
In the recursive calls, before returning the new tree, however, we may
need to *rebalance* to maintain the red-black invariants. The code to
do this is encapsulated in a helper function `balance`.
One final detail
----------------
We finish off the implementation by blackening the top node of the
tree to make sure that invariant (2) is always satisfied.
> bst_insert :: Ord a => a -> RBT a -> RBT a
> bst_insert x t = blacken (ins x t) where
> blacken = undefined
Balancing
---------
* The key insight in writing the balancing function is that we do not
try to rebalance as soon as we see a red node with a red
child. Instead, we return this tree as-is and only rebalance after
we've inserted the new red node between the black parent and the red
child.
* i.e., the job of the balance function is to rebalance trees with a
black-red-red path starting at the root.
* The result of rebalancing maintains the black height by converting
to a red parent with multiple black children.
* Since the root has two children and four grandchildren, there are
four ways in which such a path can happen.
[ pictures on the chalkboard ]
> balance :: Ord a => RBT a -> RBT a
> balance (N B (N R (N R a x b) y c) z d) = undefined
> balance (N B (N R a x (N R b y c)) z d) = undefined
> balance (N B a x (N R (N R b y c) z d)) = undefined
> balance (N B a x (N R b y (N R c z d))) = undefined
> balance t = t
Making Sure Invariants are Preserved
------------------------------------
We'll use quickcheck to test our tree implementation.
> instance (Ord a, Arbitrary a) => Arbitrary (RBT a) where
> arbitrary = liftM (foldr insert empty) arbitrary
Now consider the binary search tree invariants:
1. The empty nodes at the leaves are black.
> prop_Rb1 :: Bool
> prop_Rb1 = color E == B
2. The root of the tree is Black.
> prop_Rb2 :: RBT Int -> Bool
> prop_Rb2 t = color t == B
3. For all nodes in the tree, all downward paths from the
node to a leaf contain the same number of Black nodes.
> prop_Rb3 :: RBT Int -> Bool
> prop_Rb3 t = Maybe.isJust (aux t) where
> aux E = Just 1
> aux (N c l _ r) = do
> bh1 <- aux l
> bh2 <- aux r
> guard (bh1 == bh2)
> return $ case c of { R -> bh1 ; B -> bh1 + 1 }
4. All children of red nodes are black.
> prop_Rb4 :: RBT Int -> Bool
> prop_Rb4 (N col l _ r) = prop_Rb4 l && prop_Rb4 r
> && if col == R
> then (color l == B && color r == B)
> else True
> prop_Rb4 E = True
And satisfies the binary search tree condition.
> prop_BST :: RBT Int -> Bool
> prop_BST E = True
> prop_BST (N _ l x r) =
> prop_BST l && prop_BST r &&
> maybe True (x >) (tmax l) &&
> maybe True (x <) (tmin r)
> maybes :: (a -> a -> a) -> Maybe a -> Maybe a -> Maybe a
> maybes f (Just x) (Just y) = Just (f x y)
> maybes _ Nothing m = m
> maybes _ m Nothing = m
> tmin :: RBT Int -> Maybe Int
> tmin E = Nothing
> tmin (N _ l x r) =
> maybes min (tmin l) (tmin r)
> tmax :: RBT Int -> Maybe Int
> tmax E = Nothing
> tmax (N _ l x r) =
> maybes max (tmax l) (tmax r)
> rbt :: Proxy RBT
> rbt = undefined
> main :: IO ()
> main = do
> quickCheck prop_BST
> quickCheck $ prop_empty rbt
> quickCheck $ prop_insert1 rbt
> quickCheck $ prop_insert2 rbt
> quickCheck prop_Rb1
> quickCheck prop_Rb2
> quickCheck prop_Rb3
> quickCheck prop_Rb4
Notes
-----
[0] See also persistant [Java
implementation](http://wiki.edinburghhacklab.com/PersistentRedBlackTreeSet)
for comparison. Requires ~350 lines for the same implementation.
[1] Andrew Appel, ["Efficient Verified Red-Black Trees"](http://www.cs.princeton.edu/~appel/papers/redblack.pdf)
September 2011. Presents a Coq implementation of
a verified Red Black Tree.
[2] Matt Might has a blog post on the [RBT deletion operation in
Racket](http://matt.might.net/articles/red-black-delete/)