> {-# LANGUAGE GADTs, FlexibleInstances, UndecidableInstances,
> ScopedTypeVariables, DeriveDataTypeable, TemplateHaskell, StandaloneDeriving,
> RankNTypes #-}
> module TDP2 where
> import Data.Char
> import Types2
> import Test.QuickCheck
> import Data.Maybe
> import Control.Monad
> import Data.Typeable
More Generic Programming
------------------------
A challenge: can we use the structures from last time to automatically
find all of the immediate subtrees of a datatype.
> class Rep a => Children a where
> children :: a -> [a]
For example, given this datatype
> data Tree a = E | N (Tree a) a (Tree a)
> deriving (Eq, Show)
> represent ''Tree
the application
> example_children = children (N (N E 'a' E ) 'b' (N E 'c' E))
should return [N E 'a' E , N E 'c' E].
This function is similar to the 'shrink' operation from
quickcheck.
> instance Rep a => Children a where
> children x = case rep of
> t0@(TIso iso tb) -> aux tb (to iso x)
> _ -> []
> where aux :: Type b -> b -> [a]
> aux (TEither t1 t2) (Left x) = aux t1 x
> aux (TEither t1 t2) (Right x) = aux t2 x
> aux (TProd t1 t2) (x1, x2) = aux t1 x1 ++ aux t2 x2
> aux t1@(TIso iso tb) x = case cast x of
> Just y -> [y]
> Nothing -> aux tb (to iso x)
> aux _ x = [] -- base types
Adding nominal type information
-------------------------------
The problem with the previous version is that we didn't store any
information about a type in `TIso` other than its structure. Some
type-directed operations need a little more than that.
Here, we will show what can be done if we add some sort of 'nominal
identity' to the structural information.
Data.Typeable and Type Dynamic
-------------------------------
GHC already supports 'nominal' type metadata with an extension called
the 'Typeable' class. We get access to this class from the module
import Data.Typeable
This typeclass contains a runtime "hash" of the name of a type. An
important thing we can do with this hash is compare it with other
hashes and see if they are the same. More importantly, if they are the
same, we should be able to convert the types of expressions from one
version to another.
~~~~~~~~~~~~~~~{.haskell}
cast :: (Typeable a, Typeable b) => a -> Maybe b
~~~~~~~~~~~~~~~
> data Dynamic = forall a. (Typeable a, Show a) => ToDyn a
> fromDyn :: Typeable b => Dynamic -> Maybe b
> fromDyn (ToDyn x) = cast x
> instance Show Dynamic where
> show (ToDyn d) = "(ToDyn " ++ show d ++ ")"
> -- a heterogenous list of values
> dynlist = [ ToDyn (0 :: Int) , ToDyn True, ToDyn "ABC", ToDyn [1::Int, 2, 3], ToDyn (4 :: Int) ]
> -- increment the value if it happens to be a number
> incr :: Dynamic -> Dynamic
> incr d = case fromDyn d of
> Just x -> ToDyn ((x :: Int) + 1)
> Nothing -> d
Adding type tags to type representations
----------------------------------------
The main change that we make is to add this hash to the
`TIso` data constructor.
At the same time, we'll use Typeable to coalesce the three base types
`TInt`, `TChar` and `TUnit`, into a single data constructor
`TBase`. All we care about these types is their name, they have no
structure.
~~~~~~~~~~~~~~~~~~~~~~~~~~~{.Haskell}
data Type a where
TBase :: Typeable a => Type a
TProd :: Type a -> Type b -> Type (a,b)
TEither :: Type a -> Type b -> Type (Either a b)
TIso :: Typeable a => Iso a b -> Type b -> Type a
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Another change is to make sure that whenever we have the structural
information about a type, we also have its hash as well.
~~~~~~~~~~~~~~~~~~~~~~~{.haskell}
class Typeable a => Rep a where
rep :: Type a
~~~~~~~~~~~~~~~~~~~~~~~
Finally, when we define new datatypes, we need to make sure that this
hash is available. We can do that easily with the 'deriveDataTypeable'
GHC extension. The 'StandaloneDeriving' extension lets us derive this
instance separate from the datatype declaration.
> deriving instance Typeable1 Tree
We also make sure that we have representations of all our old friends.
> represent ''Bool
> represent ''Maybe
> represent ''[]
Here is another fun datatype for experimentation: trees with arbitrary numbers of children at each node.
> data RoseTree a = Node a [ RoseTree a ]
> deriving (Eq, Show, Typeable)
> represent ''RoseTree
> roseTree :: RoseTree Int
> roseTree = (Node 1 [ Node 2 [ Node 5 [ Node 6 [], Node 7 [] ] ], Node 3 [], Node 4 [] ])
~~~~~~~~~~~~
Main*> children roseTree
~~~~~~~~~~~~
More examples of Generic Programming
------------------------------------
Now suppose you wanted to find *all* subtrees of a given tree, not
just the immediate predecessors.
> class Rep a => Universe a where
> universe :: a -> [a]
>
> instance Rep a => Universe a where
> universe x = case rep of
> (TIso iso tb) -> x : aux tb (to iso x)
> _ -> []
> where aux :: Type b -> b -> [a]
> aux (TEither t1 t2) (Left x) = aux t1 x
> aux (TEither t1 t2) (Right x) = aux t2 x
> aux (TProd t1 t2) (x1, x2) = aux t1 x1 ++ aux t2 x2
> aux (TIso iso tb) x = case cast x of
> Just y -> y : aux tb (to iso x)
> Nothing -> aux tb (to iso x)
> aux _ x = []
> example_universe = universe (N (N E 'a' E ) 'b' (N E 'c' E))
[N (N E 'a' E ) 'b' (N E 'c' E), (N E 'a' E ), E , E, (N E 'c' E), E , E ]
> example_u_rost = universe (Node 'a' [Node 'b' [], Node 'c' [],
> Node 'd' [Node 'e' [], Node 'f' []]])
It turns out that functions like these are fairly powerful for tree
manipulations.
Let's go back to our old friends from the 'While' programming
language:
> data Bop = Plus | Times | Minus | Div
> deriving (Eq,Show,Typeable)
> represent ''Bop
> data Expr =
> Var String
> | Val Int
> | Op Bop Expr Expr
> deriving (Eq,Show,Typeable)
> represent ''Expr
Now suppose you wanted to gather of the variables in an expression?
> vars :: Expr -> [ String ]
> vars x = [ s | Var s <- universe x ]
> example_vars =
> vars (Op Minus (Op Plus (Var "X") (Var "Y")) (Var "Z"))
Or suppose you want to find out if there are any lowercase variables
> lowerCaseVars :: Expr -> Bool
> lowerCaseVars x = any (isLower . head) [ v | (Var v) <- universe x ]
Or find out the number of times that a division by zero occurs in a
program?
countDivZero x = length [ () | Op Div _ (IntVal 0) <- universe x ]
> countDivZero :: Expr -> Int
> countDivZero x = length [ () |
> Op Div _ (Val 0) <- universe x ]
We can also define a generic transformation function:
> class Rep a => Transform a where
> transform :: (a -> a) -> a -> a
> instance Rep a => Transform a where
> transform f x = f (case rep of
> (TIso iso tb) -> from iso (aux tb (to iso x))
> _ -> x)
> where
> aux :: Type b -> b -> b
> aux (TEither t1 t2) (Left x) = Left (aux t1 x)
> aux (TEither t1 t2) (Right x) = Right (aux t2 x)
> aux (TProd t1 t2) (x1, x2) = (aux t1 x1,aux t2 x2)
> aux (TIso iso tb) x = apply f (from iso (aux tb (to iso x)))
> aux _ x = x
> apply :: (Typeable a, Typeable b) => (a -> a) -> b -> b
> apply f x = case (cast f) of
> Just f' -> f' x
> Nothing -> x
Now let's do some optimization!
> constVal :: Expr -> Expr
> constVal (Op Plus (Val i) (Val j)) = Val (i + j)
> constVal (Op Times (Val i) (Val j)) = Val (i * j)
> constVal (Op Minus (Val i) (Val j)) = Val (i - j)
> constVal (Op Div (Val i) (Val j)) | j /= 0 = Val (i `div` j)
> constVal e = e
> t1 = transform constVal (Op Plus (Var "X") (Op Times (Val 2) (Val 3)))
Note that this transformation is not the same as evaluation. Variables
are left alone.
> t2 = transform constVal (Op Plus (Var "x")
> (Op Times (Val 2) (Val 3)))
> plusZero (Op Plus e (Val 0)) = e
> plusZero (Op Plus (Val 0) e) = e
> plusZero e = e
> timesZero (Op Times e (Val 0)) = Val 0
> timesZero (Op Times (Val 0) e) = Val 0
> timesZero e = e
> timesOne (Op Times e (Val 1)) = e
> timesOne (Op Times (Val 1) e) = e
> timesOne e = e
We can string these transformations together.
> transformExpr = transform timesZero .
> transform timesOne .
> transform constVal . transform plusZero
> t3 = transformExpr (Op Plus (Var "x") (Op Times (Var "y") (Val 0)))
One snag, what if we want to transform expressions that occur
*inside* statements!?!
> data Statement =
> Assign String Expr
> | While Expr Statement
> | If Expr Statement Statement
> | Sequence Statement Statement
> | Skip
> deriving (Eq, Show, Typeable)
> represent ''Statement
Our transformer only lets us apply rewrites when the types match
up. To do better, we need to generalize the type of transform a
bit. With this extension, we can make the type of the function that we
are using for the transformation different.
> class Rep a => GTransform a where
> gtransform :: Typeable b => (b -> b) -> a -> a
> instance Rep a => GTransform a where
> gtransform f x = apply f (aux rep x) where
> aux :: Type b -> b -> b
> aux (TEither t1 t2) (Left x1) = Left (aux t1 x1)
> aux (TEither t1 t2) (Right x2) = Right (aux t2 x2)
> aux (TProd t1 t2) (x1,x2) = (aux t1 x1, aux t2 x2)
> aux (TIso iso tb) x =
> apply f (from iso (aux tb (to iso x)))
> aux _ x = x
> transformStmt :: Statement -> Statement
> transformStmt = gtransform (plusZero . timesZero)
> t4 = transformStmt (Assign "x"
> (Op Plus (Var "x") (Op Times (Var "y") (Val 0))))
Summary
-------
Steps for type-directed programming
1. When defining new types, such as datatypes or newtype, also define
the metainformation for that type: instances of the Typeable and Rep
type classes. Template Haskell helps here.
2. Type-generic operations should be defined in two steps.
a) First, define the structure-based version:
i.e. tfoo :: Type a -> ...
b) Then, use a type class and a default instance
to automatically provide the metainformation.
class Rep a => Foo a where foo :: ...
instance Rep a => Foo a where foo = tfoo rep
3. Profit! Now any represented type is automatically a member of the
new type class, and can automatically use the new operation.
Additional details
------------------
There are many more tree operations available---see libraries such as
"SYB" (Scrap your boilerplate) or "Uniplate" or "GHC.Generics" for
*many* more examples. The mechanisms for those libraries are quite
different than what is presented above---we have changed things around
to make the structural and nominal parts of the type metadata fit
together.
The process of creating structural metadata is also tedious, and can
be mechanized. Some generic programming libraries do so using
source-to-source translators (Uniplate), or GHCs metaprogramming
facility, called TemplateHaskell (RepLib), or through extensions to
GHC (SYB, GHC.Generics).
There examples presented above are still missing one important
capability---there is no way to combine specific functionality for a
type with generic functionality. For example, if you implement a 'Set'
datastructure using lists, there is no way to use generic equality
with sets.
References:
* Scrap your boilerplate: a practical approach to generic programming
Ralf Laemmel and Simon Peyton Jones, TLDI 2003
* RepLib: A Library for Derivable Type Classes
Stephanie Weirich. Haskell Workshop, Portland, OR, USA, September 2006
* Uniplate - Uniform Boilerplate and List Processing
Neil Mitchell and Colin Runciman, Haskell Workshop 2007
* A Generic Deriving Mechanism for Haskell. Jose Pedro Magalhaes,
Atze Dijkstra, Johan Jeuring, and Andres Loh, Haskell Symposium 2010.
See also
[GHC.Generics](http://www.haskell.org/haskellwiki/GHC.Generics)
General information about generic programming in Haskell:
* Comparing Libraries for Generic Programming in Haskell Alexey
Rodriguez Yakushev, Johan Jeuring, Patrik Jansson, Alex Gerdes, Oleg
Kiselyov, Bruno C. d. S. Oliviera. Haskell Symposium 2008.