Type-based Reflection
=====================
> {-# LANGUAGE GADTs, FlexibleInstances,
> UndecidableInstances, ScopedTypeVariables, TemplateHaskell,
> DeriveDataTypeable #-}
> module TDP where
> import Data.Bits
> import Test.QuickCheck
> import Types
> import Data.Typeable
> import State
> import Control.Monad
Some functions work based on the *structure* of types. This is a form
of "reflection", where metadata about programs (i.e. types) is encoded
as program data that can be examined at runtime.
You've already seen examples of such functions. For example,
structural equality, written `==` in Haskell, is one such
function. Haskell can derive instances of equality for standard
datatypes, and it does it by looking at the structure of the type.
However, equality does not need to be built in. You could define how
to derive it yourself. Suppose we implemented our own version of
structural equality.
> class Teq a where
> eq :: a -> a -> Bool
Then given any built in type.
> data Tree a = E | N (Tree a) a (Tree a)
Then we should be able to use equality for this type.
> t1 = eq (N E 'a' E) (N E 'b' E) -- False
> t2 = eq (N E 'b' E) (N E 'b' E) -- True
> t3 = eq (N E 'a' E) E -- False
> t4 = eq (N (N E 'a' E) 'b' (N E 'c' (N (N E 'd' E) 'e' (N E 'f' E))))
> (N (N E 'a' E) 'b' (N E 'c' (N (N E 'd' E) 'e' (N E 'f' E))))
> -- True
> t5 = eq (N (N E 'a' E) 'b' (N E 'c' (N (N E 'd' E) 'e' (N E 'f' E))))
> (N (N E 'a' E) 'b' (N E 'c' (N (N E 'g' E) 'e' (N E 'f' E))))
> -- False
The secret trick is to construct a *representation of the type*. We'll
use Template Haskell to do that automatically. The TH code only looks
at the declaration of the tree type---it does not know anything about
how equality works.
> represent ''Tree
Once we create the metadata, then we can define a *generic* equality
function that uses that metadata for structural operations.
But what does that meta data look like in the first place?
Reifying types
--------------
We use a GADT to reify types. This datatype includes data
constructors for the base types, such as `TInt`, `TChar` and `TUnit`,
as well as compound combinators for product types and `Either` types.
It's defined in the [Types](Types.hs) module, so we only summarize it
here.
~~~~~~~~~~~~~~~~~{.haskell}
data Type a where
TInt :: Type Int
TChar :: Type Char
TUnit :: Type ()
TProd :: Type a -> Type b -> Type (a,b)
TEither :: Type a -> Type b -> Type (Either a b)
TIso :: Iso a b -> Type b -> Type a
~~~~~~~~~~~~~~~~~~
For example, the type (Int, Char) can be represented as
> x :: Type (Int, Char)
> x = TProd TInt TChar
The last data constructor `TIso` is used to represent types that are
*isomorphic* to representable types. We don't want to have to extend
this datatype with a new data constructor every time we create a new
datatype (and, even worse extend all functions that work over this
datatype.) Instead, to represent any other type, we construct an
isomorphism between it and a type that we already know how to handle.
An isomorphism is just a way to map values of one type to another and
back.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.haskell}
data Iso a b = Iso { to :: a -> b, from :: b -> a }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
An important property about an isomorphism is that the 'from' and
'too' functions are inverses of eachother.
> prop_Iso1 :: Eq a => Iso a b -> a -> Bool
> prop_Iso1 iso x = from iso (to iso x) == x
> prop_Iso2 :: Eq b => Iso a b -> b -> Bool
> prop_Iso2 iso x = to iso (from iso x) == x
For example, we can construct an isomorphism between the type `Bool`
and the type `Either () ()`:
> boolIso :: Iso Bool (Either () ())
> boolIso = Iso t f where
> t :: Bool -> Either () ()
> t True = Right ()
> t False = Left ()
> f :: Either () () -> Bool
> f (Right ()) = True
> f (Left ()) = False
Don't forget to check these properties!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.haskell}
Main*> quickCheck $ prop_Iso1 boolIso
Main*> quickCheck $ prop_Iso2 boolIso
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This isomorphism allows us to represent the structure of
the boolean type.
> tBool :: Type Bool
> tBool = TIso boolIso (TEither TUnit TUnit)
Datatype-Generic Equality
-------------------------
An example of a structure-defined operation is equality.
Instead of using `derive Eq`, we can use the structure of the type
to determine how the equality function should behave.
> teq :: Type a -> a -> a -> Bool
> teq TInt x y = x == y
> teq TChar x y = x == y
> teq TUnit x y = x == y
> teq (TProd t1 t2) (x1,x2) (y1,y2) = teq t1 x1 y1 && teq t2 x2 y2
> teq (TEither t1 t2) (Left x) (Left y) = teq t1 x y
> teq (TEither t1 t2) (Right x) (Right y) = teq t2 x y
> teq (TEither t1 t2) _ _ = False
> teq (TIso iso t) x y = teq t (to iso x) (to iso y)
> -- tBool :: Type Bool
> -- tBool = TIso (Iso id id) tBool
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.haskell}
*Main> teq (TProd tBool tBool) (True, False) (True, False)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Whenever we want to use this function, we have to first figure out the appropriate meta-information to supply. For example,
> example =
> teq tBool True False &&
> teq (TProd TInt (TProd TChar tBool))
> (3, ('a', True)) (4, ('a', True))
However, Haskell can automatically determine this information from the
type of the second and third arguments (which it infers.) Therefore
we can use a type class to store and propagate reflected type
information.
~~~~~~~~~~~~~~~~~~~{.haskell}
class Rep a where
rep :: Type a
instance Rep Int where rep = TInt
instance Rep Char where rep = TChar
instance Rep () where rep = TUnit
instance (Rep a, Rep b) =>
Rep (Either a b) where rep = TEither rep rep
instance (Rep a, Rep b) =>
Rep (a, b) where rep = TProd rep rep
~~~~~~~~~~~~~~~~~~~
For every 'TIso', we'll need to add a new instance to this
type class. This instance is just the type representation.
> instance Rep Bool where rep = tBool
Now, we turn all representable types into equality types by supplying
the represention to the type-directed equality function.
> instance (Rep a) => Teq a where
> -- eq :: a -> a -> Bool
> eq x y = teq rep x y
> example_fixed =
> eq True False &&
> eq ((3::Int), ('a', True)) (4, ('a', True))
A Pause to Reflect
------------------
Now we get to two important questions:
1) What types *can* we reify the structure of?
2) What functions can we write using type-structure?
Representing the structure of types
-----------------------------------
Ok, so what types *can* we reify the structure of? What about
parameterized types, like 'Maybe'?
~~~~~~~~~~~~~{.haskell}
data Maybe a = Nothing | Just a
~~~~~~~~~~~~~
Here is another example of a type isomorphism:
> maybeIso :: Iso (Maybe a) (Either () a)
> maybeIso = Iso t f where
> t :: Maybe a -> Either () a
> t Nothing = Left ()
> t (Just x) = Right x
> f :: Either () a -> Maybe a
> f (Left ()) = Nothing
> f (Right x) = Just x
Note that even though `Maybe` takes a type argument we can still
create a representation of it. All we need is to supply a
representation of that argument. To make our job easier, we can get
that representation from the type class.
> tMaybe :: Rep a => Type (Maybe a)
> tMaybe = TIso maybeIso rep
Don't forget to add this meta-information to the Rep class.
> instance Rep a => Rep (Maybe a) where rep = tMaybe
> example_maybe =
> eq (Just True) (Just True)
Now, it turns out that we can reify almost *any* datatype using type
isomorphisms. It all boils down to viewing datatypes as
"sums-of-products".
In other words, whenever we see a datatype definition, such as
> data A = B | C Char | D Int Bool Char
A ~= Either () (Either Char (Prod (Prod Int Bool) Char))
We think of this type as isomorphic to (where a + b is
shorthand for Either a b)
(() + Char) + (Int, (Bool, Char))
Note that we use nested products for data constructors that take more
than one argument, nested Eithers for data types with more than two
constructors and `()` for data constructors that do not take *any*
arguments.
> aIso :: Iso A (Either (Either () Char) (Int, (Bool, Char)))
> aIso = Iso t f where
> t B = Left (Left ())
> t (C c) = Left (Right c)
> t (D i b c) = Right (i,(b,c))
> f (Left (Left ())) = B
> f (Left (Right c)) = C c
> f (Right (i,(b,c))) = D i b c
> aType :: Type A
> aType = TIso aIso rep
What about recursive types? We can also represent their metadata
using *recursive values*, taking advantage of the fact that Haskell
is a lazy language.
For example, we can represent lists:
Bool ~= Bool
List a ~= Either () (a, List a)
> instance Rep a => Rep [a] where rep = tList
> tList :: Rep a => Type [a]
> tList = TIso (Iso t f) rep where
> t :: [a] -> Either () (a, [ a ])
> t [] = Left ()
> t (x :xs) = Right (x , xs)
> f :: Either () (a, [a]) -> [ a ]
> f (Left ()) = []
> f (Right (x,xs)) = (x : xs)
And trees. In fact, the TH code in the module [Types](Types.hs)
automatically generates declarations such as these.
More operations
---------------
What other functions operate based on the structure of types? What
about a *binary* version of the read and show functions.
> data Bit = O | I deriving (Eq, Show)
> class Rep a => Marshall a where
> marshall :: a -> [ Bit ]
> unmarshall :: [ Bit ] -> a
> prop_m1 :: AnyGType -> Property
> prop_m1 (AnyG (t1 :: Type a)) = forAll arbitrary $ \ x ->
> tunmarshall t1 (tmarshall t1 x) == x
First, some helper functions for the base types:
> intSize :: Int
> intSize = bitSize (0 :: Int)
> serializeInt :: Int -> Int -> [ Bit ]
> serializeInt 0 x = []
> serializeInt n x =
> (if x `mod` 2 == 0 then O else I ) :
> serializeInt (n - 1) (x `div` 2)
> marshallInt :: Int -> Int -> [ Bit ] -> [ Bit ]
> marshallInt n x bs = serializeInt n x ++ bs
> unmarshallInt :: Int -> State [ Bit ] Int
> unmarshallInt 0 = return 0
> unmarshallInt n = do
> bs' <- get
> case bs' of
> (O : bs) ->
> put bs >> liftM (* 2) (unmarshallInt (n - 1))
> (I : bs) ->
> put bs >> unmarshallInt (n - 1) >>= \x -> return (x * 2 + 1)
> [] -> error "read error"
>
> prop_Int i = j == i where
> (j, _) = runState (unmarshallInt intSize)
> (marshallInt intSize i [])
Now we can define the general structure-based operations:
> instance Rep a => Marshall a where
> marshall = tmarshall rep
> unmarshall = tunmarshall rep
> tmarshall :: Type a -> a -> [Bit]
> tmarshall t x = aux t x [] where
> aux :: Type a -> a -> [Bit] -> [Bit]
> aux = undefined
> tunmarshall :: Type a -> [ Bit ] -> a
> tunmarshall t bs = fst (runState (aux t) bs) where
> aux :: Type a -> State [Bit] a
> aux = undefined
-----------------------------------------------------
Below, some helper definitions for testing. Arbitrary
type information generators....
> data AnyType = forall a. Any (Type a)
> data AnyGType = forall a.
> (Show a, Eq a, Arbitrary a) => AnyG (Type a)
> instance Show AnyGType where
> show (AnyG t) = show (Any t)
> instance Show AnyType where
> show (Any TInt) = "TInt"
> show (Any TChar) = "TChar"
> show (Any TUnit) = "TUnit"
> show (Any (TEither t1 t2)) = "(TEither " ++ show (Any t1) ++ " "
> ++ show (Any t2) ++ ")"
> show (Any (TProd t1 t2)) = "(TProd " ++ show (Any t1) ++ " "
> ++ show (Any t2) ++ ")"
> instance Arbitrary Bit where
> arbitrary = elements [ O, I ]
> instance Arbitrary AnyGType where
> arbitrary = sized anyn where
> base = elements [ AnyG TInt, AnyG TChar, AnyG TUnit ]
> anyn 0 = base
> anyn n = oneof [ base,
> do a1 <- anyn (n `div` 2)
> a2 <- anyn (n `div` 2)
> case (a1,a2) of
> (AnyG t1, AnyG t2) ->
> return $ AnyG (TProd t1 t2),
> do a1 <- anyn (n `div` 2)
> a2 <- anyn (n `div` 2)
> case (a1,a2) of
> (AnyG t1, AnyG t2) ->
> return $ AnyG (TEither t1 t2) ]