# Functors and Monads

```
-- Advanced Programming, HW 3
-- by <YOUR NAME HERE> <pennid>
-- (and <YOUR PARTNERS NAME> <pennid>)
```

```
{-# OPTIONS -Wall -fno-warn-unused-binds -fno-warn-unused-matches -fwarn-tabs -fno-warn-type-defaults #-}
{-# LANGUAGE NoImplicitPrelude, FlexibleInstances #-}
```

To complete this homework, download the nonliterate version of the file and answer each question, filling in code where noted. Your code must typecheck against the given type signatures. Also remember to add your own tests to this file to exercise the functions you write, and make your own 'main' function to run those tests.

`module Main where`

```
import Prelude hiding (mapM,any,all,filter)
import Data.Map (Map)
import qualified Data.Map as Map
import Control.Monad.State hiding (when, mapM, foldM, (>=>), sequence, join)
import Test.HUnit hiding (State)
```

```
main :: IO ()
main = return ()
```

# General Monadic Functions

This problem asks you to recreate some of the operations in the Control.Monad library. You should not use any of the functions defined in that library to solve this problem, though you can play with those functions to develop test cases.

`-- Problem 0`

`-- (a) Define a monadic generalization of map `

```
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
mapM = error "mapM: unimplemented"
```

`-- (b) Define a monadic generalization of foldl`

```
foldM :: Monad m => (a -> b -> m a) -> a -> [b] -> m a
foldM = error "foldM: unimplemented"
```

`-- (c) `

```
-- | Evaluate each action in the sequence from left to right,
-- and collect the results.
sequence :: Monad m => [m a] -> m [a]
sequence = error "sequence: unimplemented"
```

`-- (d)`

```
-- | The fish operator, a form of composition
(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
(>=>) = error ">=>: unimplemented"
```

For more information about this operator, see the explanation at the bottom of this page.

`-- (e)`

```
-- | The 'join' function removes one level of monadic structure,
-- projecting its bound argument into the outer level.
join :: (Monad m) => m (m a) -> m a
join = error "join: unimplemented"
```

# Append Lists

`-- Problem 1`

The following parameterized data structure, called an append list

`data AList a = Nil | Single a | Append (AList a) (AList a) deriving (Show, Eq)`

is a list optimized for concatenation. Indeed, two such lists can be appended in constant time, using the `Append`

constructor.

`-- (a)`

Like lists, this type can be made an instance of the `Functor`

and `Monad`

type classes.

```
instance Functor AList where
fmap _ _ = error "AList fmap: unimplemented"
```

```
instance Monad AList where
return = error "AList return: unimplemented"
_ >>= _ = error "AList bind: unimplemented"
```

However, how do you know that your Functor and Monad instances are correct? Type classes often come with *laws* that govern their correct usage. For example, all implementations of (==) should be reflexive, symmetric and transitive. Instances that do not follow these laws are confusing and unpredictable, leading to buggy programs.

Functor instances should satisfy the two *laws* shown below.

The first law states that mapping the identity function shouldn't do anything. [We're deliberately breaking the CamelCase naming convention here to emphasize that these definitions are *properties*.]

```
prop_FMapId :: (Eq (f a), Functor f) => f a -> Bool
prop_FMapId x = fmap id x == id x
```

The second law allows us to combine two passes with fmap into a single one using function composition.

```
prop_FMapComp :: (Eq (f c), Functor f) => (b -> c) -> (a -> b) -> f a -> Bool
prop_FMapComp f g x =
fmap (f . g) x == (fmap f . fmap g) x
```

Likewise, monad instances should satisfy the *three* monad laws, expressed below. These laws are stated generally, for any monad.

```
prop_LeftUnit :: (Eq (m b), Monad m) => a -> (a -> m b) -> Bool
prop_LeftUnit x f =
(return x >>= f) == f x
```

```
prop_RightUnit :: (Eq (m b), Monad m) => m b -> Bool
prop_RightUnit m =
(m >>= return) == m
```

```
prop_Assoc :: (Eq (m c), Monad m) =>
m a -> (a -> m b) -> (b -> m c) -> Bool
prop_Assoc m f g =
((m >>= f) >>= g) == (m >>= \x -> f x >>= g)
```

Finally, types that are instances of both Functor and Monad should satisfy this law:

```
prop_FunctorMonad :: (Eq (m b), Functor m, Monad m) => m a -> (a -> b) -> Bool
prop_FunctorMonad x f = (fmap f x) == (x >>= return . f)
```

Define test cases for `AList`

that demonstrates that your instances above satisfy the functor and monad laws. (You may find it useful to use HUnit's `(~?)`

operator to coerce a boolean computation into a test case.)

```
testAListFunctor :: Test
testAListFunctor = TestList [
]
```

```
testAListMonad :: Test
testAListMonad = TestList [
]
```

```
testAListFunctorMonad :: Test
testAListFunctorMonad = TestList [
]
```

`-- (b)`

Now think about instances of `Functor`

and `Monad`

for `AList`

that do not satisfy the laws above. For example, if we merely left all of the methods undefined, then the test cases would fail.

```
{- Invalid instance of Functor and Monad:
instance Functor AList where
fmap f s = undefined
instance Monad AList where
return = undefined
(>>=) = undefined
-}
```

Are there other invalid instances? Add at least one instance below (in comments) that does *not* use `undefined`

or `error`

, and does not include an infinite loop. Your instance(s) should type check, but should fail at least one test above. Which properties are violated?

`-- (c)`

Append lists trade constant time "head" access for constant time concatenation. Accessing the first element in an append list may take time linear to the length of the list. That is the tradeoff for this datatype.

Nevertheless, it is possible to write a concise function to find this first element by taking advantage of the fact that `Maybe`

is a member of the `MonadPlus`

type class. (Accessing the last element is similar.) Read this wikipage for more information and write the `first`

and `final`

operations as succinctly as you can.

```
-- | access the first element of the list, if there is one.
first :: AList a -> Maybe a
first = error "first: unimplemented"
```

```
-- | access the last element of the list, if there is one
final :: AList a -> Maybe a
final = error "final: unimplemented"
```

`-- (d)`

ALists are also (trivially) members of the MonadPlus type class.

```
instance MonadPlus AList where
mzero = Nil
mplus = Append
```

Now use the `MonadPlus`

type class to implement a general searching routine for append lists, called `search`

. This searching routine should be general enough so that it can return the first element that satisfies a given predicate (if one exists), all elements that satisfy the predicate, or just filter the append list.

`search = error "search: unimplemented"`

```
any :: (a -> Bool) -> AList a -> Maybe a
any = search
```

```
all :: (a -> Bool) -> AList a -> [a]
all = search
```

```
filter :: (a -> Bool) -> AList a -> AList a
filter = search
```

```
testSearch :: Test
testSearch = TestList [ "search1" ~: any (>3) seq1 ~=? Just 4,
"search2" ~: all (>3) seq1 ~=? [4,5],
"search3" ~: filter (>3) seq1 ~=?
Append (Append Nil Nil)
(Append (Single 4) (Single 5))]
```

```
seq1 :: AList Int
seq1 = Append (Append Nil (Single 3)) (Append (Single 4) (Single 5))
```

`-- (e) `

It is also possible to convert an append list into a normal list.

```
toList :: AList a -> [a]
toList Nil = []
toList (Single x) = [x]
toList (Append s1 s2) = toList s1 ++ toList s2
```

```
showAsList :: Show a => AList a -> String
showAsList = show . toList
```

However, the implementation of toList above is inefficient for left-biased append lists, such as this one.

```
exAList :: AList Int
exAList = Append (Append (Append (Append (Single 0)
(Single 1))
(Single 2))
(Single 3))
(Single 4)
```

In converting such append lists to lists, the append operation will need to traverse the intermediate result lists multiple times. (Recall the implementation of (++) is not constant time---it must loop over its first argument.)

Replace this version of toList with a more efficient, linear time version.

```
testToList :: Test
testToList = toList exAList ~?= [0,1,2,3,4]
```

The AList monad above is isomorphic to a general monad based on tree data structures. What is really cool is that these monads can be used to construct parallel search algorithms using GHC's support for semi-explicit paralellism.

# Regular Expressions

`-- Problem 2`

*Regular expressions* are a specialized language for representing string-matching patterns. Regular expressions were invented by the mathematician Stephen Kleene, one of the pioneers that created the foundations of the theory of computation (with Gödel, Turing, Church, and Post). Kleene invented regular expressions to represent the sets of possible behaviors of the abstract finite computation devices called *finite-state machines*. In Kleene's original formulation, regular expressions were were built from individual letters with three operators: *concatenation*, representing one pattern followed by another; *alternation* (also called union) denoted by `|`

, representing two alternative patterns; and *closure* (also called Kleene-Star), denoted by `*`

, to represent zero or more repetitions of a pattern. By convention, the empty string represents a special regular expression, the *empty* regular expression, which matches the empty string.

For example, the regular expression `a(bc|d)*e`

matches all strings that start with `a`

, then have some number of `bc`

or `d`

characters (possibly none) and end with `e`

. Some such strings include `ae`

, `abce`

, `abcde`

, `adbcde`

, `abcbcdbce`

. In the 1970s, computer scientists at Bell Labs were working on the first software tools for text processing, and they adapted and generalized Kleene's idea in several software tools, starting with grep, for searching and editing text. Regular expressions have many practical uses, mainly in pattern matching, which has applications in everything from compilers to searching databases.

In this problem, we consider regular expression evaluators, that is, programs that determines whether a string is in the language denoted by the regular expression. This process is also called regular expression matching.

We can represent regular expressions using the following datatype in Haskell.

```
data RegExp = Char Char -- single literal character
| Alt RegExp RegExp -- r1 | r2 (alternation)
| Seq RegExp RegExp -- r1 r2 (concatenation)
| Star RegExp -- r* (Kleene star)
| Empty -- ε, accepts empty string
| Void -- ∅, always fails
| Mark RegExp -- (for marked subexpressions, see (b) below)
deriving Show
```

For example, we can define regular expressions for specific character classes:

```
lower, upper, digit, punc, white, anyc :: RegExp
lower = foldr1 Alt (map Char ['a' .. 'z'])
upper = foldr1 Alt (map Char ['A' .. 'Z'])
digit = foldr1 Alt (map Char ['0' .. '9'])
punc = foldr1 Alt (map Char "<>!/.*()?@")
white = foldr1 Alt (map Char " \n\r\t")
```

Or the union of all of the above classes.

`anyc = lower `Alt` upper `Alt` digit `Alt` punc `Alt` white`

Or we can define regular expressions that match specific words.

```
word :: String -> RegExp
word w = foldr Seq Empty (map Char w)
```

```
cis552 :: RegExp
cis552 = word "cis552"
```

The `Star`

operator corresponds to 0 or more occurrences of a pattern. For example, this regular expression accepts any string that begins and ends with the tags `<b>`

and `</b>`

.

```
boldHtml :: RegExp
boldHtml = word "<b>" `Seq` Star anyc `Seq` word "</b>"
```

We can use `Star`

and `Seq`

to define the `plus`

operator, which corresponds to 1 or more occurrences of a pattern.

```
plus :: RegExp -> RegExp
plus pat = pat `Seq` Star pat
```

`-- (a)`

Write a simple (but not particularly efficient) operation that determines whether a particular string is part of the regular language accepted by the given regular expression. (For now, ignore 'Mark's in the input regular expressions.)

Your implementation should use *backtracking* to determine whether the `RegExp`

matches the given input string. For example, to determine whether the concatenation pattern `r1r2`

matches the input string, you need to consider all ways of splitting the string into two parts and see whether `r1`

and r2` match each part.

```
accept :: RegExp -> String -> Bool
accept (Mark r) s = accept r s
accept _ _ = error "accept: finish me"
```

Your implementation should use the following two helper functions. You may find that the list monad is useful for defining and using these functions.

```
-- all decompositions of a string into two different pieces
-- split "abc" == [("","abc"),("a","bc"),("ab","c"),("abc","")]
split :: [a] -> [([a], [a])]
split = error "split: unimplemented"
```

```
-- all decompositions of a string into multi-part (nonempty) pieces
-- parts "abc" = [["abc"],["a","bc"], ["ab","c"], ["a","b","c"]]
parts :: [a] -> [[[a]]]
parts = error "parts: unimplemented"
```

```
testAccept :: Test
testAccept = TestList [
not (accept Void "a") ~? "nothing is void",
not (accept Void "") ~? "really, nothing is void",
accept Empty "" ~? "accept Empty true",
not (accept Empty "a") ~? "not accept Empty",
accept lower "a" ~? "accept lower",
not (accept lower "A") ~? "not accept lower",
accept boldHtml "<b>cis552</b>!</b>" ~? "cis552!",
not (accept boldHtml "<b>cis552</b>!</b") ~? "no trailing" ]
```

`-- (b)`

Backtracking is not the most efficient implementation of regular expressions, but it is the most easily extensible.

One extension is support for marked subexpressions. For this problem, rewrite `accept`

so that it returns all strings that are matched by the marked subexpressions in the regular expression.

```
-- | Mark a subexpression
-- this function just wraps the data constructor now for future
-- extensibility, see (d) below
mark :: RegExp -> RegExp
mark r = Mark r
```

For example, we can mark the part of the regular expression between the tags:

```
boldHtmlPat :: RegExp
boldHtmlPat = word "<b>" `Seq` mark (Star anyc) `Seq` word "</b>"
```

Or mark sequences of letters that correspond to the first and last names:

```
namePat :: RegExp
namePat = mark (plus letter) `Seq` Star white `Seq` mark (plus letter)
where letter = Alt lower upper
```

Or mark any number of sequences of lowercase letters:

```
wordsPat :: RegExp
wordsPat = Star (mark (plus lower) `Seq` Star white)
```

Then, `patAccept`

below returns not only whether the pattern matches, but also the parts of the string that correspond to the marks (in order).

```
testPat :: Test
testPat = TestList [
patAccept boldHtmlPat "<b>cis552" ~?= Nothing,
patAccept boldHtmlPat "<b>cis552!</b>" ~?= Just ["cis552!"],
patAccept boldHtmlPat "<b>cis552</b>!</b>" ~?= Just ["cis552</b>!"],
patAccept namePat "Haskell Curry" ~?= Just ["Haskell", "Curry"],
patAccept wordsPat "a b c d e" ~?= Just ["a", "b", "c", "d", "e"]
]
```

```
patAccept :: RegExp -> String -> Maybe [String]
patAccept = error "patAccept: unimplemented"
```

`-- (c)`

You'll have noticed by now that this implementation of Regular Expression matching is *really* slow. The textbook way to implement regular expression matching is to first translate the regular expression into a finite-state machine and then apply the finite-state matching to the string.

However, there's a more direct, elegant, but not so well-known alternative, the method of *derivatives* due to Janusz A. Brzozowski. This method is described in more detail here.

The basic idea is that given a regular expression and the first character in the input string to match, you can compute a new regular expressions, which must match the remaining string in order for the original `RegExp`

to match the entire input string. This new regular expression is called the *derivative* of the original.

We can use this idea to implement regular expression matching by repeatedly calculating the derivatives for each character in the string. If the final result is a regular expression that accepts the empty string, then the original regular expression would have matched the string. In other words:

```
match :: RegExp -> String -> Bool
match r s = nullable (foldl deriv r s)
```

Your job is to implement `nullable`

and `deriv`

to complete this implementation.

```
-- | `nullable r` return `True` when `r` matches the empty string
nullable :: RegExp -> Bool
nullable _ = error "nullable: unimplemented"
```

```
-- | Takes a regular expression `r` and a character `c`,
-- and computes a new regular expression that accepts word `w` if `cw` is
-- accepted by `r`.
deriv :: RegExp -> Char -> RegExp
deriv = error "deriv: unimplemented"
```

For example, if `r`

is the literal character `c`

, then the derivative of `r`

is `Empty`

, the regular expression that only accepts the empty string. In the case of `Seq`

, you need to think about the case where the first regular expression could accept the empty string. In that case, the derivative should include the possibility that it could be skipped, and the character consumed by the second regexp.

Haskell's lazy evaluation avoids the evaluation of the whole regular expression. The expression has only to be evaluated as much as nullable needs to calculate an answer.

`-- (d) OPTIONAL CHALLENGE`

Extend regular epression derivatives to handle pattern matching. This is not as easy as before. In fact, the only way I know how to do it requires modification of the `Mark`

constructor of the `RegExp`

datatype so that it can store some intermediate information.

You can modify the datatype above to change the 'Mark' data constructor as long as you use the `mark`

function for all of your test cases.

# An Interpreter for WHILE

`-- Problem 3`

In this problem, you will use monads to build an evaluator for a simple imperative language, called WHILE. In this language, we will represent different program variables as

`type Variable = String`

Programs in the language are simply values of the type

```
data Statement =
Assign Variable Expression -- x = e
| If Expression Statement Statement -- if (e) {s1} else {s2}
| While Expression Statement -- while (e) {s}
| Sequence Statement Statement -- s1; s2
| Skip -- no-op
deriving (Eq, Show)
```

where expressions are variables, constants or binary operators applied to sub-expressions

```
data Expression =
Var Variable -- x
| Val Value -- v
| Op Bop Expression Expression
deriving (Eq, Show)
```

and binary operators are simply two-ary functions

```
data Bop =
Plus -- + :: Int -> Int -> Int
| Minus -- - :: Int -> Int -> Int
| Times -- * :: Int -> Int -> Int
| Divide -- / :: Int -> Int -> Int
| Gt -- > :: Int -> Int -> Bool
| Ge -- >= :: Int -> Int -> Bool
| Lt -- < :: Int -> Int -> Bool
| Le -- <= :: Int -> Int -> Bool
deriving (Eq, Show)
```

```
data Value =
IntVal Int
| BoolVal Bool
deriving (Eq, Show)
```

We will represent the *store* i.e. the machine's memory, as an associative map from `Variable`

to `Value`

.

`type Store = Map Variable Value`

**Note:** we don't have exceptions (yet), so if a variable is not found (eg because it is not initialized) simply return the value `0`

. In future assignments, we will add this as a case where exceptions are thrown (the other case being type errors.)

We will use the standard library's `State`

monad to represent the world-transformer. Intuitively, `State s a`

is equivalent to the world-transformer `s -> (a, s)`

. See the above documentation for more details. You can ignore the bits about `StateT`

for now.

If you find the types confusing, the State.lhs module that we developed in class provides the same interface and all of functionality that you need for this assignment. You can replace `import Control.Monad.State`

with this version.

## Expression Evaluator

First, write a function

`evalE :: Expression -> State Store Value`

that takes as input an expression and returns a state-transformer that returns a value. Yes, right now, the transformer doesn't really transform the world, but we will use the monad nevertheless as later, the world may change, when we add exceptions and such.

Again, we don't have any exceptions or typechecking, so the interpretation of an ill-typed binary operation (such as '2 + True') should return always 0.

**Hint:** The value `get`

is of type `State Store Store`

. Thus, to extract the value of the "current store" in a variable `s`

use `s <- get`

.

```
evalE (Var _) = error "evalE: unimplemented"
evalE (Val _) = error "evalE: unimplemented"
evalE (Op _ _ _) = error "evalE: unimplemented"
```

## Statement Evaluator

Next, write a function

`evalS :: Statement -> State Store ()`

that takes as input a statement and returns a world-transformer that returns a unit. Here, the world-transformer should in fact update the input store appropriately with the assignments executed in the course of evaluating the `Statement`

.

**Hint:** The value `put`

is of type `Store -> State Store ()`

. Thus, to "update" the value of the store with the new store `s'`

do `put s`

.

```
evalS (While _ _) = error "evalS: unimplemented"
evalS Skip = error "evalS: unimplemented"
evalS (Sequence _ _ ) = error "evalS: unimplemented"
evalS (Assign _ _) = error "evalS: unimplemented"
evalS (If _ _ _ ) = error "evalS: unimplemented"
```

In the `If`

and `While`

cases, if `e`

evaluates to a non-boolean value, just skip. Finally, write a function

```
execS :: Statement -> Store -> Store
execS = error "execS: unimplemented"
```

such that `execS stmt store`

returns the new `Store`

that results from evaluating the command `stmt`

from the world `store`

. **Hint:** You may want to use the library function

`execState :: State s a -> s -> s`

When you are done with the above, the following function will "run" a statement starting with the `empty`

store (where no variable is initialized). Running the program should print the value of all variables at the end of execution.

```
run :: Statement -> IO ()
run stmt = do putStrLn "Output Store:"
print (execS stmt Map.empty)
```

Here are a few "tests" that you can use to check your implementation. (You do not need to fit these test cases into 80 columns.)

```
wTest :: Statement
wTest = Sequence (Assign "X" (Op Plus (Op Minus (Op Plus (Val (IntVal 1)) (Val (IntVal 2))) (Val (IntVal 3))) (Op Plus (Val (IntVal 1)) (Val (IntVal 3))))) (Sequence (Assign "Y" (Val (IntVal 0))) (While (Op Gt (Var "X") (Val (IntVal 0))) (Sequence (Assign "Y" (Op Plus (Var "Y") (Var "X"))) (Assign "X" (Op Minus (Var "X") (Val (IntVal 1)))))))
```

```
wFact :: Statement
wFact = Sequence (Assign "N" (Val (IntVal 2))) (Sequence (Assign "F" (Val (IntVal 1))) (While (Op Gt (Var "N") (Val (IntVal 0))) (Sequence (Assign "X" (Var "N")) (Sequence (Assign "Z" (Var "F")) (Sequence (While (Op Gt (Var "X") (Val (IntVal 1))) (Sequence (Assign "F" (Op Plus (Var "Z") (Var "F"))) (Assign "X" (Op Minus (Var "X") (Val (IntVal 1)))))) (Assign "N" (Op Minus (Var "N") (Val (IntVal 1)))))))))
```

When you are done, the following tests should pass:

```
t4a :: Test
t4a = execS wTest Map.empty ~?=
Map.fromList [("X",IntVal 0),("Y",IntVal 10)]
```

```
t4b :: Test
t4b = execS wFact Map.empty ~?=
Map.fromList [("F",IntVal 2),("N",IntVal 0),("X",IntVal 1),("Z",IntVal 2)]
```

```
ghci> run w_test
Output Store:
fromList [("X",IntVal 0),("Y",IntVal 10)]
ghci> run w_fact
Output Store:
fromList [("F",IntVal 2),("N",IntVal 0),("X",IntVal 1),("Z",IntVal 2)]
```

Credit: Original version of Problem 3 from UCSD

## News :

Welcome to CIS 552!

See the home page for basic
information about the course, the schedule for the lecture notes
and assignments, the resources for links to the required software
and online references, and the syllabus for detailed information about
the course policies.