HW 3 - SAT Solving with the DPLL algorithm
===================================
> -- Advanced Programming, HW 3
> -- answer by Stephanie Weirich
> {-# LANGUAGE FlexibleInstances #-}
The [Davis-Putnam-Logemann-Loveland
algorithm](http://en.wikipedia.org/wiki/DPLL_algorithm) is an
algorithm for deciding the satisfiablility of propositional logic
formulae. Although the SAT problem is NP-complete, it is stxill
remarkably amenable to automation, and modern high-performance
SAT-solvers are regularly used in software verification, constraint
solving and optimization.
Your task is to implement the DPLL algorithm and verify its
correctness using QuickCheck. To complete the homework, you should
edit the file [Sat.hs](Sat.hs) and submit it via the [course web
page](https://alliance.seas.upenn.edu/~cis552/cgi-bin/submit.php).
> module Sat where
> import Data.Map (Map) -- use finite map data structure from standard library
> import qualified Data.Map as Map
>
>
> import Data.List (nub)
> import Control.Monad (liftM,liftM2,MonadPlus(..)) -- can import more if necessary
> import Data.Boolean.SatSolver
>
> import Test.QuickCheck
If you don't have the QuickCheck library installed on your machine,
don't forget to
cabal install quickcheck
The DPLL algorithm works on formulae that are in [Conjunctive Normal
Form](http://en.wikipedia.org/wiki/Conjunctive_normal_form),
i.e. formulae that are a conjunction of clauses, where each clause is a
disjunction of literals, i.e. positive or negative propositional
variables.
> -- | An expression in CNF, the conjunction of clauses
> newtype CNF = CNF [ Clause ] deriving (Eq, Ord, Show)
> unCNF :: CNF -> [ Clause ]
> unCNF (CNF cs) = cs
> -- | A clause -- the disjunction of a number of literals
> type Clause = [ Lit ]
> -- | A literal, either a positive or negative variable
> data Lit = Lit Bool Var deriving (Eq, Ord, Show)
> -- | A variable
> data Var = A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | AA | AB | AC | AD | AE| AF | AG | AH | AI | AJ | AK | AL | AM | AN | AO | AP | AQ | AR | AS | AT | AU | AV | AW | AX | AY | AZ
> deriving (Read, Eq, Ord, Show, Enum, Bounded)
> -- | invert the polarity of a literal
> invert :: Lit -> Lit
> invert (Lit b x) = Lit (not b) x
> -- | Get the name of the variable
> var :: Lit -> Var
> var (Lit _ x) = x
> -- | Is the literal positive?
> isPos :: Lit -> Bool
> isPos (Lit b _) = b
For example, we can represent the formula `A ^ (B v ~A)` by:
> exampleFormula :: CNF
> exampleFormula = CNF [[Lit True A],[Lit True B,Lit False A]]
This formula is satisfiable as long as `A` is assigned to `True`.
> exampleAssignment :: Map Var Bool
> exampleAssignment = Map.fromList [(A, True), (B, True)]
Given an assignment for variables, we can determine the truth
value of literals. Unassigned variables are assumed to be true.
> interpLit :: Map Var Bool -> Lit -> Bool
> interpLit m (Lit b k) = maybe b (if b then id else not) (Map.lookup k m)
We extend this to an interpretation of CNF formulae by ensuring that
in all conjunctions, there is at least one literal that is true.
> interp :: Map Var Bool -> CNF -> Bool
> interp m (CNF formula) = all (any (interpLit m)) formula
> solved :: Solver -> Bool
> solved (Solver cnf m) = interp m (CNF cnf)
> prop_interpExample :: Bool
> prop_interpExample = interp exampleAssignment exampleFormula
The DPLL algorithm takes a formula in conjunctive normal form and
produces an assignment for the variables if the formula is
satisfiable. Your job is to complete this implementation based on
the description of the algorithm on the Wikipedia page. Note that
the pseudocode given is fairly imperative. You should think about how
to define a *functional* version of the algorithm, based on what we
have learned in this course so far.
> dpll :: CNF -> Maybe (Map Var Bool)
> dpll (CNF cnf) = loop initial where
> initial :: Solver
> initial = (Solver (removeTrivial (map nub cnf)) Map.empty)
>
> loop :: Solver -> Maybe (Map Var Bool)
> loop s =
> let s1 = unitPropagate s in
> let s2 = pureLitAssign s1 in
> let cs = clauses s2 in
> if solved s2
> then return $ assignment s2
> else if any null cs
> then mzero
> else
> let l = chooseLit cs in
> loop (addClause [l] s2) `mplus`
> (loop (addClause [invert l] s2))
>
> addClause :: Clause -> Solver -> Solver
> addClause c (Solver cs m) = Solver (c : cs) m
> -- remove any clauses with both A and not A, as they are guaranteed to be
> -- true. Only do this for the initial formula, as the algorithm will never
> -- introduce any new clauses with this property
> removeTrivial :: [Clause] -> [Clause]
> removeTrivial cs =
> (filter (\c -> null [ () | l1 <- c, l2 <- c, l1 == invert l2]) cs)
As a hint, you can think of the main part of the dpll function as
executing a loop, where each recursive call of the loop takes as argument the
current 'state' of the solver --- the clauses and the current variable
assignment.
We can represent this solver state with the following datatype. The
record labels give us convenient access to the two components.
> data Solver = Solver { clauses :: [Clause] , assignment :: (Map Var Bool) }
> deriving (Eq, Ord, Show)
Part of the action of the loop will be to update that solver state
using algorithms for *Unit propagation* and *Pure literal elimination*.
> -- | If a clause is a unit clause, i.e. it contains only a single
> -- unassigned literal, this clause can only be satisfied by assigning
> -- the necessary value to make this literal true. Unit propagation updates
> -- the map with the assignment and simplifies the clauses to reflect this
> -- reasoning.
> unitPropagate :: Solver -> Solver
> unitPropagate s =
> if null unitLits then s else unitPropagate (foldr setTrue s unitLits)
> where unitLits = [ l | [ l ] <- clauses s ]
>
> -- | If a propositional variable occurs with only one polarity in the
> -- formula, it is called pure. Pure literals can always be assigned in
> -- a way that makes all clauses containing them true. Thus, these
> -- clauses do not constrain the search anymore and can be deleted.
> pureLitAssign :: Solver -> Solver
> pureLitAssign s =
> if null pureLits then s else pureLitAssign (foldr setTrue s pureLits)
> where pureLits = nub [ lit | clause <- clauses s,
> lit <- clause,
> not (any ((invert lit) `elem`) $ clauses s) ]
>
>
> setTrue :: Lit -> Solver -> Solver
> setTrue literal (Solver cs m) = (Solver cs'' m') where
> -- remove clauses containing the positive literal (they are trivially satisfied)
> cs' = filter (not . (literal `elem`)) cs
> -- remove negative literal from all other clauses
> -- (other lits must satisfy the clause)
> cs'' = map (filter ((invert literal)/=)) cs'
> -- record the assignment
> m' = Map.insert (var literal) (isPos literal) m
>
You will know that you have a correct version when the following
property is satisfied by quick check. Note: You should instrument
this property (using collect/classify) to make sure that you are
testing your solution with meaningful tests.
Remember, though, that even if your assignment passes all of the test
cases you should go back and think about code style. Can you improve
the assignment? Can you make the code clearer in some way?
> prop_dpll :: CNF -> Property
> prop_dpll c =
> case dpll c of
> Just m -> classify True "sat" $ property (interp m c)
> Nothing -> classify True "unsat" $ property True
To run quickcheck you will need both the CNF and Solver types to be
members of the Arbitrary type class. You may find it useful to add definitions
for shrink as well.
> instance Arbitrary CNF where
> arbitrary = fmap CNF arbitrary
> shrink (CNF cs) = map CNF (shrink cs)
> instance Arbitrary Var where
> arbitrary = elements [A .. ]
> instance Arbitrary Lit where
> arbitrary = liftM2 Lit arbitrary arbitrary
> instance Arbitrary Solver where
> arbitrary = liftM simplify (liftM2 Solver arbitrary arbMap) where
> arbMap = fmap Map.fromList (listOf (liftM2 (,) arbitrary arbitrary))
> simplify (Solver cs m) = foldr (\(x,b) -> setTrue (Lit b x))
> (Solver (map nub cs) m)
> (Map.assocs m)
Note that you will need more tests / property checks than prop_dpll
above. In particular, the properties for unitPropagate and
pureLitAssign do not completely characterize those steps----it is
possible for buggy implementations to satisfy those properties. You'll
need to come up with some additional tests/properties to help you
debug your program if it does not satisfy prop_dpll.
> -- | Pick a literal to split on.
> chooseLit :: [Clause] -> Lit
> chooseLit ((lit:_) : _) = lit
> chooseLit _ = error "BUG: should be nonempty"
>
> -- Generating hard test cases
> -- r : number of clauses
> -- m : number of variables
> -- hard cases are in the range r ~= 4.26 * m
> hard3SatGen :: Int -> Int -> Gen CNF
> hard3SatGen r m = liftM CNF (vectorOf r arbcls) where
> arbcls :: Gen Clause
> arbcls = vectorOf 3 (liftM2 Lit arbitrary (elements [A .. toEnum m]))
Checking against reference implementation
-----------------------------------------
> fromLit :: Lit -> Boolean
> fromLit (Lit True x) = Var (fromEnum x)
> fromLit (Lit False x) = Not (Var (fromEnum x))
> toBoolean :: CNF -> Boolean
> toBoolean (CNF cs) = foldr (\xs y -> (foldr (\ l z -> fromLit l :||: z) No xs) :&&: y) Yes cs
> prop_eqSatSolver :: CNF -> Bool
> prop_eqSatSolver cnf =
> case (assertTrue (toBoolean cnf) newSatSolver) of
> Just ss -> case (isSolvable ss, dpll cnf) of
> (True, Just m) -> solved (Solver (unCNF cnf) m)
> (False, Nothing) -> True
> (_,_) -> False
> Nothing -> case dpll cnf of
> Just m -> not (solved (Solver (unCNF cnf) m))
> Nothing -> True
> main :: IO ()
> main = do
> -- quickCheck prop_dpll
> quickCheck $ forAll (hard3SatGen 86 20) prop_dpll
> -- quickCheck prop_eqSatSolver
> -- quickCheck $ forAll (hard3SatGen 86 20) prop_eqSatSolver
A monadic implementation
------------------------
> data SatM a = SatM (Solver -> Maybe (a, Solver))
> instance Monad SatM where
> return _ = undefined
> _ >>= _ = undefined
> instance MonadPlus SatM where
> mzero = undefined
> mplus = undefined
> get :: SatM Solver
> get = undefined
> put :: Solver -> SatM ()
> put = undefined
> runSat :: SatM a -> Solver -> Maybe a
> runSat = undefined