A data structure for Logical Expressions in Haskell

Question

I try to create a data structure for working with logical expressions. At first glance, the logical expressions look like Trees, so it seems reasonable to make up it from trees:

data Property a = And (Property a) (Property a) |
                 Or  (Property a) (Property a) |
                 Not (Property a) | Property a   deriving (Show,Eq)

But it isn't a good idea, because really there is no difference between left and right branches of my tree, since A && B == B && A

Well, maybe List is even better?

data Property a = Empty | And a (Property a) | Or a (Property a) | Not a (Property a)  deriving Show

But in this case, I can't really form a 'logic tree', only a 'logic list '. So I need a data structure similar to Tree but without left and right "branching".

I recommend the naive representation (your first one), plus a function `canonicalize :: Property a -> Property a` that [picks one of the famous normal forms](http://en.wikipedia.org/wiki/Canonical_normal_form). — Daniel Wagner, Feb 02 '15 at 21:07
Another choice, with its own merits and demerits, is to jump straight to the denotation: `type Property a = (a -> Bool) -> Bool`. Whether that's reasonable depends a lot on what operations you want to be able to perform on a `Property a`... can you elaborate on that, please? — Daniel Wagner, Feb 02 '15 at 21:10
I work on query to query matching, so my first idea was to build logical trees, canonicalize it by any way and match two trees. — Sergey Sosnin, Feb 02 '15 at 21:16

Cirdec · Accepted Answer · 2015-02-03T13:36:14.123

We're going to follow Daniel Wagner's excellent suggestion and use the "naive representation (your first one), plus a function that picks one of the famous normal forms". We are going to use algebraic normal form for two reasons. The main reason is that algebraic normal form doesn't require enumerating all the possible values of the variable type held in a Property. Algebraic normal form is also fairly simple.

Algebraic Normal Form

We'll represent algebraic normal form with a newtype ANF a = ANF [[a]] that doesn't export its constructor. Each inner list is a conjunction (and-ing) of all of its predicates; if an inner list is empty it is always true. The outer list is an exclusive or-ing of all of its predicates; if it is empty it is false.

module Logic (
    ANF (getANF),
    anf,
    ...
)

newtype ANF a = ANF {getANF :: [[a]]}
  deriving (Eq, Ord, Show)

We will make efforts so that any ANF we construct will be canonical. We will build ANFs so that the inner list is always sorted and unique. The outer list will also always be sorted. If there are two (or an even number of) identical items in the outer list we will remove them both. If there are an odd number of identical items in the outer list, we will remove all but one of them.

Using functions from Data.List.Ordered in the data-ordlist package, we can clean up a list of lists representing the xor-ing of conjunctions into an ANF with the lists sorted and duplicates removed.

import qualified Data.List.Ordered as Ordered

anf :: Ord a => [[a]] -> ANF a
anf = ANF . nubPairs . map Ordered.nubSort

nubPairs :: Ord a => [a] -> [a]
nubPairs = removePairs . Ordered.sort
    where
        removePairs (x:y:zs)
            | x == y    = removePairs zs
            | otherwise = x:removePairs (y:zs)
        removePairs xs = xs

Logical expressions form a boolean algebra, which is a bounded lattice with an additional distributive law and a complement (negation). Taking advantage of the existing BoundedLattice class in the lattices package, we can define the class of BooleanAlgebras

import Algebra.Lattice

class (BoundedLattice a) => BooleanAlgebra a where
    complement :: a -> a
    -- Additional Laws:
    -- a `join` complement a == top
    -- a `meet` complement a == bottom
    -- a `join` (b `meet` c) == (a `join` b) `meet` (a `join` c)
    -- a `meet` (b `join` c) == (a `meet` b) `join` (a `meet` c)

ANF as form a BooleanAlgebra whenever a has an Ord instance so that we can keep the ANF in canonical form.

The meet of a boolean algebra is logical and. Logical and distributes across xor. We will take advantage of the fact that the inner lists are already sorted to quickly merge them together distinctly.

instance (Ord a) => MeetSemiLattice (ANF a) where
    ANF xs `meet` ANF ys = ANF (nubPairs [Ordered.union x y | x <- xs, y <- ys])

Using De Morgan's laws, the join or logical or can be written in terms of the meet or logical and.

instance (Ord a) => JoinSemiLattice (ANF a) where
    xs `join` ys = complement (complement xs `meet` complement ys)

The top of the lattice is always true.

instance (Ord a) => BoundedMeetSemiLattice (ANF a) where
    top = ANF [[]]

The bottom of the lattice is always false.

instance (Ord a) => BoundedJoinSemiLattice (ANF a) where
    bottom = ANF []

Logical and and logical or satisfy the lattice absorption law, a join (a meet b) == a meet (a join b) == a.

instance (Ord a) => Lattice (ANF a)
instance (Ord a) => BoundedLattice (ANF a)

Finally the complement operation is negation, which can be accomplished by xoring by true.

instance (Ord a) => BooleanAlgebra (ANF a) where
    complement (ANF ([]:xs)) = ANF xs
    complement (ANF xs)      = ANF ([]:xs)

Along with Pointed we can use BooleanAlgebra to define the class of things that hold logical expressions, Logical.

import Data.Pointed

class Logical f where
    toLogic :: (Pointed g, BooleanAlgebra (g a)) => f a -> g a

Algebraic normal form holds a logical expression.

xor :: BooleanAlgebra a => a -> a -> a
p `xor` q = (p `join` q) `meet` complement (p `meet` q)

instance Logical ANF where
    toLogic = foldr xor bottom . map (foldr meet top . map point) . getANF

Algebraic normal form is also Pointed, and thus can be converted to using toLogic.

instance Pointed ANF where
    point x = ANF [[x]]

toANF :: (Logical f, Ord a) => f a -> ANF a
toANF = toLogic

We can test if anything Logical is logically equivalent by comparing to see if it is structurally equivalent when converted to canonical algebraic normal form.

equivalent :: (Logical f, Logical g, Ord a) => f a -> g a -> Bool
equivalent a b = toANF a == toANF b

Converting Property to ANF

Logical expressions should form a boolean algebra, which is a bounded lattice with an additional distributive law and a complement (negation). To make Property more closely parallel a boolean algebra, we need to add two more elements for the top and bottom bounds of the lattice. top is always True, and bottom is always False. I'm going to call these Trivial for properties that are always True and Impossible for properties that are always False.

data Property a
    = And (Property a) (Property a)
    | Or  (Property a) (Property a)
    | Not (Property a)
    | Property a
    | Trivial
    | Impossible
  deriving (Eq, Ord, Read, Show)

Property is an abstract syntax tree for a property. Its derived Eq and Ord instances are only structural equality.

A Property is Logical, and we can convert it into any Pointed BooleanAlgebra.

instance Logical Property where
    toLogic (And a b)    = (toLogic a) `meet` (toLogic b)
    toLogic (Or  a b)    = (toLogic a) `join` (toLogic b)
    toLogic (Not a)      = complement (toLogic a)
    toLogic (Property a) = point a
    toLogic Trivial      = top
    toLogic Impossible   = bottom

Using equivalent from the previous section and our Logical instance, we can check to see if two properties are equivalent for some examples.

runExample :: (Ord a, Show a) => Property a -> Property a -> IO ()
runExample p q = 
    if p `equivalent` q
    then putStrLn (show p ++ " == " ++ show q)
    else putStrLn (show p ++ " /= " ++ show q)

main = do
    runExample (point 'a' `meet` point 'b') (point 'b' `meet` point 'a')
    runExample (point 'a' `meet` point 'b') (point 'b' `meet` point 'a' `meet` point 'a')
    runExample (point 'a' `meet` point 'b') (point 'b' `meet` point 'a' `join` point 'a')
    runExample (point 'a' `join` complement (point 'a')) (top)
    runExample (point 'a' `meet` complement (point 'a')) (bottom)

This produces the following output.

And (Property 'a') (Property 'b') == And (Property 'b') (Property 'a')
And (Property 'a') (Property 'b') == And (And (Property 'b') (Property 'a')) (Pr
operty 'a')
And (Property 'a') (Property 'b') /= Or (And (Property 'b') (Property 'a')) (Pro
perty 'a')
Or (Property 'a') (Not (Property 'a')) == Trivial
And (Property 'a') (Not (Property 'a')) == Impossible

To use these examples as written, we also need BooleanAlgebra and Pointed instances for Property. The equivalence relation for the BooleanAlgebra laws is equivalent interpretations rather than the structural equality of Property.

instance MeetSemiLattice (Property a) where
    meet = And

instance BoundedMeetSemiLattice (Property a) where
    top = Trivial

instance JoinSemiLattice (Property a) where
    join = Or

instance BoundedJoinSemiLattice (Property a) where
    bottom = Impossible

instance Lattice (Property a)
instance BoundedLattice (Property a)

instance BooleanAlgebra (Property a) where
    complement = Not

instance Pointed Property where
    point = Property

Proof

Every Boolean function, and thus every finite Property, has a unique representation in ANF. The BooleanAlgebra and Pointed instances for ANF demonstrated that everything Logical a, and thus every finite Property a and Boolean function indexed by a, has a representation in ANF a. Let k be the number of inhabitants of a. There are 2^(2^k) possible Boolean functions of k Boolean variables. Each inner list of an ANF is a set of as; there are 2^k possible sets of as and thus 2^k possible inner lists. The outer list of an ANF is a set of inner lists; each inner list can occur at most once in the outer list. There are 2^(2^k) possible ANF as. Since every Boolean function has a representation in ANF and there are only as many possible inhabitants of ANF as there are Boolean functions, each Boolean function has a unique representation in ANF.

Disclaimer

The Ord instance for ANF is a structural order and, except for equality, has nothing to do with the partial orders induced by the lattice structure.

Algebraic normal form can be exponentially larger than its input. The disjunction of a list of n variables has size .5*n*2^n. For example, length . getANF . foldr join bottom . map point $ ['a'..'g'] is 127 and foldr join bottom . map point $ ['a'..'g'] contains a total of 448 occurrences of the 7 different variables.

re: "you should convince yourself...", the Wikipedia page you linked says, in part, "ANF is a normal form, which means that two equivalent formulas will convert to the same ANF". — Daniel Wagner, Feb 03 '15 at 03:00
@DanielWagner A simple assertion, much less a simple uncited assertion in a Wikipedia article, is not enough to convince me. I have, so-far, failed to convince myself. Other Wikipedia articles on "normal" forms make similar claims, despite trivial counter examples. Both `(a ∧ ¬b) ∨ (a ∧ b)` and `a` are in "disjunctive normal form", and both of them mean the same thing. Disjunctive normal form as described on Wikipedia isn't really a normal form. Only specific variants of it, such as full disjunctive normal form where every variable appears in every conjunction, are normal forms. — Cirdec, Feb 03 '15 at 03:16
@Cirdec While I appreciate your skepticism, your criticism of Wikipedia looks unwarranted to me; the Wikipedia page for "Disjunctive normal form" is quite clear that only full disjunctive normal form is canonical, and the Wikipedia page for "Canonical normal form" looks likewise quite careful about what forms have the "equivalent formulas have the same form" property. — Daniel Wagner, Feb 03 '15 at 06:16
@phadej The question is explicitly about testing for tautological equivalence, starting with a structure very close to "negation normal form". Testing for tautological equivalence is necessarily NP hard (testing if `f == const False` is the (un)satisfiability problem), but doesn't necessarily require exponential space. — Cirdec, Feb 03 '15 at 14:13

score 1 · Answer 2 · edited May 23 '17 at 11:50

1

I'd recommend using a SAT/SMT solver for the equivalence check. In general, these sorts of checks can be very expensive (NP-complete), and any sort of translation to normal-forms can cause exponential blow-up in representation. SAT/SMT solvers have custom algorithms for dealing with such problems, and it might be best to use such a solver. It would be trivial to translate two instances of Property and ask if they are the same under all variable assignments. SBV library (https://hackage.haskell.org/package/sbv) can be used to do the translation from high-level Haskell. The answer to the following question has some details on how to go about doing that: SAT solving with haskell SBV library: how to generate a predicate from a parsed string?

edited May 23 '17 at 11:50

Community

1
1

answered Feb 04 '15 at 00:34

alias

28,120
2
23
40

There's no reason to restrict the problem to `Property Bool`. The type argument to `Property` isn't the type of the result, it's the type of the index for the variables. As long as it has an `Ord` or `Eq` instance, it can be converted to another index like `Int`s in `O(n*log n)` or `O(n^2)` time. – Cirdec Feb 04 '15 at 01:00
Ah, quite right. If the 'a' parameter is just indexing into variables, then indeed it would be easy to translate this to a SAT/SMT problem, and use an off-the-shelf solver to do the equivalence checking. Also note that the backend solver will return a counter-example model if the two formulas are not equivalent, which can then be used further. i.e.; the "not-equal" answer will be substantiated with a model that shows why the equivalence fails instead of a plain "no" answer. – alias Feb 04 '15 at 01:23

score 0 · Answer 3 · answered Feb 03 '15 at 06:41

If you want to reflect the associativity of logical connectives and (&&) and or, use a data structure that is associative, like list:

data Property a = And [Property a] | ...

If you also want commutativity (A && B == B && A), go with Data.Set.

I remember this handy chart for which types to use if you want certain properties:

+-----------+-----------+----------+----------
|Associative|Commutative|Idempotent|   type
+-----------+-----------+----------+----------
|(a+b)+c=   |  a+b=b+a  |  a+a=a   |
|    a+(b+c)|           |          |
+-----------+-----------+----------+----------
|    no     |    no     |    no    |binary trees
|    no     |   yes     |    no    | “mobiles”
|   yes     |    no     |    no    |lists (arrays)
|   yes     |   yes     |    no    |multisets (bags)
|   yes     |   yes     |   yes    |sets
+-----------+-----------+----------+----------

Page 51 of Guy Steele's Lecture.

And which data structure reflects deMorgan's laws...? =P – Daniel Wagner Feb 03 '15 at 08:02 — Daniel Wagner, Feb 03 '15 at 08:02

A data structure for Logical Expressions in Haskell

3 Answers3

Algebraic Normal Form

Converting Property to ANF

Proof

Disclaimer