If you include permutations of the symbols (e.g. "A OR B AND C" ... "A AND C OR B"), implement operator precedence (i.e. AND before OR) and want to exclude formulation that produce the same True/False result for every input pattern, the logic can get quite complex:
Here's my take on it:
- I first view the operations as a binary tree which recursively combines a left and right side either with an OR or an AND operator.
- To implement permutations, the left side is formed of the powerset of combinations that use half of the values or less (with the right side using the remaining values
- I only use up to half of the values on the left side because if I used more than half, then the right side would contain combinations that have already been seen on the left, and since OR and AND are commutative, those additional distributions would only produce expressions that are equivalent
- When the number of symbols is even and both sides are equal sized, avoiding the mirroring distributions is handled using a set where the leftside combinations are stored and excluded from being used on the right side
- Still this will produce equivalent expressions that need to be excluded from the result
- Because of the recursive nature of the approach, I use a cache so that combinations of symbols are only computed once
This function tests if two expressions are equivalent. It does so by feeding each expression with all combinations of True/False patterns and evaluating them to compare results. If the two expressions produce the same result for all patterns, then they are equivalent.
def isEquivalent(A,B):
if A.count("and") != B.count("and") \
or A.count("or") != B.count("or"):
return False
symbols = A
for c in ("(",")"," or"," and"):
symbols = symbols.replace(c,"")
symbols = symbols.split()
different = compile(f"({A}) != ({B})","<string>","eval")
for n in range(1<<len(symbols)):
pattern = dict(zip(symbols,map(int,f"{n:0{len(symbols)}b}")))
if eval(different,pattern):
return False
return True
Caching of expression is based on the number of symbols. The cache contains the list of expressions with generic placeholders for the symbols which are mapped to the actual keys using string formatting. For example the result for keys XYZ are the same as those of ABC except that X, Y an Z take the places of the original A, B and C. The cache actually contains expressions like "{2} and {1} or {0}".
cache = dict()
def cacheResult(keys,result=None):
if not result:
return [ r.format(*keys) for r in cache.get(len(keys),[]) ]
cache[len(keys)] = resFormat = []
for r in result:
r = r.replace("and","&").replace("or","|")
for i,k in enumerate(keys):
r = r.replace(k,"{"+str(i)+"}")
r = r.replace("&","and").replace("|","or")
resFormat.append(r)
return result
This is the recursive function that generates the expressions. It adds parentheses as appropriate when ANDing sub-expressions that contain a top level OR operation.
from itertools import combinations
def boolCombo(keys):
if len(keys)==1: return list(keys)
result = cacheResult(keys)
if result: return result
def addResult(X):
if not any(isEquivalent(X,r) for r in result):
result.append(X)
seenLeft = set()
for leftSize in range(1,len(keys)//2+1):
for leftIdx in combinations(range(len(keys)),leftSize):
rightKeys = list(keys)
leftKeys = tuple(rightKeys.pop(i) for i in reversed(leftIdx))
rightKeys = tuple(rightKeys)
if len(leftKeys)==len(rightKeys):
if rightKeys in seenLeft: continue
seenLeft.add(leftKeys)
for left in boolCombo(leftKeys):
pleft = left
op = (p.count("(")==p.count(")") for p in left.split(" or "))
if " or " in left and any(op):
pleft = f"({left})"
for right in boolCombo(rightKeys):
addResult(f"{left} or {right}")
op = (p.count("(")==p.count(")")
for p in right.split(" or "))
if " or " in right and any(op):
right = f"({right})"
addResult(f"{pleft} and {right}")
return cacheResult(keys,result)
output:
symbols = "ABC"
for i,combo in enumerate(boolCombo(symbols),1):
print(i,combo)
1 A or B or C
2 A and (B or C)
3 A or B and C
4 A and B and C
5 B and (A or C)
6 B or A and C
7 C and (A or B)
8 C or A and B
len(boolCombo("ABCD")) # 52
len(boolCombo("ABCDE")) # 472
len(boolCombo("ABCDEF")) # 5504 takes 10 minutes to finish
[EDIT] Improved version
While fiddling to improve performance, I realized that the cases where IsEquivalent
would find matches were all due to operand ordering. And, since checking for equivalent expressions takes most of the processing time, I made sure to generate expressions with a constant ordering. This allowed me to do away with isEquivalent
because equivalent expressions would now produce the exact same string:
The cacheResult
function is the same but I added a sort on the results to make the output easier to read:
cache = dict()
def cacheResult(keys,result=None):
if not result:
return [ r.format(*keys) for r in cache.get(len(keys),[]) ]
cache[len(keys)] = resFormat = []
result = sorted(result,key=lambda x:x.replace(" "," "))
for r in result:
r = r.replace("and","&").replace("or","|")
for i,k in enumerate(keys):
r = r.replace(k,"{"+str(i)+"}")
r = r.replace("&","and").replace("|","or")
resFormat.append(r)
return result
The boolCombo
function now distinguishes between top level AND and OR so that the operands can be sorted in ascending order when the expressions is formed of multiple ORs or ANDs at the top precedence level. For example B and C and A
would be reordered to A and B and C
which can now be identified as duplicates simply on the basis of the string representation being the same. Using a set instead of a list for result automatically eliminates duplicates.
from itertools import combinations,product
or0,or1,and0,and1 = " or "," or "," and "," and "
def boolCombo(keys):
if len(keys)==1: return list(keys)
result = cacheResult(keys) or set()
if result: return result
def addResult(left,right):
OR = or0.join(sorted(left.split(or0)+right.split(or0)))
result.add(OR.replace(and0,and1))
if or0 in left: left = f"({left})"
if or0 in right: right = f"({right})"
AND = and0.join(sorted(left.split(and0)+right.split(and0)))
result.add(AND.replace(or0,or1))
seenLeft = set()
for leftSize in range(1,len(keys)//2+1):
for leftKeys in combinations(keys,leftSize):
rightKeys = [k for k in keys if k not in leftKeys]
if len(leftKeys)==len(rightKeys):
if tuple(rightKeys) in seenLeft: continue
seenLeft.add(tuple(leftKeys))
for left,right in product(*map(boolCombo,(leftKeys,rightKeys))):
addResult(left,right)
return cacheResult(keys,result)
output:
symbols = "ABC"
for i,combo in enumerate(boolCombo(symbols),1):
print(i,combo.replace(" "," "))
1 (A or B) and C
2 (A or C) and B
3 (B or C) and A
4 A and B and C
5 A and B or C
6 A and C or B
7 A or B and C
8 A or B or C
len(boolCombo("ABCD")) # 52
len(boolCombo("ABCDE")) # 472
len(boolCombo("ABCDEF")) # 5504
len(boolCombo("ABCDEFG")) # 78416
len(boolCombo("ABCDEFGH")) # 1320064 (12 seconds)
len(boolCombo("ABCDEFGHI")) # 25637824 (4.8 minutes)