3

What would be the best way to programmatically translate a string like

"((abc&(def|ghi))|jkl)&mno"

to be executed as as:

if ((func('abc') and (func('def') or func('ghi'))) or func('jkl')) and func('mno'):
    return True

I feel like there must be a simple way to achieve this, but I can't get my head around it.

Mark Tolonen
  • 166,664
  • 26
  • 169
  • 251
Julian F
  • 53
  • 7
  • This only solves a part of the problem as it does not account for '&' and '|' in the string. – Julian F Oct 10 '20 at 13:11
  • StackOverflow is not a code writing service. Please edit your code to try solve this problem into your question. – DisappointedByUnaccountableMod Oct 10 '20 at 13:24
  • If I knew how to solve this problem I would not have asked this question in the first place.The linked thread has not changed that as it only accounted for a very small part of the problem. – Julian F Oct 10 '20 at 13:40
  • I said “try” - you should be showing an honest attempt to solve the problem. Read https://stackoverflow.com/help/dont-ask – DisappointedByUnaccountableMod Oct 10 '20 at 17:59
  • I don't see how "trying" should help solve this problem when I am a asking for a best practive solution (as provided by @PaulMcG, thank you!) for a complex problem that requires a lot of knowledge and experience with very specific technologies. This is _exactly_ what stackoverflow is for. – Julian F Oct 27 '20 at 21:24
  • If your point is that you can and should learn anything on your own without asking questions whatsoever, what exactly would be the point of this platform? – Julian F Oct 28 '20 at 10:18
  • "Try" was my point - yes you won't always succeed and then search/research - for example there are very many examples out there of arithmetic expression parsers written in python of which you mention none in your question, and mention nothing of research - and as final resort ask here - see this accepted answer https://meta.stackoverflow.com/questions/261592/how-much-research-effort-is-expected-of-stack-overflow-users – DisappointedByUnaccountableMod Oct 28 '20 at 14:29
  • Well yes exactly because you didn't show any research no reader of your question knows how much (if any) you did: so show it in the question. Why don't you read https://stackoverflow.com/help/how-to-ask which says (my emphasis) "..and keep track of what you find. Even if you don't find a useful answer elsewhere on the site, _including links to related questions that haven't helped [i.e. your research]_ can help others in understanding how your question is different from the rest." No need to be rude - your question is public, anyone can respond. – DisappointedByUnaccountableMod Oct 28 '20 at 17:08

2 Answers2

2

Well, you can parse it with some simple regexes replacing matches as needed, if your string is not more complicated than what you show (e.g. is only composed of those symbols plus letters/numbers). After that, you can just use eval() to run it as python code.

For example:

import re

def func(x):
    # just an example...
    return True

s = "((abc&(def|ghi))|jkl)&mno"

s = re.sub(r'(\w+)', r"func('\1')", s)
s = s.replace('&', ' and ')
s = s.replace('|', ' or ')

print(s)
print(eval(s))

Output:

((func('abc') and (func('def') or func('ghi'))) or func('jkl')) and func('mno')
True
Marco Bonelli
  • 63,369
  • 21
  • 118
  • 128
  • 1
    This does indeed solve the problem. However, this solutions seems to be very error prone in case of a misformatted input string and using eval() is generally considered bad practive (https://stackoverflow.com/questions/1832940/why-is-using-eval-a-bad-practice) so I think there should exist better solutions to this problem. – Julian F Oct 10 '20 at 13:24
  • @JulianFreyberg using `eval()` is bad practice only if you don't know what you are doing and you use it on user input. The function exists for a reason: to be used. Of course, the correct approach is to write a recursive parser, but that's going to be a lot more complicated than this simple and fast to write solution, which is why I added that "disclaimer" before the code. – Marco Bonelli Oct 10 '20 at 13:27
  • I totally agree with you and will use your solution in my case, hence I accepted your answer. I just wanted to point out that there probably exist more failsafe solutions that might be more appropriate in other cases. – Julian F Oct 10 '20 at 13:38
2

This is an interesting little problem, with a number of layers to a solution.

First off, given this sample, you need a basic infix notation parser. In pyparsing, there is a builtin helper method infixNotation. Several pyparsing examples show how to parse a boolean expression using infixNotation. Here is a parser that will parse your sample expression:

import pyparsing as pp

term = pp.Word(pp.alphas)
AND = pp.Literal("&")
OR = pp.Literal("|")
expr =  pp.infixNotation(term,
                         [
                             (AND, 2, pp.opAssoc.LEFT,),
                             (OR, 2, pp.opAssoc.LEFT,),
                         ])

print(expr.parseString(sample).asList())

For your sample, this will print:

[[[['abc', '&', ['def', '|', 'ghi']], '|', 'jkl'], '&', 'mno']]

You can see that we have captured not only the expression, but also the grouping by parentheses.

We can start to do the conversion to your desired output by adding parse actions. These are parse-time callbacks that pyparsing will call, to replace the parsed tokens with a different value (which need not be a string, could be an AST node for evaluation - but in this case we will return a modified string).

AND.addParseAction(lambda: " and ")
OR.addParseAction(lambda: " or ")
term.addParseAction(lambda t: "func('{}')".format(t[0]))
expr.addParseAction(lambda t: "({})".format(''.join(t[0])))

Parse actions can be methods with various signatures:

function()
function(tokens)
function(location, tokens)
function(input_string, location, tokens)

For AND and OR, we only need to replace the parsed operators with their corresponding "and" and "or" keywords. For the parsed variable terms, we want to change "xxx" to "func(xxx)", so we write a parse action that takes the parsed tokens, and returns modified string.

The parse action for expr is interesting because all it does is take the parsed contents, join them using ''.join(), and then wrap that in ()s. Since expr is actually a recursive expression, we will see that it does the proper wrapping in ()'s at each level in the parsed nested list.

After adding these parse actions, we can try calling parseString() again, now giving:

["(((func('abc') and (func('def') or func('ghi'))) or func('jkl')) and func('mno'))"]

Getting close!

To do the formatting into your desired if statement, we can use another parse action. But we can't attach this parse action directly to expr, since we saw that expr (and its associated parse action) will get parsed at all levels of nesting. So instead, we can create an "outer" version of expr, that is simply a container expression of an expr:

outer_expr = pp.Group(expr)

The parse action is similar to what we saw for expr, where we return a new string using the input tokens:

def format_expression(tokens):
    return "if {}:\n    return True".format(''.join(tokens[0]))

outer_expr.addParseAction(format_expression)

Now we use outer_expr to parse the input string:

print(outer_expr.parseString(sample)[0])

Getting:

if (((func('abc') and (func('def') or func('ghi'))) or func('jkl')) and func('mno')):
     return True

(There might be an extra set of ()'s on this value, they could be removed in the parse action for outer_expr if desired.)

Finished version of the parser (uncomment the intermediate print statements to see the progression of the parser functionality):

sample = "((abc&(def|ghi))|jkl)&mno"

import pyparsing as pp

term = pp.Word(pp.alphas)
AND = pp.Literal("&")
OR = pp.Literal("|")
expr =  pp.infixNotation(term,
                         [
                             (AND, 2, pp.opAssoc.LEFT,),
                             (OR, 2, pp.opAssoc.LEFT,),
                         ])

# print(expr.parseString(sample).asList())

AND.addParseAction(lambda: " and ")
OR.addParseAction(lambda: " or ")
term.addParseAction(lambda t: "func('{}')".format(t[0]))
expr.addParseAction(lambda t: "({})".format(''.join(t[0])))

# print(expr.parseString(sample).asList())

def format_expression(tokens):
    return "if {}:\n    return True".format(''.join(tokens[0]))

outer_expr = pp.Group(expr).addParseAction(format_expression)
print(outer_expr.parseString(sample)[0])
PaulMcG
  • 62,419
  • 16
  • 94
  • 130