7

Consider that we have the following input

formula = "(([foo] + [bar]) - ([baz]/2) )"

function_mapping = {
                   "foo" : FooFunction,
                   "bar" : BarFunction,
                   "baz" : BazFunction,  
                  }

Is there any python library that lets me parse the formula and convert it into a python function representation.

eg.

converted_formula = ((FooFunction() + BarFunction() - (BazFunction()/2))

I am currently looking into something like

In [11]: ast = compiler.parse(formula)

In [12]: ast
Out[12]: Module(None, Stmt([Discard(Sub((Add((List([Name('foo')]), List([Name('bar')]))), Div((List([Name('baz')]), Const(2))))))]))

and then process this ast tree further.

Do you know of any cleaner alternate solution? Any help or insight is much appreciated!

martineau
  • 119,623
  • 25
  • 170
  • 301
  • One potential problem with using `compiler.parse()` is that it parses according to Python syntax, which is why it turned the `[foo]` in the formula into `List([Name('foo')])`. What is the syntax being used in the formulas? – martineau Apr 01 '16 at 09:20
  • @martineau True, there is a problem with the formula structure sicne it collides with the python List type. I define the syntax of the formulas, hence can give something like `In [18]: formula = "((foo + bar) - (baz/2) )" In [19]: ast = compiler.parse(formula) In [20]: ast Out[20]: Module(None, Stmt([Discard(Sub((Add((Name('foo'), Name('bar'))), Div((Name('baz'), Const(2))))))]))` – alchemist111 Apr 01 '16 at 18:24
  • You may be able to sidestep the syntax issue with `compiler.parse()` by "simply" doing text substitution as I've showm in my [answer](http://stackoverflow.com/a/36354572/355230) below. That said, it might be better to define the syntax of your formula expressions so it doesn't conflict with existing Python syntax. For example, instead of the `re` module, it might be possible to do what you want using [`string.Template`](https://docs.python.org/2/library/string.html#string.Template.template) substitution `$` syntax which might be easier to understand and implement. – martineau Apr 01 '16 at 19:02

3 Answers3

4

You could use the re module to do what you want via regular-expression pattern matching and relatively straight-forward text substitution.

import re

alias_pattern = re.compile(r'''(?:\[(\w+)\])''')

def mapper(mat):
    func_alias = mat.group(1)
    function = function_alias_mapping.get(func_alias)
    if not function:
        raise NameError(func_alias)
    return function.__name__ + '()'

# must be defined before anything can be mapped to them
def FooFunction(): return 15
def BarFunction(): return 30
def BazFunction(): return 6

function_alias_mapping =  dict(foo=FooFunction, bar=BarFunction, baz=BazFunction)
formula = "(([foo] + [bar]) - ([baz]/2))"  # Custom formula.

converted_formula = re.sub(alias_pattern, mapper, formula)
print('converted_formula = "{}"'.format(converted_formula))

# define contexts and function in which to evalute the formula expression
global_context = dict(FooFunction=FooFunction,
                      BarFunction=BarFunction,
                      BazFunction=BazFunction)
local_context = {'__builtins__': None}

function = lambda: eval(converted_formula, global_context, local_context)
print('answer = {}'.format(function()))  # call function

Output:

converted_formula = "((FooFunction() + BarFunction()) - (BazFunction()/2))"
answer = 42
martineau
  • 119,623
  • 25
  • 170
  • 301
  • This works pretty well! Thank you! I was steering away from using `eval` since I was worried about the security implications. As long as I validate the input thoroughly, do you think this would be a valid use case for `eval` ? – alchemist111 Apr 01 '16 at 18:30
  • What do you mean "pretty well"? `eval` is OK if you take certain precautions — see updated answer. – martineau Apr 01 '16 at 18:43
  • 1
    I think I wanted to say 'Works Perfectly!' :) Thanks for the update! Will make sure that I take the necessary precautions before using eval. – alchemist111 Apr 01 '16 at 20:16
0

You can use what's called string formatting to accomplish this.

function_mapping = {
                   "foo" : FooFunction(),
                   "bar" : BarFunction(),
                   "baz" : BazFunction(),  
                  }

formula = "(({foo} + {bar}) - ({baz}/2) )".format( **function_mapping )

Will give you the result of ((FooFunction() + BarFunction() - (BazFunction()/2))

But I believe the functions will execute when the module is loaded, so perhaps a better solution would be

function_mapping = {
                   "foo" : "FooFunction",
                   "bar" : "BarFunction",
                   "baz" : "BazFunction",  
                  }

formula = "(({foo}() + {bar}()) - ({baz}()/2) )".format( **function_mapping )

This will give you the string '((FooFunction() + BarFunction() - (BazFunction()/2))' which you can then execute at any time with the eval function.

hostingutilities.com
  • 8,894
  • 3
  • 41
  • 51
  • The parenthesis of functions are missing in the second example. Other detail: if a function is present multiple times in the formula, it will be called multiple times. [LRU](https://docs.python.org/3/library/functools.html#functools.lru_cache), dirty flag or caching can be useful here. – aluriak Apr 01 '16 at 07:27
  • @aluriak I did not know there was a built in way to automatically memoize functions for you. That is way cool. And I've updated my answer to include the parenthesis as well. Thanks for pointing that out. – hostingutilities.com Apr 01 '16 at 07:33
  • 2
    "This will give you the string '((FooFunction() + BarFunction() - (BazFunction()/2))'", actually no.... this is completely wrong.... – Netwave Apr 01 '16 at 07:35
  • @Mr.Me : Daniel is right, the string you get will be something like `In [10]: formula Out[10]: '((() + ()) - (()/2) )'` . Thank you though! – alchemist111 Apr 01 '16 at 18:26
  • I really should fully test my code before posting it. Putting the function name in a string will fix this. If you're not able to hard code this in, there are ways of getting the string name of any function. See http://stackoverflow.com/questions/251464/how-to-get-a-function-name-as-a-string-in-python for info on how to do that. – hostingutilities.com Apr 01 '16 at 18:44
  • Yes, you really should test your code before posting it (and making false claims about what it does). In some programming methodologies the first thing you're supposed to do is write the test code. – martineau Apr 01 '16 at 19:19
0

If you change the syntax used in the formulas slightly, (another) way to do this — as I mentioned in a comment — would be to use string.Template substitution.

Out of curiosity I decided to find out if this other approach was viable — and consequently was able to come up with better answer in the sense that not only is it simpler than my other one, it's also a little more flexible in the sense that it would be easy to add arguments to the functions being called as noted in a comment below.

from string import Template

def FooFunction(): return 15
def BarFunction(): return 30
def BazFunction(): return 6

formula = "(($foo + $bar) - ($baz/2))"

function_mapping = dict(foo='FooFunction()',  # note these calls could have args
                        bar='BarFunction()',
                        baz='BazFunction()')

converted_formula = Template(formula).substitute(function_mapping)
print('converted_formula = "{}"'.format(converted_formula))

# define contexts in which to evalute the expression
global_context = dict(FooFunction=FooFunction,
                      BarFunction=BarFunction,
                      BazFunction=BazFunction)
local_context = dict(__builtins__=None)
function = lambda: eval(converted_formula, global_context, local_context)

answer = function()  # call it
print('answer = {}'.format(answer))

As a final note, notice that string.Template supports different kinds of Advanced usage which would allow you to fine-tune the expression syntax even further — because internally it uses the re module (in a more sophisticated way than I did in my original answer).

For the cases where the mapped functions all return values that can be represented as Python literals — like numbers — and aren't being called just for the side-effects they produce, you could make the following modification which effectively cache (aka memoize) the results:

function_cache = dict(foo=FooFunction(),  # calls and caches function results
                      bar=BarFunction(),
                      baz=BazFunction())

def evaluate(formula):
    print('formula = {!r}'.format(formula))
    converted_formula = Template(formula).substitute(function_cache)
    print('converted_formula = "{}"'.format(converted_formula))
    return eval(converted_formula, global_context, local_context)

print('evaluate(formula) = {}'.format(evaluate(formula)))

Output:

formula = '(($foo + $bar) - ($baz/2))'
converted_formula = "((15 + 30) - (6/2))"
evaluate(formula) = 42
Community
  • 1
  • 1
martineau
  • 119,623
  • 25
  • 170
  • 301