1

I have a Django app that's a calculator. Users configure an arbitrarily deep calculation (think an Excel formula) on one screen, and then enter (cell) data on another.

After linking the fields to their values, I end up with a formula of the form

SUM(1,2,4)

Which may be arbitrarily deep, e.g.

SUM(1,SUM(5,DIFFERENCE(6,DIVISION(8,10),7),4),2)

One of the formulas giving me a headache is one of the more complicated the users input in our system:

ROUND(MULTIPLICATION(DIVISION(ROUND(SUM(MULTIPLICATION(50.00000,50.00000),MULTIPLICATION(50.00000,50.00000,2),MULTIPLICATION(50.00000,50.00000)),-2),300),DIFFERENCE(DIFFERENCE(41,7),0.3)),0)

I'm using pyparsing to parse the formula and extract the values and nested formulas, and recursively performing the calculations. The problem is that I'm running into the recursion limit with pyparsing due to parsing each nested calculation.

My recursion code:

class Calculator:
def __init__(self, formula=None):
    self.formula = formula

def do_calculation(self):
    # parse the formula we receive, which returns arguments in groups of numbers and nested calculations
    # e.g. SUM(MULTIPLICATION(12,11),1,5)
    parsed_formula = FormulaParser(self.formula).get_parsed_formula()

    #calculation name is the outermost level calculation
    calc_name = parsed_formula['calculation_name']
    #don't stomp on built in round
    if 'ROUND' in "".join(calc_name):
        calc_name = ["ROUND_CALCULATION"]
    #don't stomp on if
    if 'IF' == "".join(calc_name):
        calc_name = ['IF_STATEMENT']
    #grab the name of the calculation, will match a function name below
    ex = getattr(self, string.lower("".join(calc_name)))
    calc_arguments = []
    #formulas need to be recursively executed
    formulas = parsed_formula.args.formulas.asList() if len(parsed_formula.args.formulas) else []
    #numbers are just added to the arguments
    dnumbers = parsed_formula.args.dnumbers.asList() if len(parsed_formula.args.dnumbers) else []

    for arg in parsed_formula.args.asList():
        if arg in dnumbers:
            calc_arguments.append(''.join(arg))
        elif arg in formulas:
            new_calc = Calculator(''.join(self.flatten(arg[:])))
            calc_arguments.append(new_calc.do_calculation())

    #execute the calculation with the number arguments
    for idx, arg in enumerate(calc_arguments):
        if isinstance(arg, dict) and arg['rounding']:
            calc_arguments[idx] = arg['result']
    result = ex(*calc_arguments)
    #for rounding, output is special to tell the api to not format to default 5 decimal places
    if 'ROUND' in "".join(calc_name):
        return dict(result=result, rounding=True)
    return result

# function called on nested calculations that may have other nested calculations to flatten to a single level list
@staticmethod
def flatten(expr):
    for i, x in enumerate(expr):
        while isinstance(expr[i], list):
            expr[i:i + 1] = expr[i]
    return expr

And the parser for formulas:

class FormulaParser():
def __init__(self, formula=None):
    self.formula = formula

    # grammar
    # end result
    expr = Forward()
    formula = Forward()

    #calculation keywords
    calc_keyword = lambda name: Keyword(name)
    calculations = [calc_keyword(calc) for calc in CALCULATION_TYPES]
    calc_name = Group(reduce(lambda y, z: y | z, [x for x in calculations])).setResultsName('calculation_name')

    #symbols
    oparen, cparen, comma, dot, minus = map(Literal, '(),.-')
    dnumber = Combine(Optional(minus) + Word(nums) + Optional(dot + Word(nums)))

    #possible formulas
    expr = Group(formula).setResultsName('formulas', listAllMatches=True) | Group(dnumber).setResultsName(
        'dnumbers', listAllMatches=True)
    exprs = expr + ZeroOrMore(comma + expr)

    #entire formula
    formula << Combine(calc_name + Group(oparen + exprs + cparen).setResultsName('args'))
    self.parsed_formula = formula

def get_parsed_formula(self):
    if self.formula:
        return self.parsed_formula.parseString(self.formula)

    return None

I've already refactored other recursive parts of my app into iteration using the stack approach in this SO answer.

While that's doable in linking field definitions to user input, I'm having more trouble wrapping my head around how to execute a calculation once it boils down to only arguments, then passing the results to the next stack level, and so on.

Community
  • 1
  • 1
Michal
  • 995
  • 6
  • 10

1 Answers1

2

I'm not sure this will help you get your head wrapped around recursing through the parsed stack. If you just want to evaluate the expression, then you can take care of everything using parse actions, and evaluate as you go at parse time. See the embedded comments in my mods to your supplied source code:

sample = """ROUND(MULTIPLICATION(DIVISION(ROUND(SUM(MULTIPLICATION(50.00000,50.00000),MULTIPLICATION(50.00000,50.00000,2),MULTIPLICATION(50.00000,50.00000)),-2),300),DIFFERENCE(DIFFERENCE(41,7),0.3)),0)"""

from pyparsing import *

CALCULATION_TYPES = "ROUND MULTIPLICATION DIVISION SUM DIFFERENCE".split()

functionMap = {
    "ROUND"          : lambda args: round(args[0]),
    "MULTIPLICATION" : lambda args: args[0]*args[1],
    "DIVISION"       : lambda args: args[0]/args[1],
    "SUM"            : lambda args: args[0]+args[1],
    "DIFFERENCE"     : lambda args: args[0]-args[1],
    }

class FormulaParser():
    def __init__(self, formula=None):
        self.formula = formula

        # grammar
        # end result
        expr = Forward()
        formula = Forward()

        #calculation keywords
        calc_keyword = lambda name: Keyword(name)
        calculations = [calc_keyword(calc) for calc in CALCULATION_TYPES]
        calc_name = Group(reduce(lambda y, z: y | z, [x for x in calculations])).setResultsName('calculation_name')

        # a simpler way to create a MatchFirst of all your calculations
        # also, save the results names for when you assemble small elements into larger ones
        calc_name = MatchFirst(calculations)

        #symbols
        oparen, cparen, comma, dot, minus = map(Literal, '(),.-')
        #dnumber = Combine(Optional(minus) + Word(nums) + Optional(dot + Word(nums)))
        # IMPORTANT - convert numbers to floats at parse time with this parse action
        dnumber = Regex(r'-?\d+(\.\d+)?').setParseAction(lambda toks: float(toks[0]))

        #possible formulas
        #expr = Group(formula).setResultsName('formulas', listAllMatches=True) |
        #       Group(dnumber).setResultsName('dnumbers', listAllMatches=True)
        #exprs = expr + ZeroOrMore(comma + expr)

        #entire formula
        #formula << Combine(calc_name + Group(oparen + exprs + cparen).setResultsName('args'))
        #self.parsed_formula = formula

        # define what is allowed for a function arg
        arg_expr = dnumber | formula
        def eval_formula(tokens):
            fn = functionMap[tokens.calculation_name]
            return fn(tokens.args)

        # define overall formula, and add results names here
        formula <<= (calc_name("calculation_name") + oparen 
                                        + Optional(delimitedList(arg_expr))('args') 
                                        + cparen).setParseAction(eval_formula)
        self.parsed_formula = formula


    def get_parsed_formula(self):
        if self.formula:
            return self.parsed_formula.parseString(self.formula)

        return None

fp = FormulaParser(sample)
print fp.get_parsed_formula()
PaulMcG
  • 62,419
  • 16
  • 94
  • 130
  • Thanks. This is a nice solution. In the end, I decided against mixing concerns though. I was overthinking things. I simplified the parser to just the tokens I was interested in, and used a simple stack implementation to do the calculations. – Michal Mar 26 '15 at 19:47