I am trying to build a generic cash flow model in Python / pandas, made up of expressions such as

'Profit=Revenue-Cost'

'Capital=Capital[T-1]+Profit'

'3YearProfit=Profit[T-2:].sum()'

'Capital[0]=100'

etc.

The model would be built into a pandas DataFrame df in which each row represents one period of the cash flow (e.g. months or quarters). The first row refers to the first period (Time 0), the second row to the next period, and so on.

Each of the above expressions would need to be translated into an iteration over the rows of the dataframe, such as

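for ndx in range(len(df)):  # ndx is the row position, so T-1 is position ndx-1
    row = df.index[ndx]
    df.loc[row,'Profit']=df.loc[row,'Revenue']-df.loc[row,'Cost']
    if ndx > 0:  # Capital[0] is an initial value; ndx-1 would wrap around to the last row
        df.loc[row,'Capital']=df.loc[df.index[ndx-1],'Capital']+df.loc[row,'Profit']
    # window is T-2..T inclusive; max() keeps the slice from wrapping in the first periods
    df.loc[row,'3YearProfit']=df.loc[df.index[max(ndx-2,0):ndx+1],'Profit'].sum()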

The for loop would update all the variables according to the expressions, starting from T=0 and running up to the last period (the last row) of the dataframe.
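To make the intended behaviour concrete, here is a minimal runnable sketch of that loop on made-up data. The sample figures, the function name run_model, and the reading of 'Capital[0]=100' as an initial value that the loop leaves untouched are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data: four periods of Revenue and Cost.
df = pd.DataFrame({"Revenue": [10.0, 20.0, 30.0, 40.0],
                   "Cost": [4.0, 5.0, 6.0, 7.0]})
for col in ("Profit", "Capital", "3YearProfit"):
    df[col] = 0.0
df.loc[df.index[0], "Capital"] = 100.0  # assumed meaning of 'Capital[0]=100'


def run_model(df):
    """Apply the three example expressions period by period, from T=0 onwards."""
    for pos in range(len(df)):
        row = df.index[pos]
        df.loc[row, "Profit"] = df.loc[row, "Revenue"] - df.loc[row, "Cost"]
        if pos > 0:  # the first period keeps its initial Capital
            prev = df.index[pos - 1]
            df.loc[row, "Capital"] = df.loc[prev, "Capital"] + df.loc[row, "Profit"]
        # Periods T-2..T inclusive, truncated at the start of the frame.
        window = df.index[max(pos - 2, 0): pos + 1]
        df.loc[row, "3YearProfit"] = df.loc[window, "Profit"].sum()
    return df


result = run_model(df)
print(result)
```

Iterating over positions rather than over df.index keeps the T-1 arithmetic valid even when the index holds dates or other labels.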

Ideally the regex parser would identify the substrings of an expression that are variable names, each optionally followed by a [x] or [x:y] suffix, where x and y are negative offsets counting periods back from the current one. If the suffix is absent, the variable is assumed to refer to time T, i.e. equivalent to [0].

To avoid confusion, the allowed model variables should probably be declared in a modelVarList Series or dictionary, so that the regex only tries strings defined in modelVarList.
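A rough sketch of what such a regex translation might look like, using only the standard re module. The names model_var_list, make_translator and translate are hypothetical. It assumes the offset may be written with or without a leading T, that an open-ended slice such as [T-2:] runs up to and including the current period, and that an explicit end follows Python's exclusive convention. It only handles right-hand sides, does not handle open-start slices such as [:-1], and would also rewrite variable names that appear inside string literals.

```python
import re

# Hypothetical declaration of the allowed model variables.
model_var_list = ["Revenue", "Cost", "Profit", "Capital", "3YearProfit"]


def make_translator(names):
    """Return a function that rewrites declared variables into df.loc lookups."""
    # Longest names first, so a short name never shadows a longer one.
    alternation = "|".join(re.escape(n) for n in sorted(names, key=len, reverse=True))
    pattern = re.compile(
        r"(?<!\w)(?P<name>" + alternation + r")(?!\w)"
        r"(?:\[\s*T?\s*(?P<start>-\s*\d+|0)?\s*"
        r"(?:(?P<colon>:)\s*T?\s*(?P<stop>-\s*\d+|0)?\s*)?\])?"
    )

    def position(offset):
        # A missing offset or 0 means the current period; "-2" becomes "ndx-2".
        n = int(re.sub(r"\s+", "", offset)) if offset else 0
        return "ndx" if n == 0 else f"ndx{n}"

    def replace(match):
        name = match.group("name")
        start = position(match.group("start"))
        if match.group("colon") is None:
            return f"df.loc[df.index[{start}], {name!r}]"
        stop_raw = match.group("stop")
        stop = position(stop_raw) if stop_raw else "ndx+1"
        # max() stops a negative start from wrapping around to the end of the frame.
        return f"df.loc[df.index[max({start}, 0):{stop}], {name!r}]"

    return lambda expr: pattern.sub(replace, expr)


translate = make_translator(model_var_list)
print(translate("Capital[T-1]+Profit"))
print(translate("Profit[T-2:].sum()"))
```

The translated string still refers to df and ndx, so it is meant to be evaluated inside a loop over row positions. A single-period lookup such as Capital[T-1] is not guarded here, so the caller would have to skip or special-case the first period.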

Ideally, expressions could be quite complex: each would go from [expression] to [parsedExpression] once the replacements have been made, and eval(parsedExpression) would then produce the actual outcome. A parser that recognises Python / numpy / pandas keywords and syntax would therefore be much more helpful than one that recognises only a few operators such as +, -, *, /.
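One candidate for a parser that already understands Python syntax is the standard library's ast module. The sketch below is only an illustration under several assumptions: Python 3.9 or later, variable names that are valid Python identifiers (so not '3YearProfit'), and right-hand sides only. MODEL_VARS, BareNameToCurrent, Lagged and evaluate are hypothetical names. Instead of generating df.loc text, it rewrites a bare variable X as X[T] and evaluates the result against small wrapper objects. The empty __builtins__ mapping and the name check narrow what an expression can reach, but they do not make eval safe for untrusted input.

```python
import ast
import pandas as pd

MODEL_VARS = {"Revenue", "Cost", "Profit", "Capital"}  # hypothetical declared variables


class BareNameToCurrent(ast.NodeTransformer):
    """Rewrite a bare model variable X as X[T]; leave an explicit X[...] as written."""

    def visit_Subscript(self, node):
        if isinstance(node.value, ast.Name) and node.value.id in MODEL_VARS:
            node.slice = self.visit(node.slice)  # only look inside the brackets
            return node
        return self.generic_visit(node)

    def visit_Name(self, node):
        if node.id in MODEL_VARS:
            current = ast.Name(id="T", ctx=ast.Load())
            return ast.Subscript(value=node, slice=current, ctx=ast.Load())
        return node


class Lagged:
    """Positional view of one column as seen from period `pos`."""

    def __init__(self, series, pos):
        self.series, self.pos = series, pos

    def __getitem__(self, key):
        if isinstance(key, slice):
            start = 0 if key.start is None else max(key.start, 0)
            # Assumption: an open-ended slice stops at the current period.
            stop = self.pos + 1 if key.stop is None else max(key.stop, 0)
            return self.series.iloc[start:stop]
        if key < 0:
            raise IndexError("reference to a period before the first row")
        return self.series.iloc[key]


def evaluate(expr, df, pos, extra_names=None):
    """Evaluate the right-hand side `expr` for the row at position `pos`."""
    extra_names = extra_names or {}
    tree = ast.parse(expr, mode="eval")
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    unknown = used - MODEL_VARS - {"T"} - set(extra_names)
    if unknown:
        raise NameError(f"undeclared names: {sorted(unknown)}")
    tree = ast.fix_missing_locations(BareNameToCurrent().visit(tree))
    namespace = {name: Lagged(df[name], pos) for name in MODEL_VARS}
    namespace.update(extra_names, T=pos)
    return eval(compile(tree, "<model>", "eval"), {"__builtins__": {}}, namespace)


# Hypothetical sample data with every declared variable present as a column.
sample = pd.DataFrame({"Revenue": [10.0, 20.0, 30.0], "Cost": [4.0, 5.0, 6.0],
                       "Profit": [6.0, 15.0, 24.0], "Capital": [100.0, 0.0, 0.0]})
print(evaluate("Capital[T-1]+Profit", sample, 1))
print(evaluate("Profit[T-2:].sum()", sample, 2))
```

Additional helpers such as numpy could be made available by passing them through extra_names, for example evaluate(expr, df, pos, extra_names={"np": np}).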

Is such a parser available, how do I access it, and are there any clues on how to set up the process?

GIG
  • First you would need to define a complete syntax for your input, then it's maybe parsable via regex – user8408080 Dec 19 '18 at 17:49
  • my expectation is that the same python syntax applies, hence an eval() could evaluate the expression. The only syntax I am hopefully adding is the varName[-x:-y] that gets expanded into df.loc[df.index[ndx-x:ndx-y],varName] – GIG Dec 19 '18 at 17:58
  • This should all in all be possible, if it's not allowed for example to go deeper than one level of parenthesis. I would not recommend to use `eval` though, as it calls *everything* you put in there. Not very safe, even if you don't distribute this to others. Instead, maybe look at [PyParsing](https://stackoverflow.com/a/2371789) – user8408080 Dec 19 '18 at 18:52

0 Answers