I am trying to build a generic cash flow model in Python / pandas, made up of expressions such as
'Profit=Revenue-Cost'
'Capital=Capital[T-1]+Profit'
'3YearProfit=Profit[T-2:].sum()'
'Capital[0]=100'
etc.
The model would be built into a pandas dataframe df in which each row represents one period of the cash flow (e.g. a month or a quarter). The first row refers to the first period (Time 0), the second row to the next period, and so on.
Each of the above expressions would need to be translated into an iteration over the rows of the dataframe, such as
for i in range(len(df)):
    row = df.index[i]
    df.loc[row,'Profit']=df.loc[row,'Revenue']-df.loc[row,'Cost']
    if i>0:  # Capital[0] is set as an initial value, so start from the second row
        df.loc[row,'Capital']=df.loc[df.index[i-1],'Capital']+df.loc[row,'Profit']
    df.loc[row,'3YearProfit']=df.loc[df.index[max(i-2,0):i+1],'Profit'].sum()
The for loop would update all the variables according to the expressions, starting from T=0 and running up to the last period (last row) of the dataframe.
Ideally, a regex parser would identify the substrings of the expression that are variable names, each optionally followed by a [x] or [x:y] suffix, where x and y are offsets relative to the current period T (zero or negative, i.e. a number of periods back). If the suffix is absent, the variable is assumed to refer to time T, i.e. equivalent to [0].
To avoid confusion, the allowed model variables should probably be declared in a modelVarList Series or dictionary, so that the regex only tries to match strings defined in modelVarList.
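To make the idea concrete, here is a minimal sketch of how such a restricted regex might look. It is only an illustration under assumptions: model_var_list and find_references are hypothetical names, and the suffix is captured as raw text rather than interpreted.

```python
import re

# Hypothetical list of declared model variables (an assumption for illustration).
model_var_list = ["Profit", "Revenue", "Cost", "Capital", "3YearProfit"]

# Put longer names first so a name that is a prefix of another cannot shadow it.
names = sorted(model_var_list, key=len, reverse=True)

# A declared name, not embedded in a longer identifier or attribute access,
# optionally followed by a bracketed suffix such as [T-1] or [T-2:].
VAR_RE = re.compile(
    r"(?<![\w.])(" + "|".join(map(re.escape, names)) + r")(?!\w)"
    r"(?:\[([^\]]*)\])?"
)

def find_references(expression):
    """Return (name, suffix) pairs; suffix is None when no [...] follows the name."""
    return [(m.group(1), m.group(2)) for m in VAR_RE.finditer(expression)]

print(find_references("Capital=Capital[T-1]+Profit"))
print(find_references("3YearProfit=Profit[T-2:].sum()"))
```

Because only declared names are matched, method names such as sum are left untouched.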
Ideally, the expressions could be quite complex: each [expression] would be rewritten into a [parsedExpression] with the replacements made, and eval(parsedExpression) would then produce the actual outcome. A parser that recognises Python / numpy / pandas keywords and syntax would therefore be much more helpful than one that recognises only a few operators such as +, -, *, /.
Is there such a parser available? If so, how do I access it, and are there any pointers on how to set up the process?
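For illustration, the rewrite-then-eval process could be sketched roughly as follows. This is an assumption-laden sketch rather than an existing parser: MODEL_VARS, translate and run are hypothetical names, only the T-relative suffixes [T-k] and [T-k:] are handled, initial values such as Capital[0]=100 are set directly on the dataframe, and eval should only ever be given trusted strings.

```python
import re
import pandas as pd

# Hypothetical declared variables (an assumption for illustration).
MODEL_VARS = ["Revenue", "Cost", "Profit", "Capital", "3YearProfit"]
REF_RE = re.compile(
    r"(?<![\w.])(" + "|".join(map(re.escape, MODEL_VARS)) + r")(?!\w)"
    r"(?:\[T(-\d+)?(:)?\])?"
)

def translate(rhs):
    """Rewrite Name, Name[T-k] and Name[T-k:] into positional lookups at row t."""
    def repl(m):
        name, lag, is_slice = m.group(1), int(m.group(2) or 0), m.group(3)
        if is_slice:  # window from T-k up to and including T, clipped at row 0
            return f"df[{name!r}].iloc[max(t{lag:+d}, 0):t + 1]"
        return f"df[{name!r}].iloc[t{lag:+d}]"
    return REF_RE.sub(repl, rhs)

def run(df, expressions):
    """Evaluate each 'Target=expression' string row by row, in the order given."""
    compiled = []
    for expr in expressions:
        target, rhs = expr.split("=", 1)
        # Scalar lags decide the first row where the expression can be evaluated.
        lags = [int(m.group(2) or 0) for m in REF_RE.finditer(rhs) if not m.group(3)]
        first_t = -min(lags, default=0)
        code = compile(translate(rhs), "<model>", "eval")
        compiled.append((target.strip(), code, first_t))
    for t in range(len(df)):
        for target, code, first_t in compiled:
            if t >= first_t:  # earlier rows keep their initial values
                df.loc[df.index[t], target] = eval(code, {"df": df, "t": t})
    return df

df = pd.DataFrame({"Revenue": [100.0, 110.0, 120.0], "Cost": [60.0, 70.0, 80.0]})
for col in ("Profit", "Capital", "3YearProfit"):
    df[col] = 0.0
df.loc[df.index[0], "Capital"] = 100.0  # initial value, i.e. Capital[0]=100

run(df, ["Profit=Revenue-Cost",
         "Capital=Capital[T-1]+Profit",
         "3YearProfit=Profit[T-2:].sum()"])
print(df)
```

Since the rewritten string is ordinary Python, anything that eval accepts (method calls, numpy functions added to the namespace, and so on) would pass through unchanged; only the declared variable names are substituted.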