I have a very simple excel spreadsheet with a table that goes something like this:
cash | Margin |
---|---|
0 | 0.45 |
=if(B2>0.5, A2, 1) | =if(A3>0.2,1,0) |
Some cells calling other cells on the same rows or on previous rows. The idea is to generate a table like this a few hundred of thousands of times with diff inputs/params.
I did that first using VBA with arrays. Setting the first-row value and then looping through the rest which takes quite long. With a bit of optimization, I ran the simulation in 3 hours.
I then moved on to python using pandas thinking it would be much faster. But again, because of the dependency between one cell and the cell above I have to use for loops. A simple df['margin']=df['blah']*margin_rate does not work.
My code therefore looks something like this:
df.loc[0, 'cash collateral'] = 0
df.loc[0, 'release'] = 1 if df.loc[0, 'LTV'] > max_value else 0
...
for i in range(1, len(df)):
if df.loc[i-1, 'call'] == 1:
df.loc[i, 'release'] = df.loc[i-1, 'release']
...
It works! But it's hardly faster than VBA. Do you know how I could fasten it up? Maybe some recursive functions, or using NumPy over pandas?
Thanks for taking the time to read!