I have managed to get this working with for-loops, but these are very slow on the large datasets I'm working with, so I am trying to find a way to do this using pandas groupby, apply, and lambda functions instead.
import pandas as pd

example_df = pd.DataFrame({"scen": [1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2],
                           "cusip": ['031162CF5', '031162CF5', '031162CF5', '031162CF5', '38141GWM2', '38141GWM2', '38141GWM2', '38141GWM2', '031162CF5', '031162CF5', '031162CF5', '031162CF5', '38141GWM2', '38141GWM2', '38141GWM2', '38141GWM2'],
                           "wal": [50, 55, 60, 65, 40, 50, 60, 70, 40, 45, 50, 55, 30, 40, 50, 60],
                           "par_val": [900000, 800000, 700000, 600000, 900000, 800000, 700000, 600000, 900000, 800000, 700000, 600000, 900000, 800000, 700000, 600000],
                           "prin_cf": [0, 100000, 100000, 100000, 0, 100000, 100000, 100000, 0, 100000, 100000, 100000, 0, 100000, 100000, 100000],
                           "amortization": [166.67, 0, 0, 0, 208.33, 0, 0, 0, 208.33, 0, 0, 0, 277.78, 0, 0, 0],
                           "book_val": [1000000, 0, 0, 0, 1000000, 0, 0, 0, 1000000, 0, 0, 0, 1000000, 0, 0, 0]})

# Book values become fractional, so store the column as float
example_df['book_val'] = example_df['book_val'].astype(float)

for x in range(1, len(example_df)):
    # Only roll forward within the same bond
    if example_df.loc[x, 'cusip'] == example_df.loc[x-1, 'cusip']:
        # If bond matures, don't report book value
        if example_df.loc[x, 'par_val'] == 0:
            example_df.loc[x, 'book_val'] = 0
        else:
            example_df.loc[x, 'book_val'] = example_df.loc[x-1, 'book_val'] - example_df.loc[x-1, 'amortization'] - example_df.loc[x-1, 'prin_cf']
        example_df.loc[x, 'amortization'] = (example_df.loc[x, 'book_val'] - example_df.loc[x, 'par_val']) / example_df.loc[x, 'wal'] / 12

example_df

The tricky part is that each row's book value depends on the previous row's amortization value, while each amortization value depends on the book value in the same row. Looking at the responses to a similar question here, I think there may be a way to do this using global variables that keep track of the previous values.
Something like:
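
def calc_bv(prin_cf, par_val, wal):
    # Running book value and amortization are carried between calls
    global bvalue, amort
    bvalue = bvalue - amort - prin_cf
    amort = (bvalue - par_val) / wal / 12
    return bvalue, amort

# Assumption: example_df2 is a copy of example_df with the previous row's prin_cf added
example_df2 = example_df.copy()
example_df2['prev_prin_cf'] = example_df2['prin_cf'].shift()

# Note: as written, this carries values across cusip boundaries
example_df['book_val'] = example_df['book_val'].astype(float)
bvalue = example_df.loc[0, 'book_val']
amort = example_df.loc[0, 'amortization']
example_df.loc[1:, ['book_val', 'amortization']] = example_df2[1:].apply(lambda row: calc_bv(row['prev_prin_cf'], row['par_val'], row['wal']), axis=1, result_type="expand").to_numpy()
example_df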
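
For comparison, here is a minimal sketch of the same recurrence without globals, assuming each bond is identified by the pair (scen, cusip) and that the first row of each group already holds its starting book value and amortization. The names `roll_book_value`, `demo`, and `rolled` are hypothetical and only for illustration. It still loops, but over plain NumPy arrays inside each group rather than indexing the DataFrame cell by cell, which may or may not be fast enough on large data.

```python
import pandas as pd

def roll_book_value(group):
    """Hypothetical helper: roll book value and amortization forward within one bond."""
    par = group["par_val"].to_numpy(dtype=float)
    wal = group["wal"].to_numpy(dtype=float)
    cf = group["prin_cf"].to_numpy(dtype=float)
    bv = group["book_val"].to_numpy(dtype=float).copy()
    am = group["amortization"].to_numpy(dtype=float).copy()
    for i in range(1, len(group)):
        if par[i] == 0:
            # If bond matures, don't report book value
            bv[i] = 0.0
        else:
            # Book value depends on the previous row's amortization and cash flow
            bv[i] = bv[i - 1] - am[i - 1] - cf[i - 1]
        # Amortization depends on the book value in the same row
        am[i] = (bv[i] - par[i]) / wal[i] / 12
    return pd.DataFrame({"book_val": bv, "amortization": am}, index=group.index)

# Small illustrative frame: first three rows of each bond from scenario 1
demo = pd.DataFrame({
    "scen": [1, 1, 1, 1, 1, 1],
    "cusip": ["031162CF5"] * 3 + ["38141GWM2"] * 3,
    "wal": [50, 55, 60, 40, 50, 60],
    "par_val": [900000, 800000, 700000, 900000, 800000, 700000],
    "prin_cf": [0, 100000, 100000, 0, 100000, 100000],
    "amortization": [166.67, 0, 0, 208.33, 0, 0],
    "book_val": [1000000, 0, 0, 1000000, 0, 0],
})

cols = ["wal", "par_val", "prin_cf", "amortization", "book_val"]
rolled = (
    demo.groupby(["scen", "cusip"], sort=False, group_keys=False)[cols]
    .apply(roll_book_value)
)

# Whole-column assignment aligns on the original index
demo["book_val"] = rolled["book_val"]
demo["amortization"] = rolled["amortization"]
```

Because the state lives in local arrays, each group restarts from its own first row, which the global-variable version above does not do on its own.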