I have a dataframe and a series of rates. My calculation is very simple:
new_row(n) = new_row(n-1)*rate + old_row(n)
I have 20 columns in my dataframe, and rate is a series of 20 values (one per column). I have written code using loops that takes nearly 9 seconds to run. I believe this is not the ideal way of doing this exercise, and I would like to find a more Pythonic way of doing it.
import pandas as pd

data = pd.read_csv('data.csv')
ret_rate = pd.read_csv('Retention_Rate.csv')
ret_dat = data.copy()
# Columns 0-3 are skipped; the recurrence is applied from column 4 onward.
for i in range(4, ret_dat.shape[1]):
    for j in range(1, ret_dat.shape[0]):
        # Carry over from the previous row only when it belongs to the same market.
        if ret_dat['MARKET_ID'][j] == ret_dat['MARKET_ID'][j-1]:
            ret_dat.iloc[j, i] = ret_dat.iloc[j, i] + ret_rate.iloc[i-4, 0] * ret_dat.iloc[j-1, i]
ret_dat.to_csv('adstock_data_v3.csv')
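To make the recurrence concrete, here is a minimal sketch on made-up data. It is not my real dataset: the function name `adstock`, the frame `toy`, the column names `tv` and `radio`, and the rates in `toy_rates` are all hypothetical. It still loops over rows, but it updates every column in one NumPy operation per row, which may be one direction for speeding things up.

```python
import numpy as np
import pandas as pd

def adstock(values, rates, group_ids):
    """Apply new[n] = new[n-1] * rate + old[n] down each column,
    restarting whenever the group id differs from the previous row."""
    out = np.asarray(values, dtype=float).copy()
    rates = np.asarray(rates, dtype=float)
    group_ids = np.asarray(group_ids)
    for n in range(1, out.shape[0]):
        # Same group as the previous row: carry over the decayed previous values.
        if group_ids[n] == group_ids[n - 1]:
            out[n] += rates * out[n - 1]  # all columns updated at once
    return out

# Hypothetical toy data, for illustration only.
toy = pd.DataFrame({
    "MARKET_ID": [1, 1, 1, 2, 2],
    "tv": [10.0, 0.0, 0.0, 4.0, 0.0],
    "radio": [1.0, 1.0, 1.0, 1.0, 1.0],
})
toy_rates = np.array([0.5, 0.1])  # one assumed rate per value column
cols = ["tv", "radio"]

result = toy.copy()
result[cols] = adstock(toy[cols].to_numpy(), toy_rates, toy["MARKET_ID"].to_numpy())
print(result)
```

With these inputs the `tv` column becomes 10, 5, 2.5 for market 1 and then restarts at 4, 2 for market 2, which is the behaviour I expect from the formula.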
I have put the data in a Google sheet.