0

I have the following dataframe. How do I create a new column that has the monthly return?

Exhibit A:

Scenario TimeStep CumReturn
1 0 1
1 1 1.05
1 2 1.07
2 0 1
2 1 1.04
2 2 1.02

Exhibit B:

Scenario TimeStep CumReturn
1 0 .05
1 1 .019
1 2 na
2 0 .04
2 1 -.019
2 2 na

Where row n is (n-1 / n)-1 for each change in time, stopping at the end of each scenario.

The end goal is to do this iteratively for cumulative indeces, then create a linearly combination from the individual returns, then RE roll to a new cumulative number.

I am doing this in pandas.

So far I have this code:

  for idx,wght, i in zip( indeces,fundweights , range(len(indeces)-1)):
        df_temp = pd.read_csv(filepath + file_names[idx],header=None)
        df_temp = df_temp.stack()
        df_temp.index.names=['Scen','TIME']
        df_temp= df_temp.to_frame(name='CumGrowth_idx')
        df_temp.reset_index(inplace=True)
        df_temp['Scen'] = df_temp['Scen'] + 1 
        df_temp[idx] = idx
        df_temp = df_temp.rename(columns={'TIME':'Month'})
        df_temp["Monthly_Return"]= 

I want to create the monthly return dataframe, then append these all to a dataframe, take a linear combination. Can anyone offer some input?

Thanks

ti7
  • 16,375
  • 6
  • 40
  • 68
N27
  • 31
  • 5
  • 2
    Please update your input samples to be properly formatted. See [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for best practices related to Pandas questions. – itprorh66 Jun 01 '23 at 20:49

2 Answers2

1

While there is probably a much more elegant way to accomplish this with some of Pandas' merge and group methods, here is a method that will work.

First of all define a function to split the dataframe by Scenario Indexs, then update the cumReturn column using the shift function, finally concat the separate frames back into 1 frame as follows:

import pandas as pd

def computeDifference(df: pd.DataFrame)-> pd.DataFrame:
    scnslist = df['Scenario'].unique()
    rslt = df[df['Scenario'] == scnslist[0]].copy()
    rslt['CumReturn'] = (rslt['CumReturn'].shift(-1)/rslt['CumReturn'])-1
    for i in range(1, len(scnslist)):
        dx = df[df['Scenario'] == scnslist[i]].copy()
        dx['CumReturn'] = (dx['CumReturn'].shift(-1)/dx['CumReturn'])-1
        rslt = pd.concat([rslt, dx], axis=0, ignore_index=True)
    return rslt

When given the dataframe in Exhibit 1, defined as df you can execute computeDifference(df) which yields:

    Scenario    TimeStep    CumReturn
0   1   0   0.050000
1   1   1   0.019048
2   1   2   NaN
3   2   0   0.040000
4   2   1   -0.019231
5   2   2   NaN
itprorh66
  • 3,110
  • 4
  • 9
  • 21
  • I was able to accomplish this by using shift however I did not need to loop.. as I used group by for the scenarios. – N27 Jun 06 '23 at 17:58
0
 df_temp = pd.read_csv(filepath + file_names_equity[idx],header=None)
        df_temp = df_temp.stack()
        df_temp.index.names=['Scen','TIME']
        df_temp= df_temp.to_frame(name='CumGrowth_idx')
        df_temp.reset_index(inplace=True)
        df_temp['Scen'] = df_temp['Scen'] + 1 
        df_temp[idx] = idx
        df_temp = df_temp.rename(columns={'TIME':'Month'})
        
        df_temp["Monthly_Return"]= (df_temp["CumGrowth_idx"].div(df_temp["CumGrowth_idx"].shift(1)) - 1).shift(-1)
        df_temp["Monthly_Return"] = np.where(df_temp['Month'] == 600,0, df_temp['Monthly_Return'] )
N27
  • 31
  • 5