0

I have a dataframe of the following type:

                                 dummy1  ret
Date       Index                            
1998-01-01 AAPL US                0.0  0.000
           AEX                    0.0  0.000
           AMZN US                0.0  0.000
           AS51                   0.0  0.000
           AS51 1Y 100 VOL BVOL   0.0    NaN
                               ...    ...
2016-12-29 W A Comdty             0.0    NaN
           WIV                    0.0  0.007
           WZ1 Comdty             0.0  0.005
           XLE US                 0.0 -0.002
           ZARUSD Curncy          0.0  0.018

The variable dummy1 is 1 on specific dates which are not always the same for the various identifiers contained in the Indexlayer of the Multiindex, and zero otherwise. My goal is to create a new dummy variable (or indicator variable) say dummy2 which is 1 on the previous business day when dummy1 is equal to 1. So, dummy2 must equal 1 the day before dummy1 equals 1. How can I do it? If I had to extract the list of dates in which dummy1 is equal to 1, using pd.offest.BusinessDays(n=1) could be an idea but my concern is that then I would lose track of the index of the dataframe, which is crucial because, as mentioned above, dummy1 is 1 not at the same time (i.e. date) for different values of Index.

FObersteiner
  • 22,500
  • 8
  • 42
  • 72
user9875321__
  • 195
  • 12
  • Something like https://stackoverflow.com/questions/52435070/how-to-lag-data-by-x-specific-days-on-a-multi-index-pandas-dataframe: Merge the DataFrame with itself after you add 1 Business Day to Date in the right Frame. – ALollz May 06 '20 at 15:37
  • @ALollz I am afraid it would not work. I am using a specific dummy variable here and what I need to make sure is tat it is 1 one day before that the old dummy variable was 1. I do not think it works the same as in the example – user9875321__ May 06 '20 at 15:55

1 Answers1

0

I actually found a very handy and elegant solution based on the shift method. If df is our dataframe:

df.reset_index(inplace=True)
df['dummy2']=df.groupby(['Index'])['dummy1'].shift(-1)
df.set_index(['Date', 'Index'], inplace=True)

In this way I can simply shift one day before the date of the dummy1 keeping track of the differences in the Index layer of the Multiindex

user9875321__
  • 195
  • 12