0

currently trying to figure this out with code like this..but not quite yet able to get it:

df[(df.index.day_of_week==0) & (df.index.day<15) & (df.shift(-4).index.day_of_week==4)]

this is what the data looks like. (i've added the day_of_week column for convenience). basically, i am trying to find the first day_of_week=0 (monday) in the month, then filter for the first day_of_week=4 after that (friday)

                  close  day_of_week
date                                
2022-07-01  3825.330078            4
2022-07-05  3831.389893            1
2022-07-06  3845.080078            2
2022-07-07  3902.620117            3
2022-07-08  3899.379883            4
2022-07-11  3854.429932            0
2022-07-12  3818.800049            1
2022-07-13  3801.780029            2
2022-07-14  3790.379883            3
2022-07-15  3863.159912            4
...
2022-08-01  4118.629883            0
2022-08-02  4091.189941            1
2022-08-03  4155.169922            2
2022-08-04  4151.939941            3
2022-08-05  4145.189941            4
2022-08-08  4140.060059            0
2022-08-09  4122.470215            1
2022-08-10  4210.240234            2
2022-08-11  4207.270020            3
2022-08-12  4280.149902            4
...
2022-09-01  3966.850098            3
2022-09-02  3924.260010            4
2022-09-06  3908.189941            1
2022-09-07  3979.870117            2
2022-09-08  4006.179932            3
2022-09-09  4067.360107            4
2022-09-12  4110.410156            0
2022-09-13  3932.689941            1
2022-09-14  3946.010010            2
2022-09-15  3901.350098            3
2022-09-16  3873.330078            4
...
2022-10-03  3678.429932            0
2022-10-04  3790.929932            1
2022-10-05  3783.280029            2
2022-10-06  3744.520020            3
2022-10-07  3639.659912            4
2022-10-10  3612.389893            0
2022-10-11  3588.840088            1
2022-10-12  3577.030029            2
...
2022-11-01  3856.100098            1
2022-11-02  3759.689941            2
2022-11-03  3719.889893            3
2022-11-04  3770.550049            4
2022-11-07  3806.800049            0
2022-11-08  3828.110107            1

This should return:

2022-07-15  3863.159912            4
2022-08-05  4145.189941            4
2022-09-16  3873.330078            4
2022-10-07  3639.659912            4

EDIT: while i dont expect this to work, curious as to why this returns no results? is shifting not supported when filtering in this manner

df[(df.index.day_of_week==0) & (df.index.day<15) & (df.shift(-4).index.day_of_week==4)]
mike01010
  • 5,226
  • 6
  • 44
  • 77
  • 1
    It'd help to post some example data and the output. See [How to make good reproducible pandas examples](/q/20109391/4518341) – wjandrea Nov 08 '22 at 22:01
  • i updated the example with what the data looks like and what the expected output should be. still working through it – mike01010 Nov 08 '22 at 22:58

2 Answers2

1

You can group by [year, month, day_of_week] and do a cumcount to assign to each row the number of times its day_of_week has appeared in this month.

Then, grab the rows corresponding to the first monday of the month using the filter day_of_week == 0 & cumcount == 0 and shift their index by 4 days to get the following Fridays. Finally, we do an intersection to filter out shifted indexes that do not exist in the original frame.

wanted_indices = df.index[df.groupby([df.index.year, df.index.month, df['dow']]).cumcount().eq(0) & df['dow'].eq(0)].shift(4, 'D')
df.loc[wanted_indices.intersection(df.index)]

Result on your example dataframe

            close       dow
date        
2022-07-15  3863.159912 4
2022-08-05  4145.189941 4
2022-09-16  3873.330078 4
2022-10-07  3639.659912 4
bui
  • 1,576
  • 1
  • 7
  • 10
0

I feel like the simplest way to do that is to recognize that the Friday after the first Monday of each month must be no less than day 5 (Monday must be at least day 1). So, just first filter by that requirement and then filter by day_of_week.

df_filtered = df[df.index.dt.day >= 5]
first_fridays = df[df['day_of_week'] == 5]
supersquires
  • 141
  • 3
  • i updated the example with what the data looks like and what the expected output should be. still working through it – mike01010 Nov 08 '22 at 22:58