I have a df like this:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(100, 2),columns=['A', 'B'])
df['Date'] = pd.date_range("1/1/2000", periods=100)
df.set_index('Date', inplace = True)

I want to resample it by week and take the last value of each week, but this statement returns weekly bins that still cover all seven days of the week:

>>> df.resample('W').last()
                   A         B
Date                          
2000-01-02 -0.233055  0.215712
2000-01-09 -0.031690 -1.194929
2000-01-16 -1.441544 -0.206924
2000-01-23 -0.225403 -0.058323
2000-01-30 -1.564966 -1.409034
2000-02-06  0.800451 -0.730578
2000-02-13 -0.265631 -0.161049
2000-02-20  0.252658 -0.458502
2000-02-27  1.982499  3.208221
2000-03-05 -0.391827  0.927733
2000-03-12 -0.723863 -0.076955
2000-03-19 -1.379905  0.259892
2000-03-26 -0.983180  1.734662
2000-04-02  0.139668 -0.834987
2000-04-09  0.854117 -0.421875

I only want results based on a 5-day week (excluding Saturday and Sunday); that is, each returned interval should cover 5 days rather than 7. Can pandas do this, or do I have to resort to a third-party calendar library?

  • Try having a look at [this](https://stackoverflow.com/questions/44770839/resampling-a-pandas-dataframe-by-business-days-gives-bad-results) answer. Apparently you could try with: `df = df[df['date_time'].apply(lambda x: x.weekday() not in [5,6])]` – Yolao_21 Aug 17 '22 at 11:52
  • take a look at this [post](https://stackoverflow.com/questions/45281297/group-by-week-in-pandas). The answers have situations similar to yours – 99_m4n Aug 17 '22 at 11:58
  • @Yolao_21 Thanks for your comment, it works for my dataframe, but when I resample the results I get are still 7 day intervals. – skywave1980 Aug 17 '22 at 12:06
  • @99_m4n Thank you, I haven't solved the problem yet, but I found a dt.isocalendar() on the post you provided, I'll try it out to see if it does what I want. – skywave1980 Aug 17 '22 at 12:09
  • Sorry, I understand now. The resample interval itself cannot be changed, but as long as my df only contains the 5 weekdays, there is no problem. That's enough for me, thank you both. – skywave1980 Aug 17 '22 at 12:17

1 Answer

To select only the weekdays you can use:

df = df[df.index.weekday.isin(list(range(5)))]

This keeps only the Monday-to-Friday rows of your DataFrame. The subsequent steps can stay the same.
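As a minimal sketch of how the filter and the resample fit together (the fixed seed and the names `weekdays` and `weekly_last` are assumptions added for illustration, not part of the question), the weekday-only frame can be resampled exactly as before. The weekly bins are still labelled by their Sunday end date, but the last row in each bin is now a Friday:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame; the seed is an assumption, used only for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 2)), columns=["A", "B"])
df["Date"] = pd.date_range("2000-01-01", periods=100)
df = df.set_index("Date")

# Keep Monday (0) through Friday (4) only.
weekdays = df[df.index.weekday < 5]

# Weekly bins still end on Sunday, but each one now holds at most five rows,
# so last() returns the value from the final weekday in that bin.
weekly_last = weekdays.resample("W").last()
print(weekly_last.head())
```

The index of `weekly_last` still steps in 7-day increments, which matches what the asker observed in the comments: the bin width is unchanged, only the rows inside each bin differ.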

Comment: calling resample('W') will recreate the missing index entries, because each weekly bin is labelled with its Sunday end date. I believe you want to do something else.
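If the goal is for the result's index to fall on a weekday, one possible alternative (an assumption about what is wanted, not something the question confirms) is the anchored weekly alias `'W-FRI'`, which ends each bin on Friday and labels it with that date. The seed and the names `weekdays` and `weekly_fri` below are illustrative:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame; the seed is an assumption, used only for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 2)), columns=["A", "B"])
df["Date"] = pd.date_range("2000-01-01", periods=100)
df = df.set_index("Date")

# Keep Monday (0) through Friday (4) only.
weekdays = df[df.index.weekday < 5]

# 'W-FRI' is a weekly frequency anchored on Friday: every bin ends on a Friday
# and is labelled with that Friday's date instead of the following Sunday.
weekly_fri = weekdays.resample("W-FRI").last()
print(weekly_fri.index.day_name().unique())
```

The labels are still 7 days apart, but they now coincide with the date of the row that `last()` actually returned.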

mosc9575