2

I am trying to get similar output to df.index.day but with the number corresponding to the business day. So for 2020-04-06 I want 4 instead of 6.

I appreciate the help.

Solution:

for i in range(len(df_21)):
    if ((i == 0) | (df_21.month[i-1] != df_21.month[i])):
        df_21.day[i] = 1
    elif ((df_21.day[i] - df_21.day[i-1]) != 1):
        df_21.day[i] = df_21.day[i-1] + 1
dejanualex
  • 3,872
  • 6
  • 22
  • 37
Quantoisseur
  • 131
  • 5
  • what about using [weekday](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DatetimeIndex.weekday.html)? – FObersteiner Apr 28 '20 at 15:28
  • I want a monthly running count of weekday. I know I could hack it together, just wanted to see if there was an easy way. – Quantoisseur Apr 28 '20 at 15:37
  • so to clarify: you want to have a column showing the cumulative business days up to each row? does your datetime index also include non-business days? if so, how do you want to handle those? – FObersteiner Apr 28 '20 at 15:42
  • 1
    Is this 4th of june or 6th of april? And if it is 4th of june, then why would you get `6` returned? 4th of june is a thursday – Erfan Apr 28 '20 at 15:42
  • @Erfan: the 6th of april is the 4th business day of april, if you sum up all business days up to and including that date – FObersteiner Apr 28 '20 at 15:47
  • April 6th. Yes, cumulative business days up to each row, reset each month. datetimeindex is only business days. – Quantoisseur Apr 28 '20 at 15:50
  • I just posted what I did which looks like it works. – Quantoisseur Apr 28 '20 at 15:56
  • related to [this question](https://stackoverflow.com/questions/46903848/find-the-business-days-between-two-columns-in-a-pandas-dataframe-which-contain). – Quang Hoang Apr 28 '20 at 16:04

1 Answers1

0

Assuming you have a datetime column like the following, we can use pandas.bdate_range to check which dates are business days, then use Series.cumsum to get the cumulative sum:

df = pd.DataFrame({'date': pd.date_range('2020-04-01', '2020-04-07')})

        date
0 2020-04-01
1 2020-04-02
2 2020-04-03
3 2020-04-04
4 2020-04-05
5 2020-04-06
6 2020-04-07

Solution

# get date range of business days
business_days = pd.bdate_range(df['date'].min(), df['date'].max())

# check which dates are business days
bdays_bool = df['date'].isin(business_days)

# get the cumulative sums of all business days per month
df['nth_business_day'] = bdays_bool.groupby([df['date'].dt.year, df['date'].dt.month]).cumsum()

        date  nth_business_day
0 2020-03-29               0.0
1 2020-03-30               1.0
2 2020-03-31               2.0
3 2020-04-01               1.0
4 2020-04-02               2.0
5 2020-04-03               3.0
6 2020-04-04               3.0
7 2020-04-05               3.0
8 2020-04-06               4.0
9 2020-04-07               5.0
Erfan
  • 40,971
  • 8
  • 66
  • 78