0

I would like to create a [function that returns] a pandas series of datetime values for the Nth calendar day of each month for the current year. An added wrinkle is I would also need it to be the previous business day if it happens to fall on the weekend. Bonus would be to check against known holidays as well.

For example, I'd like the output to look like this for the [business day prior to or equal to the] 14th day of the month

0    2021-01-14
1    2021-02-12
2    2021-03-12
3    2021-04-14
4    2021-05-14
5    2021-06-14
6    2021-07-14
7    2021-08-13
8    2021-09-14
9    2021-10-14
10   2021-11-12
11   2021-12-14

I've tried using pd.date_range() and pd.bdate_range() and did not get the desired results. Example:

pd.date_range("2021-01-14","2021-12-14", periods=12)

>> DatetimeIndex(['2021-01-14 00:00:00',
           '2021-02-13 08:43:38.181818182',
           '2021-03-15 17:27:16.363636364',
           '2021-04-15 02:10:54.545454546',
           '2021-05-15 10:54:32.727272728',
           '2021-06-14 19:38:10.909090910',
           '2021-07-15 04:21:49.090909092',
           '2021-08-14 13:05:27.272727272',
           '2021-09-13 21:49:05.454545456',
           '2021-10-14 06:32:43.636363640',
           '2021-11-13 15:16:21.818181820',
                     '2021-12-14 00:00:00'],
          dtype='datetime64[ns]', freq=None)>>

Additionally this requires knowing the first and last month days that would be the start and end. Analogous tests with pd.bdate_range() resulted mostly in errors.

Josh Knechtel
  • 775
  • 1
  • 4
  • 8

2 Answers2

1

You can use the months start and then add a timedelta to get it to the day you want. So for your example it would be:

pd.date_range(start=pd.Timestamp("2020-12-14"), periods=12, freq='MS') + pd.Timedelta(days=13)

Output:

DatetimeIndex(['2021-01-14', '2021-02-14', '2021-03-14', '2021-04-14',
               '2021-05-14', '2021-06-14', '2021-07-14', '2021-08-14',
               '2021-09-14', '2021-10-14', '2021-11-14', '2021-12-14'],
              dtype='datetime64[ns]', freq=None)

to move to the previous business day use (see: Pandas offset DatetimeIndex to next business if date is not a business day and Most recent previous business day in Python) :

(pd.date_range(start=pd.Timestamp("2021-06-04"), periods=12, freq='MS') + pd.Timedelta(days=4)).map(lambda x: x - pd.tseries.offsets.BDay())

output:

DatetimeIndex(['2021-07-02', '2021-08-05', '2021-09-03', '2021-10-04',
               '2021-11-04', '2021-12-03', '2022-01-06', '2022-02-04',
               '2022-03-04', '2022-04-04', '2022-05-05', '2022-06-03'],
              dtype='datetime64[ns]', freq=None)
n4321d
  • 1,059
  • 2
  • 12
  • 31
1

Similar approach to Pandas Date Range Monthly on Specific Day of Month but subtract a Bday to get the previous buisness day. Also start at 12/31 of the previous year to get all values for the current year:

def get_date_range(day_of_month, year=pd.Timestamp.now().year):
    return (
            pd.date_range(start=pd.Timestamp(year=year - 1, month=12, day=31),
                          periods=12, freq='MS') +
            pd.Timedelta(days=day_of_month) -
            pd.tseries.offsets.BDay()
    )

Usage for year:

get_date_range(14)
DatetimeIndex(['2021-01-14', '2021-02-12', '2021-03-12', '2021-04-14',
               '2021-05-14', '2021-06-14', '2021-07-14', '2021-08-13',
               '2021-09-14', '2021-10-14', '2021-11-12', '2021-12-14'],
              dtype='datetime64[ns]', freq=None)

Or for another year:

get_date_range(14, 2020)
DatetimeIndex(['2020-01-14', '2020-02-14', '2020-03-13', '2020-04-14',
               '2020-05-14', '2020-06-12', '2020-07-14', '2020-08-14',
               '2020-09-14', '2020-10-14', '2020-11-13', '2020-12-14'],
              dtype='datetime64[ns]', freq=None)

With Holidays (this is non-vectorized so it will raise a PerformanceWarning):

import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())


def get_date_range(day_of_month, year=pd.Timestamp.now().year):
    return (
            pd.date_range(start=pd.Timestamp(year=year - 1, month=12, day=31),
                          periods=12, freq='MS') +
            pd.Timedelta(days=day_of_month) -
            bday_us
    )
get_date_range(25)
DatetimeIndex(['2021-01-25', '2021-02-25', '2021-03-25', '2021-04-23',
               '2021-05-25', '2021-06-25', '2021-07-23', '2021-08-25',
               '2021-09-24', '2021-10-25', '2021-11-24', '2021-12-23'],
              dtype='datetime64[ns]', freq=None)
Henry Ecker
  • 34,399
  • 18
  • 41
  • 57