10

In Pandas, I know you can use anchor offsets to specify more complicated reucrrences: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#anchored-offset

I want to specify a date_range such that it is monthly on the nth day of each month. What is the best syntax to do that with? I'm imaginging something similar to this which specifies a recurrence every 2 weeks on Friday:

schedule = pd.date_range(start=START_STR, periods=26, freq="2W-FRI")
themaestro
  • 13,750
  • 20
  • 56
  • 75
  • Even if there is not a ready-made `pandas` function to do this, you should be able to write a simple helper function to do this `pd.date_range` and `pd.DatetimeIndex`. Take a look at [my sample script](https://www.dropbox.com/s/gt4t8232vupow79/monthly_date_range.py?dl=0). – Abdou Oct 14 '16 at 01:15

3 Answers3

11

IIUC you can do it this way:

In [18]: pd.DataFrame(pd.date_range('2016-01-01', periods=10, freq='MS') + pd.DateOffset(days=26), columns=['Date'])
Out[18]:
        Date
0 2016-01-27
1 2016-02-27
2 2016-03-27
3 2016-04-27
4 2016-05-27
5 2016-06-27
6 2016-07-27
7 2016-08-27
8 2016-09-27
9 2016-10-27

UPDATE: to account for different numbers of days of months and leap years:

def month_range(start, periods=12):
    rng = pd.date_range(pd.Timestamp(start)-pd.offsets.MonthBegin(),
                        periods=periods,
                        freq='MS')
    ret = (rng + pd.offsets.Day(pd.Timestamp(start).day-1)).to_series()
    ret.loc[ret.dt.month > rng.month] -= pd.offsets.MonthEnd(1)
    return pd.DatetimeIndex(ret)

Examples:

In [202]: month_range('2016-01-27', 12)
Out[202]:
DatetimeIndex(['2016-01-27', '2016-02-27', '2016-03-27', '2016-04-27', '2016-05-27', '2016-06-27', '2016-07-27', '2016-08-27',
               '2016-09-27', '2016-10-27', '2016-11-27', '2016-12-27'],
              dtype='datetime64[ns]', freq=None)

In [203]: month_range('2020-01-31', 12)
Out[203]:
DatetimeIndex(['2020-01-31', '2020-02-29', '2020-03-31', '2020-04-30', '2020-05-31', '2020-06-30', '2020-07-31', '2020-08-31',
               '2020-09-30', '2020-10-31', '2020-11-30', '2020-12-31'],
              dtype='datetime64[ns]', freq=None)

In [204]: month_range('2019-01-29', 12)
Out[204]:
DatetimeIndex(['2019-01-29', '2019-02-28', '2019-03-29', '2019-04-29', '2019-05-29', '2019-06-29', '2019-07-29', '2019-08-29',
               '2019-09-29', '2019-10-29', '2019-11-29', '2019-12-29'],
              dtype='datetime64[ns]', freq=None)
MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
3

No need to re-invent the wheel. Use DateOffset from pandas:

import pandas as pd
from pandas.tseries.offsets import DateOffset
from datetime import date

date1 = date(2019,1,29)
pd.date_range(date1, periods=12, freq=DateOffset(months=1))

Output:

DatetimeIndex(['2019-01-29', '2019-02-28', '2019-03-28', '2019-04-28',
               '2019-05-28', '2019-06-28', '2019-07-28', '2019-08-28',
               '2019-09-28', '2019-10-28', '2019-11-28', '2019-12-28'],
              dtype='datetime64[ns]', freq='<DateOffset: months=1>')
cpage
  • 119
  • 6
  • 27
1

Edit: The accepted answer did initially not account for different numbers of days of months and leap years. Here's still an alternative function to solve the problem:

import pandas as pd

def month_range_day(start=None, periods=None):
    start_date = pd.Timestamp(start).date()
    month_range = pd.date_range(start=start_date, periods=periods, freq='M')
    month_day = month_range.day.values
    month_day[start_date.day < month_day] = start_date.day
    return pd.to_datetime(month_range.year*10000+month_range.month*100+month_day, format='%Y%m%d')

Example 1:

start_date = '2020-01-31'
month_range_day(start=start_date, periods=12)

Output:

DatetimeIndex(['2020-01-31', '2020-02-29', '2020-03-31', '2020-04-30',
               '2020-05-31', '2020-06-30', '2020-07-31', '2020-08-31',
               '2020-09-30', '2020-10-31', '2020-11-30', '2020-12-31'],
              dtype='datetime64[ns]', freq=None) 

Example 2:

start_date = '2019-01-29'
month_range_day(start=start_date, periods=12)

Output:

DatetimeIndex(['2019-01-29', '2019-02-28', '2019-03-29', '2019-04-29',
               '2019-05-29', '2019-06-29', '2019-07-29', '2019-08-29',
               '2019-09-29', '2019-10-29', '2019-11-29', '2019-12-29'],
              dtype='datetime64[ns]', freq=None)
mloning
  • 825
  • 7
  • 18