Shouldn't the example below include '2022-01-01'?

>>> import pandas as pd
>>> pd.date_range("2022-01-03", "2023-12-31", freq='MS')
DatetimeIndex(['2022-02-01', '2022-03-01', '2022-04-01', '2022-05-01',
               '2022-06-01', '2022-07-01', '2022-08-01', '2022-09-01',
               '2022-10-01', '2022-11-01', '2022-12-01', '2023-01-01',
               '2023-02-01', '2023-03-01', '2023-04-01', '2023-05-01',
               '2023-06-01', '2023-07-01', '2023-08-01', '2023-09-01',
               '2023-10-01', '2023-11-01', '2023-12-01'],
              dtype='datetime64[ns]', freq='MS')
user3327034
  • No, it gives the expected result. 2022-01-01 is not in the range [2022-01-03, 2023-12-31]. By that logic, why not 2021-12-01? – Corralien Mar 01 '23 at 09:15
  • Thanks for clarifying. How do we get the start of the month in that case? Inclusive of 2022-01-01, I meant. – user3327034 Mar 01 '23 at 09:16
  • @jezrael. Are you sure the answer matches the current problem? – Corralien Mar 01 '23 at 10:17
  • Not necessarily, but I was unblocked. My understanding was that `freq=MS` would give me the start of the month for whatever start_date is defined: even if start_date is 2022-01-03, it would say the start of the month is 2022-01-01. It's like a `START_OF_MONTH(date)` sort of function in SQL. But it looks like my understanding was wrong and/or pandas `MS` is erroneous or misleading. – user3327034 Mar 01 '23 at 17:07

1 Answer

How do we get the start of the month in that case, inclusive of 2022-01-01?

In this case, generate the month ends with freq='M' and shift each one back to the start of its month with pd.offsets.MonthBegin:

>>> pd.date_range("2022-01-03", "2023-12-31", freq='M') + pd.offsets.MonthBegin(-1)
DatetimeIndex(['2022-01-01', '2022-02-01', '2022-03-01', '2022-04-01',
               '2022-05-01', '2022-06-01', '2022-07-01', '2022-08-01',
               '2022-09-01', '2022-10-01', '2022-11-01', '2022-12-01',
               '2023-01-01', '2023-02-01', '2023-03-01', '2023-04-01',
               '2023-05-01', '2023-06-01', '2023-07-01', '2023-08-01',
               '2023-09-01', '2023-10-01', '2023-11-01', '2023-12-01'],
              dtype='datetime64[ns]', freq=None)
Corralien
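
A possible alternative, not part of the answer above: since `date_range` only returns dates inside [start, end], one option is to snap the start date to the first of its month before building the range. The sketch below assumes the start date carries no time-of-day component; the variable names (`start`, `end`, `month_start`, `idx`) are illustrative only. It also avoids the 'M' alias, which pandas 2.2 and later deprecate in favor of 'ME'.

```python
import pandas as pd

start = pd.Timestamp("2022-01-03")
end = pd.Timestamp("2023-12-31")

# Move the start date back to the first day of its month, so that
# 2022-01-01 falls inside the requested range.
month_start = start.replace(day=1)

# 'MS' now yields every month start from 2022-01-01 through 2023-12-01.
idx = pd.date_range(month_start, end, freq="MS")
print(idx)
```

Unlike the `freq='M'` plus offset approach, the result here keeps `freq='MS'` on the index, which may matter if later code relies on the index frequency.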