Assume you have two pandas datetimes: from_date and to_date. I need a function that splits the range between them into folds of n months (let's say n=3). For example:

import pandas as pd

from_date = pd.to_datetime("2020-02-15")
to_date = pd.to_datetime("2020-05-20")

should be split into 2 folds:

{
"1": {"from_date": 2020-02-15, "to_date": 2020-05-15}, 
"2": {"from_date": 2020-05-16, "to_date": 2020-05-20}
}

Each fold needs to satisfy the condition: from_date + pd.DateOffset(months=3) >= to_date. So it is not about the number of days between the start and end dates.

What is the most pythonic way to do this? Is there something in pandas?

bk_

2 Answers

Replace the respective print statements according to the way you wish to use the 2 dates!

According to "How do I calculate the date six months from the current date using the datetime Python module?", dateutil.relativedelta can help handle months with and without a 31st day!

import pandas as pd
from dateutil.relativedelta import relativedelta

from_date = pd.to_datetime("2020-02-15")
to_date = pd.to_datetime("2020-05-20")
fold = 0
result = {}


while from_date + relativedelta(months=+3) < to_date:
    curfrom = from_date  # retain the current fold's start
    from_date = from_date + relativedelta(months=+3)
    result[fold] = {"from_date": curfrom, "to_date": from_date}
    fold = fold + 1
    from_date = from_date + relativedelta(days=+1)  # the next fold starts one day later
# Last (possibly shorter) fold: it starts at the current from_date, not at the previous curfrom
result[fold] = {"from_date": from_date, "to_date": to_date}

print(result)
user12758604
  • As far as I can see, `dateutil.relativedelta` yields the same results as `pd.DateOffset`. My question was whether there is a more pythonic way (without the while loop). In that respect your solution is similar to mine; for me, neither is optimal (and yours has an additional import). – bk_ Mar 12 '20 at 09:45
  • I can't think of an alternative that avoids some form of iteration: given only `from_date` and `to_date`, you have to iterate to break the range into pieces n months apart. If you prefer, you could first populate a list of dates, e.g. `from_date = ['2020-02-15', '2020-05-15']`, and then use something like `for x in from_date` instead of the while loop, but I do not think that is ideal! – user12758604 Mar 13 '20 at 08:24
My solution:

import pandas as pd

def check_and_split(from_date, to_date):
    from_date = pd.to_datetime(from_date)
    to_date = pd.to_datetime(to_date)

    done = False
    fold = 0
    result = {}

    start = from_date
    end = to_date

    while not done:
        # Last fold: a full 3-month fold would reach or pass to_date, so clip it
        if start + pd.DateOffset(months=3) >= to_date:
            done = True
            end = to_date
        else:
            end = start + pd.DateOffset(months=3)

        result[fold] = {"from_date": start, "to_date": end}

        if not done:
            start = end + pd.DateOffset(days=1)
            fold += 1

    return result

Isn't there a more pythonic way? Something in pandas maybe?

bk_
  • I have provided an alternative solution. Please let me know what it might be lacking so that I could improve on the answer. Cheers! – user12758604 Mar 11 '20 at 16:01
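
On the question of whether pandas can do the date stepping itself: one possible sketch, not taken from either answer, is to let pd.date_range generate the fold starts with a combined pd.DateOffset(months=n, days=1) and then clip each fold end at to_date. The function name split_into_folds and the parameter n_months are hypothetical, and the sketch assumes that date_range applies the offset step by step, so that each start is the previous start plus n months plus one day. Python still iterates once to build the dict; only the explicit while loop goes away.

```python
import pandas as pd


def split_into_folds(from_date, to_date, n_months=3):
    """Split [from_date, to_date] into folds of at most n_months (hypothetical helper)."""
    from_date = pd.to_datetime(from_date)
    to_date = pd.to_datetime(to_date)

    # Each fold starts one day after the previous fold's end,
    # i.e. n_months plus one day after the previous start.
    step = pd.DateOffset(months=n_months, days=1)
    starts = pd.date_range(from_date, to_date, freq=step)

    folds = {}
    for i, start in enumerate(starts, start=1):
        # A full fold ends n_months after its start; the last fold is clipped at to_date.
        end = min(start + pd.DateOffset(months=n_months), to_date)
        folds[i] = {"from_date": start, "to_date": end}
    return folds


print(split_into_folds("2020-02-15", "2020-05-20"))
```

For the dates in the question this should give the two folds asked for (2020-02-15 to 2020-05-15, then 2020-05-16 to 2020-05-20), with keys starting at 1 as in the expected output.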