I'm not a Python developer, but I have to fix some existing code.

In this code, a method (extract) is called with an interval of dates:

extract(start_date, end_date)

The parameters can have, for example, these values:

start_date : 2020-10-01
end_date   : 2022-01-03

The problem

The issue with this call is that the extract method only supports date intervals of at most one year. If the interval is longer, it must be split, for example as follows:

extract('2020-10-01', '2020-12-31')
extract('2021-01-01', '2021-12-31')
extract('2022-01-01', '2022-01-03')

So I'm trying to create a loop where the start_date and end_date are computed dynamically. But being new to Python, I have no idea for now how this can be done. Any help would be greatly appreciated.

EDIT

In answer to some comments: so far I have tried to find a solution starting from code like this:

from datetime import datetime
from dateutil import relativedelta

from_date = datetime.strptime('2020-10-01', "%Y-%m-%d")
end_date  = datetime.strptime('2022-01-03', "%Y-%m-%d")

# Get the interval between the two dates
diff = relativedelta.relativedelta(end_date, from_date)

Then I thought of iterating across the years using diff.years and adding some logic to build the start_date and end_date from there, but I suspect there is a much simpler approach.

I also looked at other possibilities, but I have not found a simple final result yet.

Hey StackExchange
  • What have you tried so far? Since your inputs are strings in a specific format, you can parse them and obtain the year values from there, then convert them to numbers to increment while calling the `extract` function. Show us something you tried and why it didn't work. The expected result and the current result are very helpful in these situations – MatBBastos Jan 07 '22 at 12:18
  • Use the datetime library of python to perform operations with dates – Fran Arenas Jan 07 '22 at 12:19
  • what do you need to be returned for the two dates in the example? – Jayvee Jan 07 '22 at 12:32
  • @Jayvee the extract method just returns a "returncode" (success=0 or failed=-1) after the processing. What the extract method does has no impact on the solution ;) – Hey StackExchange Jan 07 '22 at 12:34

3 Answers

As mentioned in the comments, you can either use the datetime library or use pandas if you prefer. The pandas version is the following (admittedly not the prettiest, but it does the job):

import pandas as pd
import datetime

start = datetime.datetime(2020,10,1)
end = datetime.datetime(2022,1,3)


def extract(from_dt, to_dt):
    print(f'Extracting from {from_dt} to {to_dt}')


prev_end = pd.to_datetime(start)
for next_end in pd.date_range(datetime.datetime(start.year, 12, 31), end, freq='y'):
    if next_end < end:
        extract(prev_end.strftime('%Y-%m-%d'), next_end.strftime('%Y-%m-%d'))
    else:
        extract(prev_end.strftime('%Y-%m-%d'), end.strftime('%Y-%m-%d'))
    prev_end = next_end + datetime.timedelta(days=1)
# Use <= so that a final one-day interval (e.g. an end date of January 1st) is not skipped
if prev_end <= end:
    extract(prev_end.strftime('%Y-%m-%d'), end.strftime('%Y-%m-%d'))

If you need to parse the original dates from strings, check out datetime.strptime
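
A minimal sketch of that parsing step, assuming the inputs arrive as 'YYYY-MM-DD' strings like the ones in the question (the variable names here are only illustrative):

```python
from datetime import datetime

# Assumed input format: zero-padded 'YYYY-MM-DD' strings, as in the question.
start = datetime.strptime('2020-10-01', '%Y-%m-%d')
end = datetime.strptime('2022-01-03', '%Y-%m-%d')

# The parsed values expose the year directly and can be formatted back to strings.
print(start.year, end.year)
print(end.strftime('%Y-%m-%d'))
```

The resulting start and end objects can be used in place of the hard-coded datetime.datetime(...) values above.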

C Hecht
from_str = '2020-10-01'
end_str = '2022-01-03'

from_year = int(from_str[:4])
end_year = int(end_str[:4])

if from_year != end_year:
    # from_date to end of first year
    extract(from_str, f"{from_year}-12-31")

    # full years
    for y in range(from_year + 1, end_year):
        extract(f"{y}-01-01", f"{y}-12-31")

    # rest
    extract(f"{end_year}-01-01", end_str)
else:
    extract(from_str, end_str)
Wups
  • Sounds simple and interesting. I am just guessing it will fail if the two dates are in consecutive years, e.g. with `from_str = '2021-10-01'` and `end_str = '2022-01-03'`: in the call `extract(f"{y}-01-01", f"{y}-12-31")` the end date would be 2022-12-31 instead of '2022-01-03'. But I got the idea. Many thanks! – Hey StackExchange Jan 07 '22 at 12:58
  • @HeyStackExchange you can write a dummy extract function like `def extract(s1, s2): print(s1, s2)` and test it yourself. It works for `from_str = '2021-10-01'` and `end_str = '2022-01-03'`. The full year loop isn't entered in this case. – Wups Jan 07 '22 at 13:02
  • Apologies if so. I will try it and come back with the results. Thank you! – Hey StackExchange Jan 07 '22 at 13:05
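
The check suggested in the comments can be sketched as follows. This wraps the same logic in a function so it can be run against a stand-in for extract; the names split_by_year and fake_extract are hypothetical and not part of the original code.

```python
def split_by_year(from_str, end_str, extract):
    """Call extract once per calendar-year chunk of [from_str, end_str]."""
    from_year = int(from_str[:4])
    end_year = int(end_str[:4])
    if from_year == end_year:
        extract(from_str, end_str)
        return
    # from_str to the end of the first year
    extract(from_str, f"{from_year}-12-31")
    # full years in between (empty when the years are consecutive)
    for y in range(from_year + 1, end_year):
        extract(f"{y}-01-01", f"{y}-12-31")
    # start of the last year to end_str
    extract(f"{end_year}-01-01", end_str)


calls = []

def fake_extract(s1, s2):
    # Stand-in for the real extract: just record the arguments.
    calls.append((s1, s2))

split_by_year('2021-10-01', '2022-01-03', fake_extract)
print(calls)
# [('2021-10-01', '2021-12-31'), ('2022-01-01', '2022-01-03')]
```

With consecutive years, range(from_year + 1, end_year) is empty, so only the first and last calls are made.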

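Problems of this kind lend themselves nicely to recursion:

from datetime import datetime

start_date = '2020-10-01'
end_date   = '2022-01-03'

def intervalcalc(datestart, dateend):
    # First day of the year in which the current interval ends
    newdate = dateend[:4] + '-01-01'
    startd = datetime.strptime(datestart, "%Y-%m-%d")
    endd = datetime.strptime(newdate, "%Y-%m-%d")
    # Base case: the start date falls in the same year as dateend.
    # <= (not <) so that a start date of January 1st does not recurse once too often.
    if endd <= startd:
        print(datestart, dateend)
        return True
    else:
        print(newdate, dateend)
        # Recurse on the part of the interval that ends with the previous year
        previousyear = str(int(newdate[:4]) - 1) + '-12-31'
        intervalcalc(datestart, previousyear)


intervalcalc(start_date, end_date)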

output:

2022-01-01 2022-01-03
2021-01-01 2021-12-31
2020-10-01 2020-12-31

You just need to replace the prints with calls to the extract function.

As mentioned by @Wups, the conversion to datetime is not really necessary; a plain string comparison works just as well, since the dates are in YYYY-MM-DD format.
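
A small check of that claim, assuming every date is a zero-padded 'YYYY-MM-DD' string with a four-digit year (the sample pairs are made up for illustration):

```python
from datetime import datetime

FMT = '%Y-%m-%d'

# Hypothetical sample pairs: earlier/later, later/earlier, and equal dates.
pairs = [
    ('2020-12-31', '2021-01-01'),
    ('2021-10-01', '2021-09-30'),
    ('2022-01-03', '2022-01-03'),
]

results = []
for a, b in pairs:
    as_strings = a < b
    as_dates = datetime.strptime(a, FMT) < datetime.strptime(b, FMT)
    results.append((as_strings, as_dates))
    print(a, b, as_strings, as_dates)
```

For each pair, the string comparison and the datetime comparison give the same answer, because the fields are ordered from most to least significant and are zero-padded.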

Also, this can be done the other way around: compute the start date's year + '-12-31', then compare it with the end date to determine the base case of the recursion.
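
A minimal sketch of that forward variant, using plain string comparison. The function name intervals_forward is hypothetical, and the sketch assumes zero-padded 'YYYY-MM-DD' inputs with the start date not after the end date. It returns the intervals instead of printing them, so each pair can then be passed to extract.

```python
def intervals_forward(date_start, date_end):
    """Return (start, end) pairs, oldest first, none crossing a year boundary."""
    year_end = date_start[:4] + '-12-31'
    if year_end >= date_end:
        # Base case: the rest of the range fits inside the start year.
        return [(date_start, date_end)]
    # Otherwise close the current year and recurse from January 1st of the next one.
    next_start = str(int(date_start[:4]) + 1) + '-01-01'
    return [(date_start, year_end)] + intervals_forward(next_start, date_end)


for s, e in intervals_forward('2020-10-01', '2022-01-03'):
    print(s, e)
# 2020-10-01 2020-12-31
# 2021-01-01 2021-12-31
# 2022-01-01 2022-01-03
```

Unlike the backward version, this one yields the intervals in chronological order.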

Jayvee