I have a function which resembles:
    def long_running_with_more_values(start, stop):
        headers = get_headers.delay(start, stop)
        insert_to_db.delay(headers)
This function batch-processes data that is requested from the net in parallel. get_headers and insert_to_db both fire tasks onto the message queue, which are eventually processed by Celery workers, so neither call blocks execution.
It has to process every number between start and stop, but can split this up into sections (ranges).
I've found that get_headers performs best when the range is ~20000, where range = (stop - start).
I want to know how I can split an arbitrary range into chunks of 20000 and run each chunk through the function, so that the function is called multiple times with different start and stop values while the chunks together still cover the original range.
So, for start and stop values of 1 and 100000 respectively, I'd expect get_headers to be called 5 times with the following:
[1,20000][20001,40000][40001,60000][60001,80000][80001,100000]
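One way this could be sketched (my assumption, using a hypothetical chunked_ranges generator and inclusive endpoints, matching the pairs above):

    def chunked_ranges(start, stop, size=20000):
        # Hypothetical helper: yield inclusive (start, stop) pairs
        # covering [start, stop] in blocks of at most `size` numbers.
        while start <= stop:
            yield start, min(start + size - 1, stop)
            start += size

    # Each pair could then be handed to the Celery tasks, e.g.:
    # for s, e in chunked_ranges(1, 100000):
    #     insert_to_db.delay(get_headers.delay(s, e))

    print(list(chunked_ranges(1, 100000)))
    # [(1, 20000), (20001, 40000), (40001, 60000), (60001, 80000), (81001... ) -- see assertion below]

The last chunk is clamped to stop, so a range that isn't an exact multiple of the chunk size still gets fully covered by a shorter final chunk.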