I need to split the elements of the iterator

    tot = itertools.combinations(dict1.keys(), 2)

into 3 parts.
len(dict1) is 285056, so the total number of possible 2-combinations is about 40 billion.
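For reference, the exact count can be checked with math.comb (available since Python 3.8):

    import math

    math.comb(285056, 2)  # 40628319040, i.e. roughly 40.6 billion pairs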
My goal is to divide these ~40 billion combinations into 3 parts of ~13.5 billion elements each, so they can be processed in parallel on different processors. At the moment I naively iterate over all 40 billion and dump a pickle file every time I reach 13.5 billion elements, which is not efficient: each 13.5-billion-element pickle is 160 GB on disk (and much larger once loaded into memory).
Is there any way to iterate up to the 13.5-billionth element in one script, start directly from the next element in a second script, and so on, without iterating from the beginning as I do now? Below is the code I use to take a fixed number of elements from the combinations iterable (the file name in the loop is illustrative):
    import itertools
    import pickle

    def grouper(n, iterable):
        # yield successive n-sized tuples until the iterable is exhausted
        it = iter(iterable)
        while True:
            chunk = tuple(itertools.islice(it, n))
            if not chunk:
                return
            yield chunk

    for i, chunk in enumerate(grouper(13_500_000_000, tot)):  # 13.5 billion per chunk
        with open(f"chunk_{i}.pkl", "wb") as f:  # illustrative file name
            pickle.dump(chunk, f)
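Conceptually, what I want each of the three scripts to do is something like the sketch below, except that islice still walks the underlying iterator linearly up to start, which is exactly the cost I am trying to avoid (start/stop values shown for the second script; variable names illustrative):

    import itertools

    # second script: ideally this would jump straight to the 13.5-billionth pair
    start, stop = 13_500_000_000, 27_000_000_000
    my_part = itertools.islice(itertools.combinations(dict1.keys(), 2), start, stop)
    for pair in my_part:
        ...  # process this script's share of the pairs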