I have some code that takes the Cartesian product of a list of lists of tuples, then converts each resulting tuple to a list and collects everything into a list for use by a subsequent function:
# Take the Cartesian product of a list of lists of tuples
groups = itertools.product(*list_of_lists_of_tuples)
# Mapping and casting to list is necessary to put in the correct format for a subsequent function
groups_list = list(map(list, groups))
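To make the shape of the data concrete, here is a tiny sketch with a made-up input (the values are placeholders, not my real data, and math.prod assumes Python 3.8+). The number of results is the product of the inner list lengths, which is why the full list grows so quickly:

```python
import itertools
import math

# Hypothetical stand-in for the real data
list_of_lists_of_tuples = [[(1, 2), (3, 4)], [(5, 6), (7, 8)]]

groups = itertools.product(*list_of_lists_of_tuples)
groups_list = list(map(list, groups))
print(groups_list)
# [[(1, 2), (5, 6)], [(1, 2), (7, 8)], [(3, 4), (5, 6)], [(3, 4), (7, 8)]]

# The result count is the product of the inner list lengths (2 * 2 here)
expected_count = math.prod(len(inner) for inner in list_of_lists_of_tuples)
print(expected_count == len(groups_list))  # True
```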
This all works fine on small inputs, but it leads to a memory error when dealing with massive list sizes. As far as I can tell, itertools.product already returns a lazy iterator, so the memory bottleneck appears to be the mapping and conversion to a list.
I was thinking that I might be able to get around this problem by splitting the iterator into chunks. The general question of how to split a Python iterator into chunks has been asked many times on this site, and there appear to be many good answers, including but not limited to:
What is the most "pythonic" way to iterate over a list in chunks?
Python generator that groups another iterable into groups of N
Iterate an iterator by chunks (of n) in Python?
...but I think there must be some embarrassing flaw in how I'm understanding iterables and generators to begin with, because I can't seem to get any of them to work. For example, assuming a grouper function similar to what's seen in some of those other threads:
def grouper(self, it, n):
    iterable = iter(it)
    while True:
        chunks = itertools.islice(iterable, n)
        try:
            first_chunk = next(chunks)
        except StopIteration:
            return
        yield itertools.chain((first_chunk,), chunks)
...I was expecting the result to be chunks of my itertools.product object, which I could then operate on independently:
groups = itertools.product(*list_of_lists_of_tuples)
# create chunks of the iterator that can be operated on separately and then combined back into a list
groups_list = []
for x in self.grouper(groups, 100):
    some_groups_list = list(map(list, x))
    groups_list.extend(some_groups_list)
I'm getting empty lists. Something's obviously wrong, and - again - I think the main problem here is a lack of understanding on my end. Any suggestions would be greatly appreciated.
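For anyone who wants to reproduce this, here is a minimal standalone sketch of the same pattern. It assumes a module-level grouper (no self), a small made-up input, and a chunk size of 4 instead of 100. In this stripped-down form the chunks are not empty, as long as each chunk is consumed before the next one is requested and the product iterator has not already been iterated somewhere else. A second pass over the same iterator does come back empty, so that may be related to what I'm seeing:

```python
import itertools

def grouper(it, n):
    """Yield successive lazy chunks of at most n items from it."""
    iterable = iter(it)
    while True:
        chunk = itertools.islice(iterable, n)
        try:
            first = next(chunk)
        except StopIteration:
            return  # source exhausted: end the generator
        yield itertools.chain((first,), chunk)

# Hypothetical small input standing in for list_of_lists_of_tuples
list_of_lists_of_tuples = [[(1, 2), (3, 4)], [(5, 6), (7, 8), (9, 10)]]

groups = itertools.product(*list_of_lists_of_tuples)
groups_list = []
for chunk in grouper(groups, 4):
    # Each chunk is fully consumed here before the next one is requested
    groups_list.extend(map(list, chunk))

print(len(groups_list))  # 6 (2 * 3 combinations)

# An iterator can only be consumed once: a second pass yields nothing
leftover = [list(c) for c in grouper(groups, 4)]
print(leftover)  # []
```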