
I saw this and this question, and I'd like to achieve the same effect, only done efficiently with itertools.izip.

From itertools.izip's documentation:

Like zip() except that it returns an iterator instead of a list

I need an iterator because I can't fit all the values in memory, so instead I'm using a generator and iterating over the values.

More specifically, I have a generator that yields three-value tuples, and instead of iterating over it directly, I'd like to feed three lists of values to three functions, where each list represents a single position in the tuple.

Of those three tuple values, only one has big items in it, memory-consumption-wise (let's call it data), while the other two contain values that require only a small amount of memory to hold. So iterating over the data values' "list of values" first should work for me: I can consume the data values one by one and cache the small ones.

I can't think of a smart way to generate one "list of values" at a time, because I might occasionally decide to remove an instance of a three-value tuple, depending on the big value of that tuple.

Using the widely suggested zip solution, similar to:

>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)])
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]

results in the "unpacking argument list" part (*[...]) triggering a full iteration over the entire iterator and (I assume) caching all results in memory, which, as I said, is an issue for me.
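To make the problem concrete, here's a small demonstration (Python 3 syntax, with an instrumented stand-in generator) showing that star-unpacking exhausts the whole generator before zip ever runs:

```python
consumed = []

def gen():
    # Instrumented generator so we can observe when each tuple is pulled.
    for t in [('a', 1), ('b', 2), ('c', 3), ('d', 4)]:
        consumed.append(t)
        yield t

# The * unpacks gen() into an argument tuple, materializing everything
# up front -- even before a single element of the result is requested.
columns = zip(*gen())
print(len(consumed))   # 4 -- the entire generator was consumed already
```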

I can build a mask list (True/False for each small value to keep), but I'm looking for a cleaner, more Pythonic way. If all else fails, I'll do that.
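For completeness, the mask-list fallback I have in mind would look something like this (keep() and the generator contents are hypothetical placeholders; itertools.compress does the filtering):

```python
from itertools import compress

def keep(data):
    # Hypothetical predicate on the big value.
    return not data.startswith('drop')

def gen():
    # Stand-in for the real generator of (data, small1, small2) tuples.
    yield 'keep-1', 0, 1
    yield 'drop-2', 2, 3
    yield 'keep-3', 4, 5

mask, smalls1, smalls2 = [], [], []
for data, s1, s2 in gen():
    mask.append(keep(data))
    # ... process `data` here, one item at a time, then let it go ...
    smalls1.append(s1)
    smalls2.append(s2)

# Afterwards, filter the cached small values by the mask.
smalls1 = list(compress(smalls1, mask))
smalls2 = list(compress(smalls2, mask))
print(smalls1, smalls2)   # [0, 4] [1, 5]
```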

NirIzr
  • So the 'data' values might go into the third entry in the 3-tuple, and the shorter lists into the first and second, yes? Can you describe what is supposed to happen when those shorter lists are exhausted? Preferably give an explicit example using two very short lists and a slightly longer one. – DisappointedByUnaccountableMod Oct 20 '16 at 23:14
  • @barny The lists themselves all have the same length; I was talking about the size (i.e. memory requirements) of individual values in the lists. Data values are too big to hold in memory together. – NirIzr Oct 20 '16 at 23:19
  • Phrasing was indeed unclear, edited. – NirIzr Oct 20 '16 at 23:21
  • so you want to [filter](https://docs.python.org/3/library/functions.html#filter) your data? – Copperfield Oct 20 '16 at 23:42
  • No, I cannot use a filter because I also need to actually process `data` (which must be an iterator). I can obviously iterate twice but one iteration takes about an hour now and I'll soon double the number of items. – NirIzr Oct 21 '16 at 01:42

1 Answer


What's wrong with a traditional loop?

>>> def gen():
...     yield 'first', 0, 1
...     yield 'second', 2, 3
...     yield 'third', 4, 5
...
>>> numbers = []
>>> for data, num1, num2 in gen():
...     print data
...     numbers.append((num1, num2))
...
first
second
third
>>> numbers
[(0, 1), (2, 3), (4, 5)]
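If the big values need to stay lazy, the same loop can be wrapped in a generator that yields each data item one at a time while stashing the small values on the side (Python 3 syntax; the names here are illustrative, not from the question):

```python
def gen():
    # Stand-in for the asker's real generator.
    yield 'first', 0, 1
    yield 'second', 2, 3
    yield 'third', 4, 5

def data_stream(source, cache1, cache2):
    # Yield big values one at a time; append the small ones as we go.
    for data, num1, num2 in source:
        cache1.append(num1)
        cache2.append(num2)
        yield data

nums1, nums2 = [], []
for data in data_stream(gen(), nums1, nums2):
    print(data)          # handle each big value, then drop it
print(nums1, nums2)      # [0, 2, 4] [1, 3, 5]
```

Only one big value is alive at a time, and after the loop the two small-value lists are ready to hand to the other two functions.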
TigerhawkT3