I have a generator that I want to iterate over at two levels. The first level splits it into groups of uneven sizes; within each group, I then want to chunk the items into groups of, say, 5. It needs to be memory efficient and work on generator inputs, so I'm doing something like the following. Is there a better way? In particular, I don't want the trailing `None`s that pad the last chunk when a group's length is not a multiple of 5.

import itertools

def dynamic_grouper(iterable, intervals):
    for i in intervals:
        inner_iter = list(itertools.islice(iterable, i)) # this is a "group"
        yield inner_iter

iterable = iter(xrange(100))
chunk_sizes = [22,30,38,10]

for i,group in enumerate(dynamic_grouper(iterable, chunk_sizes)):
    args = [iter(group)] * 5
    for item in itertools.izip_longest(fillvalue=None, *args):
        print "Group %i" % i
        print "Items %s" % list(item)
jseabold

1 Answer

To avoid the `None`s, you could use a `chunks` function, which slices each group into pieces instead of padding the last one:

def chunks(seq, n):
    # https://stackoverflow.com/a/312464/190597 (Ned Batchelder)
    """ Yield successive n-sized chunks from seq."""
    for i in xrange(0, len(seq), n):
        yield seq[i:i + n]

for i,group in enumerate(dynamic_grouper(iterable, chunk_sizes)):
    for item in chunks(group, 5):
        print "Group %i" % i
        print "Items %s" % list(item)
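If the groups themselves should not be materialized as lists, one possible variation is to chunk with `itertools.islice` at both levels. The sketch below is not part of the original answer: the names `iter_chunks` and `dynamic_chunks` are hypothetical, and it uses `range` without print statements so that it should run on both Python 2 and Python 3. At most one chunk of `n` items is held in memory at a time.

```python
import itertools

def iter_chunks(iterable, n):
    """Yield successive lists of at most n items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(itertools.islice(it, n))
        if not chunk:
            # The source is exhausted, so stop without padding.
            return
        yield chunk

def dynamic_chunks(iterable, intervals, n):
    """Yield (group_index, chunk) pairs; hypothetical helper for illustration."""
    it = iter(iterable)
    for group_index, size in enumerate(intervals):
        # Lazily take the next `size` items as one group.
        group = itertools.islice(it, size)
        for chunk in iter_chunks(group, n):
            yield group_index, chunk

result = list(dynamic_chunks(iter(range(100)), [22, 30, 38, 10], 5))
```

Here the last chunk of the first group is `[20, 21]` rather than a list padded with `None`, and the second group starts cleanly at 22.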
unutbu
  • What effect does the `len(l)` have on `l` if it is a generator? Won't that force a full read of `l`, e.g. `chunks(file.readlines(), 2)`? – Harvey Jan 05 '14 at 02:46
  • `l` is a list, not a generator. Applying `len` to a generator would raise a TypeError. Note that `dynamic_grouper` is yielding lists, so above, `group` is a list. So calling `chunks(group, 5)` is okay. – unutbu Jan 05 '14 at 02:51