In Python, it is easy to break an n-long list into k-size chunks if n is a multiple of k (IOW, n % k == 0). Here's my favorite approach (straight from the docs):
>>> k = 3
>>> n = 5 * k
>>> x = range(n)
>>> zip(*[iter(x)] * k)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]
(The trick is that [iter(x)] * k produces a list of k references to the same iterator, as returned by iter(x). Then zip builds each chunk by taking one item from each of its k arguments in turn; since all k arguments are the same iterator, this advances it k times and yields k consecutive items. The * before [iter(x)] * k is necessary because zip expects to receive its arguments as "separate" iterators, rather than as a list of them.)
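To make the shared-iterator point concrete, here is a minimal sketch. It assumes Python 3, where zip returns a lazy iterator and therefore has to be wrapped in list(); the names it, refs, and independent are just illustrative.

```python
# Python 3 sketch: zip is lazy here, so wrap it in list() to see the chunks.
k = 3
x = range(5 * k)

it = iter(x)
refs = [it] * k  # k references to ONE iterator, not k separate iterators
same_object = all(r is it for r in refs)

# zip takes one item from each argument in turn. All arguments are the same
# iterator, so every output tuple consumes k consecutive items of x.
chunks = list(zip(*refs))

# For contrast: k independent iterators all start at the beginning of x,
# so zip pairs each item with itself instead of chunking.
independent = list(zip(*[iter(x) for _ in range(k)]))

print(same_object)
print(chunks)
print(independent[:2])
```

With independent iterators the chunking effect disappears, which is why the idiom relies on list multiplication rather than a list comprehension.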
The main shortcoming I see with this idiom is that, when n is not a multiple of k (IOW, n % k > 0), the leftover entries are just left out; e.g.:
>>> zip(*[iter(x)] * (k + 1))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11)]
There's an alternative idiom that is slightly longer to type, produces the same result as the one above when n % k == 0, and has a more acceptable behavior when n % k > 0:
>>> map(None, *[iter(x)] * k)
[(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11), (12, 13, 14)]
>>> map(None, *[iter(x)] * (k + 1))
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14, None)]
At least here the leftover entries are retained, but the last chunk gets padded with None. If one just wants a different value for the padding, then itertools.izip_longest (via its fillvalue argument) solves the problem.
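As a rough illustration of the padding variant, the sketch below uses a fill value of -1, which is an arbitrary choice for the example. It assumes the function is available as zip_longest on Python 3 and falls back to izip_longest on Python 2.

```python
try:
    from itertools import zip_longest  # Python 3 name
except ImportError:
    from itertools import izip_longest as zip_longest  # Python 2 name

k = 3
x = range(5 * k)

# Same shared-iterator idiom, but the short last chunk is padded with
# fillvalue instead of None (-1 is just an example value).
padded = list(zip_longest(*[iter(x)] * (k + 1), fillvalue=-1))
print(padded)
```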
But suppose the desired solution is one in which the last chunk is left unpadded, i.e.
[(0, 1, 2, 3), (4, 5, 6, 7), (8, 9, 10, 11), (12, 13, 14)]
Is there a simple way to modify the map(None, *[iter(x)]*k) idiom to produce this result?
(Granted, it is not difficult to solve this problem by writing a function (see, for example, the many fine replies to How do you split a list into evenly sized chunks? or What is the most "pythonic" way to iterate over a list in chunks?). Therefore, a more accurate title for this question would be "How to salvage the map(None, *[iter(x)]*k) idiom?", but I think it would baffle a lot of readers.)
I was struck by how easy it is to break a list into even-sized chunks, and how difficult (in comparison!) it is to get rid of the unwanted padding, even though the two problems seem of comparable complexity.
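For reference, one possible way to keep the shared-iterator idiom and still drop the padding is to pad with a unique sentinel object and strip it from the last chunk. This is only a sketch under the same Python 2/3 import assumption as above, and chunks_unpadded is a hypothetical helper name, not something from the standard library.

```python
try:
    from itertools import zip_longest  # Python 3 name
except ImportError:
    from itertools import izip_longest as zip_longest  # Python 2 name


def chunks_unpadded(iterable, size):
    """Yield tuples of up to `size` items; the last one may be shorter."""
    # A fresh object() cannot be equal to any real item by identity, so
    # genuine None values in the data are left alone.
    sentinel = object()
    for chunk in zip_longest(*[iter(iterable)] * size, fillvalue=sentinel):
        # Only the final chunk can contain padding, and padding always
        # sits at the end, so checking the last slot is enough.
        if chunk[-1] is sentinel:
            chunk = tuple(v for v in chunk if v is not sentinel)
        yield chunk


print(list(chunks_unpadded(range(15), 4)))
```

The extra filtering step is what makes this longer than the padded one-liner, which matches the asymmetry described above.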