The itertools recipes have a general-purpose function, called grouper, that does exactly what you're looking for with any kind of iterable:
>>> values = ['r', 'o', 'c', 'o', 'c', 'o']
>>> groups = grouper(values, 3)
However, this returns an iterator. If you want a list, you have to ask for one explicitly:
>>> groups = list(grouper(values, 3))
>>> print(groups)
[('r', 'o', 'c'), ('o', 'c', 'o')]
Also, note that this gives you a list of tuples, not a list of lists. Most likely this doesn't actually matter to you. But if it does, you'll have to convert them:
>>> list_groups = [list(group) for group in grouper(values, 3)]
>>> print(list_groups)
[['r', 'o', 'c'], ['o', 'c', 'o']]
If you install more_itertools from PyPI, you can just use from more_itertools import grouper. Otherwise, you'll have to copy and paste it from the recipes into your code.
But either way, it's worth understanding how grouper works:
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n  # n references to the same iterator
    return zip_longest(*args, fillvalue=fillvalue)
First, it creates an iterator out of your iterable. (This is something that keeps track of its current position and returns values one by one as you call next on it, until you've reached the end.) Then it makes n references to that iterator. This is the tricky bit: you don't want n separate iterators to the same list, you want n references to the same iterator, so if you grab the next value out of the first iterator, they all move forward. That's why it does the funny [iter(iterable)] * n bit. Then it just zips the iterators together. So, the first pass through the zip calls next on the first iterator, then the second, then the third; the second pass through the zip again calls next on the first iterator, then the second, then the third; and so on.
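To see why the shared iterator matters, here is a small sketch comparing n references to one iterator against n independent iterators. The variable names (shared, separate, and so on) are made up for illustration and are not part of the recipe.

```python
values = ['r', 'o', 'c', 'o', 'c', 'o']

# n references to the SAME iterator: every next() call advances all of them,
# so zip pulls three consecutive items per output tuple.
shared = [iter(values)] * 3
shared_groups = list(zip(*shared))

# n SEPARATE iterators: each one starts at the beginning of the list,
# so zip just repeats each element three times.
separate = [iter(values) for _ in range(3)]
separate_groups = list(zip(*separate))

print(shared_groups)       # [('r', 'o', 'c'), ('o', 'c', 'o')]
print(separate_groups[0])  # ('r', 'r', 'r')
```

Only the shared version produces the chunks you want; the separate version yields one tuple per element of the original list.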
The reason it uses zip_longest instead of just zip (or, in Python 2.x, izip_longest vs. izip) is so list(grouper(['r', 'o', 'c', 'o'], 3)) will give you [('r', 'o', 'c'), ('o', None, None)] instead of just [('r', 'o', 'c')]. If that's not what you want, it's trivial to just use the other function instead.
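As a rough sketch of the difference, assuming Python 3, the two variants can be put side by side. The names grouper_fill and grouper_truncate are hypothetical, chosen here only to tell the two behaviors apart; neither name comes from the recipes.

```python
from itertools import zip_longest

def grouper_fill(iterable, n, fillvalue=None):
    # Same idea as the recipe: pad the last group with fillvalue.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

def grouper_truncate(iterable, n):
    # Plain zip stops as soon as the shared iterator runs out,
    # so an incomplete final group is silently dropped.
    args = [iter(iterable)] * n
    return zip(*args)

values = ['r', 'o', 'c', 'o']
print(list(grouper_fill(values, 3)))       # [('r', 'o', 'c'), ('o', None, None)]
print(list(grouper_fill(values, 3, 'x')))  # [('r', 'o', 'c'), ('o', 'x', 'x')]
print(list(grouper_truncate(values, 3)))   # [('r', 'o', 'c')]
```

Which one you want depends on whether a short final group should be padded or discarded.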
For further explanation, see this blog post.