122

Often enough, I've found the need to process a list by pairs. I was wondering which would be the pythonic and efficient way to do it, and found this on Google:

pairs = zip(t[::2], t[1::2])

I thought that was pythonic enough, but after a recent discussion involving idioms versus efficiency, I decided to do some tests:

import time
from itertools import islice, izip

def pairs_1(t):
    return zip(t[::2], t[1::2]) 

def pairs_2(t):
    return izip(t[::2], t[1::2]) 

def pairs_3(t):
    return izip(islice(t,None,None,2), islice(t,1,None,2))

A = range(10000)
B = xrange(len(A))

def pairs_4(t):
    # ignore value of t!
    t = B
    return izip(islice(t,None,None,2), islice(t,1,None,2))

for f in pairs_1, pairs_2, pairs_3, pairs_4:
    # time the pairing
    s = time.time()
    for i in range(1000):
        p = f(A)
    t1 = time.time() - s

    # time using the pairs
    s = time.time()
    for i in range(1000):
        p = f(A)
        for a, b in p:
            pass
    t2 = time.time() - s
    print t1, t2, t2-t1

These were the results on my computer:

1.48668909073 2.63187503815 1.14518594742
0.105381965637 1.35109519958 1.24571323395
0.00257992744446 1.46182489395 1.45924496651
0.00251388549805 1.70076990128 1.69825601578

If I'm interpreting them correctly, that should mean that the implementation of lists, list indexing, and list slicing in Python is very efficient. It's a result both comforting and unexpected.

Is there another, "better" way of traversing a list in pairs?

Note that if the list has an odd number of elements then the last one will not be in any of the pairs.

Which would be the right way to ensure that all elements are included?

I added these two suggestions from the answers to the tests:

def pairwise(t):
    it = iter(t)
    return izip(it, it)

def chunkwise(t, size=2):
    it = iter(t)
    return izip(*[it]*size)

These are the results:

0.00159502029419 1.25745987892 1.25586485863
0.00222492218018 1.23795199394 1.23572707176

Results so far

Most pythonic and very efficient:

pairs = izip(t[::2], t[1::2])

Most efficient and very pythonic:

pairs = izip(*[iter(t)]*2)

It took me a moment to grok that the first answer uses two iterators while the second uses a single one.

To deal with sequences with an odd number of elements, the suggestion has been to augment the original sequence adding one element (None) that gets paired with the previous last element, something that can be achieved with itertools.izip_longest().

Finally

Note that, in Python 3.x, zip() behaves as itertools.izip(), and itertools.izip() is gone.

Community
  • 1
  • 1
Apalala
  • 9,017
  • 3
  • 30
  • 48
  • RE: the "right way" -- there isn't a "right" way! It depends on the use case. – Andrew Jaffe Jan 07 '11 at 17:45
  • @Andrew Jaffe I gave the criteria for "best" in this case: efficient, and pythonic. – Apalala Jan 07 '11 at 18:51
  • @Apalala: I mean that the *outcome* of having an odd number depends on the use. For example: you could just leave off the last element, or add a specific known dummy element, or duplicate the last one – Andrew Jaffe Jan 07 '11 at 19:18
  • @Andrew Jaffe. The _outcome_ is not part of the question. The last/odd element can be paired, or included alone (in a one-tuple, or something else). What matters is that inclusion of the odd element doesn't break the _efficient_ nor the _pythonic_ aspects of the solution (using `izip_longest()` is a good solution). Whatever comes later must know that it will have to deal with an odd number of cases if the original list had an odd number of elements. – Apalala Jan 07 '11 at 19:28
  • How come no one (including me) noticed that the code for the timings is incorrect? – Apalala Jan 08 '11 at 21:57
  • 2
    @Apalala: because you're using some mumbo-jumbo instead of the `timeit` module. – SilentGhost Jan 08 '11 at 22:38
  • @SilentGhost Indeed, I need to get a grip on `timeit`. – Apalala Jan 09 '11 at 00:36
  • For those arriving late, I fixed the mumbo-jumbo @SilentGhost referred to. – Apalala Jan 09 '11 at 00:51
  • 1
    n-duplicated: just in a quick search: http://stackoverflow.com/questions/4501636/, http://stackoverflow.com/questions/4170295/, http://stackoverflow.com/questions/434287/ – tokland Jan 10 '11 at 13:16
  • When t has an odd number of elements, `zip(t[::2], t[1::2])` leaves at the last. – Pyderman Feb 20 '16 at 03:50
  • @Pyderman The article does mention `itertools.izip_longest()`. – Apalala Feb 20 '16 at 09:47
  • The question posted for closing this one is older, and it doesn't feature the more ideomatic and more efficient solutions in this one. – Apalala Jul 08 '22 at 20:31

10 Answers10

64

My favorite way to do it:

def pairwise(t):
    it = iter(t)
    return zip(it,it)

# for "pairs" of any length
def chunkwise(t, size=2):
    it = iter(t)
    return zip(*[it]*size)

When you want to pair all elements you obviously might need a fillvalue:

from itertools import izip_longest
def blockwise(t, size=2, fillvalue=None):
    it = iter(t)
    return izip_longest(*[it]*size, fillvalue=fillvalue)

With Python 3, itertools.izip is now simply zip .. to work with an older Python, use

from itertools import izip as zip
ti7
  • 16,375
  • 6
  • 40
  • 68
Jochen Ritzel
  • 104,512
  • 31
  • 200
  • 194
  • The first (pairwise) function seems to be missing the cloning and advancing of the second iterator. See the `itertools` recipes section. – Apalala Jan 07 '11 at 18:40
  • @Apalala: zip does advance the same iterator twice. – Jochen Ritzel Jan 07 '11 at 18:55
  • Of course, you're correct, and pairwise is the most efficient so far, I don't know why. – Apalala Jan 07 '11 at 19:05
  • 1
    I love this solution: it's lazy, and it exploits the statefulness of iterators to great effect. You could even make it a one-liner, though perhaps at the expense of readability: `izip(*[iter(t)]*size)` – Channing Moore Dec 18 '14 at 16:56
  • for your second solution, wouldn't you want to avoid creating a list if going after performance? – max Jul 24 '16 at 23:26
  • @max Using `izip(*(it,)*size)` also works, but the intention seems less clear, and there shouldn't be much improvement in performance. – Apalala Apr 25 '17 at 14:29
  • @JochenRitzel Works great, but izip does not exist in Python 3 anymore, you may want to extend your answer using itertools.zip_longest() – Cerno Feb 19 '20 at 08:44
53

I'd say that your initial solution pairs = zip(t[::2], t[1::2]) is the best one because it is easiest to read (and in Python 3, zip automatically returns an iterator instead of a list).

To ensure that all elements are included, you could simply extend the list by None.

Then, if the list has an odd number of elements, the last pair will be (item, None).

>>> t = [1,2,3,4,5]
>>> t.append(None)
>>> zip(t[::2], t[1::2])
[(1, 2), (3, 4), (5, None)]
>>> t = [1,2,3,4,5,6]
>>> t.append(None)
>>> zip(t[::2], t[1::2])
[(1, 2), (3, 4), (5, 6)]
Tim Pietzcker
  • 328,213
  • 58
  • 503
  • 561
6

I start with small disclaimer - don't use the code below. It's not Pythonic at all, I wrote just for fun. It's similar to @THC4k pairwise function but it uses iter and lambda closures. It doesn't use itertools module and doesn't support fillvalue. I put it here because someone might find it interesting:

pairwise = lambda t: iter((lambda f: lambda: (f(), f()))(iter(t).next), None)
Tomasz Elendt
  • 1,475
  • 13
  • 14
4

As far as most pythonic goes, I'd say the recipes supplied in the python source docs (some of which look a lot like the answers that @JochenRitzel provided) is probably your best bet ;)

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

On modern python you just have to use zip_longest(*args, fillvalue=fillvalue) according to the corresponding doc page.

Pat
  • 16,515
  • 15
  • 95
  • 114
4
>>> my_list = [1,2,3,4,5,6,7,8,9,10]
>>> my_pairs = list()
>>> while(my_list):
...     a = my_list.pop(0); b = my_list.pop(0)
...     my_pairs.append((a,b))
... 
>>> print(my_pairs)
[(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)]
3

Is there another, "better" way of traversing a list in pairs?

I can't say for sure but I doubt it: Any other traversal would include more Python code which has to be interpreted. The built-in functions like zip() are written in C which is much faster.

Which would be the right way to ensure that all elements are included?

Check the length of the list and if it's odd (len(list) & 1 == 1), copy the list and append an item.

Aaron Digulla
  • 321,842
  • 108
  • 597
  • 820
3

Only do it:

>>> l = [1, 2, 3, 4, 5, 6]
>>> [(x,y) for x,y in zip(l[:-1], l[1:])]
[(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
1

Here is an example of creating pairs/legs by using a generator. Generators are free from stack limits

def pairwise(data):
    zip(data[::2], data[1::2])

Example:

print(list(pairwise(range(10))))

Output:

[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
Vlad Bezden
  • 83,883
  • 25
  • 248
  • 179
0

Just in case someone needs the answer algorithm-wise, here it is:

>>> def getPairs(list):
...     out = []
...     for i in range(len(list)-1):
...         a = list.pop(0)
...         for j in a:
...             out.append([a, j])
...     return b
>>> 
>>> k = [1, 2, 3, 4]
>>> l = getPairs(k)
>>> l
[[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]

But take note that your original list will also be reduced to its last element, because you used pop on it.

>>> k
[4]
0

This snippet worked for me. It creates pairs of tuples and adds empty string to the last pair, if the list length is odd (fillvalue="").

zip_longest(*[iter(my_list)] * 2, fillvalue="")

# odd
list(zip_longest(*[iter([0, 1, 2, 3, 4, 5, 6])] * 2, fillvalue=""))
[(0, 1), (2, 3), (4, 5), (6, '')]

# even
list(zip_longest(*[iter([0, 1, 2, 3, 4, 5])] * 2, fillvalue=""))
[(0, 1), (2, 3), (4, 5)]
SherylHohman
  • 16,580
  • 17
  • 88
  • 94
DannyG
  • 141
  • 1
  • 5