0

I have two lists:

a = ['a', 'b', 'c', 'd']
b = ['e', 'f', 'g', 'h']

I want to merge them into one list that contains the first element of list a, then the first element of list b, then the second element of list a, and so on, like this:

c = ['a', 'e', 'b', 'f', 'c', 'g', 'd', 'h']

What is the easiest way to do so, possibly without using loops?

ayhan
  • Because the question has been closed I won't be able to post my benchmark, but the fastest answer so far is `list(sum(zip(a, b), ()))` – BPL Aug 17 '16 at 12:30
  • @BPL Looking forward to your benchmarks, requesting to reopen the question – awesoon Aug 17 '16 at 12:35
  • @soon Thanks! sometimes admins should give a little bit more time, even if they are duplicated questions... You always can find some guys posting new interesting data adding value to the question ;) – BPL Aug 17 '16 at 12:36
  • @soon Ok, I've had the chance to post my little research ;) . Btw, where is the button to request reopening questions? I didn't know that was even possible (newbie on SO here) – BPL Aug 17 '16 at 12:45
  • @BPL You'll be able to cast close and reopen votes after reaching 3k rep. [Docs](http://stackoverflow.com/help/privileges/close-questions) – awesoon Aug 17 '16 at 12:46
  • @soon Thx! good to know, should have started posting on SO few years ago, only 1 month in the community :) – BPL Aug 17 '16 at 12:49

2 Answers

6

Just zip them into pairs and then flatten the list using itertools.chain.from_iterable:

In [1]: a=['a','b','c','d']

In [2]: b=['e','f','g','h']

In [3]: from itertools import chain

In [4]: chain.from_iterable(zip(a, b))
Out[4]: <itertools.chain at 0x7fbcf2335ef0>

In [5]: list(chain.from_iterable(zip(a, b)))
Out[5]: ['a', 'e', 'b', 'f', 'c', 'g', 'd', 'h']
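
For use outside an interactive session, the same idea can be wrapped in a small function. This is only a sketch: the name `interleave` is hypothetical and not part of the answer above. Note that `zip` stops at the shorter input, so any extra elements in the longer list are dropped.

```python
from itertools import chain


def interleave(a, b):
    # Hypothetical helper: pair the elements up with zip, then flatten the pairs.
    # zip stops at the shorter input, so trailing elements of the longer one are dropped.
    return list(chain.from_iterable(zip(a, b)))


a = ['a', 'b', 'c', 'd']
b = ['e', 'f', 'g', 'h']
c = interleave(a, b)
print(c)
```

If the lists can differ in length and the leftover elements matter, this sketch would need a different pairing step.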
awesoon
2

Here's an answer comparing some of the possible methods on two different datasets: one runs many repetitions over small lists, the other fewer repetitions over large lists:

import timeit
import random
from itertools import chain

def f1(a, b):
    return list(chain.from_iterable(zip(a, b)))


def f2(a, b):
    return list(sum(zip(a, b), ()))


def f3(a, b):
    result = []
    for (e1, e2) in zip(a, b):
        result += [e1, e2]

    return result


def f4(a, b):
    # Stop at the shorter list so both indexes stay in range.
    len_result = min(len(a), len(b))

    result = []
    i = 0
    while i < len_result:
        result.append(a[i])
        result.append(b[i])
        i += 1

    return result

# Small benchmark
N = 5000000
a_small = ['a', 'b', 'c', 'd']
b_small = ['e', 'f', 'g', 'h']
benchmark1 = [
    timeit.timeit(
        'f1(a_small, b_small)', setup='from __main__ import f1, a_small,b_small', number=N),
    timeit.timeit(
        'f2(a_small, b_small)', setup='from __main__ import f2, a_small,b_small', number=N),
    timeit.timeit(
        'f3(a_small, b_small)', setup='from __main__ import f3, a_small,b_small', number=N),
    timeit.timeit(
        'f4(a_small, b_small)', setup='from __main__ import f4, a_small,b_small', number=N)
]

for index, value in enumerate(benchmark1):
    print(" - Small sample with {0} elements -> f{1}={2}".format(len(a_small), index + 1, value))

# Large benchmark
N = 5000
K = 100000
P = 1000
a_large = random.sample(range(K), P)
b_large = random.sample(range(K), P)
benchmark2 = [
    timeit.timeit(
        'f1(a_large, b_large)', setup='from __main__ import f1, a_large,b_large', number=N),
    timeit.timeit(
        'f2(a_large, b_large)', setup='from __main__ import f2, a_large,b_large', number=N),
    timeit.timeit(
        'f3(a_large, b_large)', setup='from __main__ import f3, a_large,b_large', number=N),
    timeit.timeit(
        'f4(a_large, b_large)', setup='from __main__ import f4, a_large,b_large', number=N)
]

for index, value in enumerate(benchmark2):
    # Note: K is the range the values are sampled from; each list holds P elements.
    print(" - Large sample with {0} elements -> f{1}={2}".format(K, index + 1, value))
  • Small sample with 4 elements -> f1=7.50175959666
  • Small sample with 4 elements -> f2=5.52386084127
  • Small sample with 4 elements -> f3=7.12457549607
  • Small sample with 4 elements -> f4=7.24530968309
  • Large sample with 100000 elements -> f1=0.512278885906
  • Large sample with 100000 elements -> f2=28.0679210232
  • Large sample with 100000 elements -> f3=1.05977378475
  • Large sample with 100000 elements -> f4=1.17144886156

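Conclusion: f2 seems to be slightly faster when the lists are small and N is large. When the lists are large and N is small, f1 is the clear winner and f2 is by far the slowest: sum builds a new tuple at every step, so its cost grows quadratically with the list length. Note that the "large" lists hold P = 1000 elements each; the 100000 printed in the output is K, the range the values are sampled from.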

Specs: Python 2.7.11 (64-bit), N=5000000, on an i7 at 2.6 GHz
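
To see how the gap between the chain-based and the sum-based approach changes with list length, something like the following could be run. It is only a rough sketch: `interleave_chain`, `interleave_sum` and `time_both` are hypothetical names, the sizes and repeat count are arbitrary assumptions, and the timings will differ by machine and Python version.

```python
import timeit
from itertools import chain


def interleave_chain(a, b):
    # Same approach as f1: zip into pairs, then flatten them lazily.
    return list(chain.from_iterable(zip(a, b)))


def interleave_sum(a, b):
    # Same approach as f2: every addition copies the tuple accumulated so far.
    return list(sum(zip(a, b), ()))


def time_both(size, repeats=5):
    # Hypothetical helper: time both functions on two lists of `size` elements.
    a = list(range(size))
    b = list(range(size, 2 * size))
    # Both approaches must produce the same interleaved list.
    assert interleave_chain(a, b) == interleave_sum(a, b)
    t_chain = timeit.timeit(lambda: interleave_chain(a, b), number=repeats)
    t_sum = timeit.timeit(lambda: interleave_sum(a, b), number=repeats)
    return t_chain, t_sum


for size in (10, 500, 2000):
    t_chain, t_sum = time_both(size)
    print("size={0}: chain={1:.5f}s sum={2:.5f}s".format(size, t_chain, t_sum))
```

Passing a callable to `timeit.timeit` avoids the `from __main__ import ...` setup strings, at the cost of a small per-call overhead that affects both functions equally.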

BPL