5

Say I have two lists one longer than the other, x = [1,2,3,4,5,6,7,8] and y = [a,b,c] and I want to merge each element in y to every 3rd index in x so the resulting list z would look like: z = [1,2,a,3,4,b,5,6,c,7,8]

What would be the best way of going about this in python?

KingFu
  • 1,358
  • 5
  • 22
  • 42

8 Answers8

5

Here is an adapted version of the roundrobin recipe from the itertools documentation that should do what you want:

from itertools import cycle, islice

def merge(a, b, pos):
    "merge('ABCDEF', [1,2,3], 3) --> A B 1 C D 2 E F 3"
    iterables = [iter(a)]*(pos-1) + [iter(b)]
    pending = len(iterables)
    nexts = cycle(iter(it).next for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))

Example:

>>> list(merge(xrange(1, 9), 'abc', 3))   # note that this works for any iterable!
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]

Or here is how you could use roundrobin() as it is without any modifications:

>>> x = [1,2,3,4,5,6,7,8]
>>> y = ['a','b','c']
>>> list(roundrobin(*([iter(x)]*2 + [y])))
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]

Or an equivalent but slightly more readable version:

>>> xiter = iter(x)
>>> list(roundrobin(xiter, xiter, y))
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]

Note that both of these methods work with any iterable, not just sequences.

Here is the original roundrobin() implementation:

from itertools import cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    pending = len(iterables)
    nexts = cycle(iter(it).next for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))
Andrew Clark
  • 202,379
  • 35
  • 273
  • 306
3
>>> from itertools import chain
def solve(x,y):                                                             
    it = iter(y)
    for i in xrange(0, len(x), 2):
        try:
            yield x[i:i+2] + [next(it)]
        except StopIteration:    
            yield x[i:]
...

>>> x = [1,2,3,4,5,6,7,8]
>>> y = ['a','b','c']

>>> list(chain.from_iterable(solve(x,y)))
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504
3

This approach modifies x in place. Alternatively, you could make a copy of x and return the modified copy if you didn't want to change the original.

def merge(x, y, offset):
    for i, element in enumerate(y, 1):
        x.insert(i * offset - 1, element)

>>> x = [1,2,3,4,5,6,7,8]
>>> y = ['a','b','c']
>>> merge(x, y, 3)
>>> x
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]

All extra elements of y past the end of x just get appended to the end.

Matthew Jacobs
  • 4,344
  • 1
  • 20
  • 20
  • I like that this is simple and concise and I can understand it! I'm not able to comment on the performance of any of the solutions as that's above my understanding at the moment. – KingFu Jun 21 '13 at 19:07
  • 1
    [Insertion is more expensive than extension or appending](http://stackoverflow.com/questions/7776938/python-insert-vs-append). – 2rs2ts Jun 21 '13 at 19:08
  • @2rs2ts well that complicates things then! Hmm I might wait and let more experienced people vote for the best(quickest?) solution before I select an answer then – KingFu Jun 21 '13 at 19:10
  • @2rs2ts Agreed, insertion is [O(n)](http://wiki.python.org/moin/TimeComplexity). For small data sets, my approach works but I would not advise it for large sets. – Matthew Jacobs Jun 21 '13 at 19:12
  • @MatthewJacobs what would you consider a large data set? – KingFu Jun 21 '13 at 19:13
  • 1
    I guess that really depends on where this is used. If it's in the middle of loop that called over and over again, you'd want just about the fastest method you could. If, however, it's called once (perhaps occasionally) raw speed may not be tremendously important. To put a number to it, maybe a few hundred. However, I consider where it's used more important. – Matthew Jacobs Jun 21 '13 at 19:23
3

Here's another way:

x = range(1, 9)
y = list('abc')

from itertools import count, izip
from operator import itemgetter
from heapq import merge

print map(itemgetter(1), merge(enumerate(x), izip(count(1, 2), y)))
# [1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]

This keeps it all lazy before building the new list, and lets merge naturally merge the sequences... kind of a decorate/undecorate... It does require Python 2.7 for count to have a step argument though.

So, to walk it through a bit:

a = list(enumerate(x))
# [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8)]
b = zip(count(1, 2), y)
# [(1, 'a'), (3, 'b'), (5, 'c')]
print list(merge(a, b))
# [(0, 1), (1, 2), (1, 'a'), (2, 3), (3, 4), (3, 'b'), (4, 5), (5, 6), (5, 'c'), (6, 7), (7, 8)]

Then the itemgetter(1) just takes the actual value removing the index...

Jon Clements
  • 138,671
  • 33
  • 247
  • 280
0

The above solutions are really cool. Here's an alternative that doesn't involve roundrobin or itertools.

def merge(x, y):
    result = []
    while y:
        for i in range(0, 2): result.append(x.pop(0))
        for i in range(0, 1): result.append(y.pop(0))
    result.extend(x)
    return result

where 2 and 1 are arbitrary and list y is assumed to be shorter than list x.

Eli Rose
  • 6,788
  • 8
  • 35
  • 55
  • This is similar to my own implementation except I used one loop and a conditional. Although I suspect yours might be better performance wise – KingFu Jun 21 '13 at 19:04
0
sep, lst = 2, []
for i in range(len(y)+1):
    lst += x[i*sep:(i+1)*sep] + y[i:i+1]

Where sep is the number of elements of x before an element of y is inserted.

Performance:

>>> timeit.timeit(stmt="for i in range(len(y)+1): lst += x[i*sep:(i+1)*sep] + y[i:i+1]", setup="lst = [];x = [1,2,3,4,5,6,7,8];y = ['a','b','c'];sep = 2", number=1000000)
2.8565280437469482

Pretty damn good. I wasn't able to get the stmt to begin with let = [] so I think it kept appending to lst (unless I misunderstand timeit), but still... pretty good for a million times.

2rs2ts
  • 10,662
  • 10
  • 51
  • 95
0

Using itertools.izip_longest:

>>> from itertools import izip_longest, chain
>>> y = ['a','b','c']
>>> x = [1,2,3,4,5,6,7,8]   
>>> lis = (x[i:i+2] for i in xrange(0, len(x) ,2)) # generator expression
>>> list(chain.from_iterable([ (a + [b]) if b else a  
                                            for a, b in izip_longest(lis, y)]))
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504
0
def merge(xs, ys):
    ys = iter(ys)
    for i, x in enumerate(xs, 1):
        yield x
        if i % 2 == 0:
            yield next(ys)

''.join(merge('12345678', 'abc')) # => '12a34b56c78'
falsetru
  • 357,413
  • 63
  • 732
  • 636