My problem:

I'm trying to merge two dictionaries of lists into a new dictionary, alternating the elements of the 2 original lists for each key to create the new list for that key.

So for example, if I have two dictionaries:

strings = {'S1' : ["string0", "string1", "string2"], 'S2' : ["string0", "string1"]}

Ns = {'S1' : ["N0", "N1"], 'S2' : ["N0"]}

I want to merge these two dictionaries so that the final dictionary will look like:

strings_and_Ns = {'S1': ["string0", "N0", "string1", "N1", "string2"], 'S2': ["string0", "N0", "string1"]}

or better yet, have the strings from the list joined together for every key, like:

strings_and_Ns = {'S1': ["string0N0string1N1string2"], 'S2': ["string0N0string1"]}

(I'm trying to connect together DNA sequence fragments.)

What I've tried so far:

zip

for S in Ns:
    newsequence = [zip(strings[S], Ns[S])]
    newsequence_joined = ''.join(str(newsequence))
    strings_and_Ns[S] = newsequence_joined

This does not join the sequences together into a single string, and the order of the strings is still incorrect.

Using a defaultdict

from collections import defaultdict
strings_and_Ns = defaultdict(list)

for S in (strings, Ns):
    for key, value in S.iteritems():
        strings_and_Ns[key].append(value)

The order of the strings for this is also incorrect...

Somehow moving along the lists for each key...

for S in strings: 
    list = strings[S]
    L = len(list)
    for i in range(L):
        strings_and_Ns[S] = strings_and_Ns[S] + strings[S][i] + strings[S][i]
5gon12eder
GGVan
  • How exactly do you want to interweave the lists if they have different lengths? There is more than one way I can think of. – 5gon12eder Oct 14 '14 at 23:27
  • The new list should always start with strings and end with strings. The len(Ns) is always len(strings)-1. – GGVan Oct 15 '14 at 16:58

5 Answers

strings_and_Ns = {}
for k,v in strings.items():
    pairs = zip(v, Ns[k] + ['']) # add empty to avoid need for zip_longest()
    flat = (item for sub in pairs for item in sub)
    strings_and_Ns[k] = ''.join(flat)

flat is built the same way as in the accepted answer to "Making a flat list out of list of lists in Python".
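A minimal Python 3 sketch of the same zip-and-pad idea, run against the sample data from the question. The helper name interleave_padded and its length check are illustrative additions, not part of the answer; the check encodes the asker's statement that Ns always has one fewer item than strings.

```python
from itertools import chain

strings = {'S1': ["string0", "string1", "string2"], 'S2': ["string0", "string1"]}
Ns = {'S1': ["N0", "N1"], 'S2': ["N0"]}

def interleave_padded(fragments, spacers):
    """Join fragments and spacers alternately (hypothetical helper).

    Assumes there is exactly one fewer spacer than fragments, so the
    result starts and ends with a fragment.
    """
    if len(spacers) != len(fragments) - 1:
        raise ValueError("expected exactly one fewer spacer than fragments")
    # Pad the spacers with '' so zip() does not drop the last fragment.
    pairs = zip(fragments, spacers + [''])
    # Flatten the (fragment, spacer) pairs and join them into one string.
    return ''.join(chain.from_iterable(pairs))

strings_and_Ns = {k: interleave_padded(v, Ns[k]) for k, v in strings.items()}
print(strings_and_Ns)
```

Without the padding, zip() would stop at the shorter list and silently drop the final fragment, which is why the length check may be worth keeping.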

John Zwinck

You could do it with itertools or with the list slicing described here. The itertools version is quite compact.

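import itertools

strings_and_Ns = {}
for skey, sval in strings.iteritems():
    iters = [iter(sval), iter(Ns[skey])]
    # Python 2 only: the StopIteration raised by it.next() ends the generator expression
    strings_and_Ns[skey] = ["".join(it.next() for it in itertools.cycle(iters))]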

Take care that the lengths of your lists correspond: as soon as one iterator raises StopIteration, the merging ends for that key.
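On Python 3.7 and later, a StopIteration that escapes inside a generator expression is turned into a RuntimeError (PEP 479), and it.next() is spelled next(it), so the one-liner above does not carry over unchanged. A rough Python 3 sketch of the same cycle-based idea follows; the function name roundrobin_until_exhausted is made up for illustration.

```python
import itertools

strings = {'S1': ["string0", "string1", "string2"], 'S2': ["string0", "string1"]}
Ns = {'S1': ["N0", "N1"], 'S2': ["N0"]}

def roundrobin_until_exhausted(*iterables):
    """Yield one item from each iterable in turn; stop when any runs out."""
    iterators = [iter(it) for it in iterables]
    for it in itertools.cycle(iterators):
        try:
            yield next(it)
        except StopIteration:
            # Catch the exception explicitly and end the generator cleanly.
            return

merged = {k: ''.join(roundrobin_until_exhausted(strings[k], Ns[k]))
          for k in strings}
print(merged)
```

As in the original, anything left over in the longer list after the first iterator runs dry is dropped, so this only gives the full result when the lengths differ by at most one.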

wenzul

To alternate x, y iterables inserting default for missing values:

from itertools import izip_longest

def alternate(x, y, default):
    return (item for pair in izip_longest(x, y, fillvalue=default) for item in pair)

Example

a = {'S1' : ["string0", "string1", "string2"], 'S2' : ["string0", "string1"]}
b = {'S1' : ["N0", "N1"], 'S2' : ["N0"]}
assert a.keys() == b.keys()
merged = {k: ''.join(alternate(a[k], b[k], '')) for k in a}
print(merged)

Output

{'S2': 'string0N0string1', 'S1': 'string0N0string1N1string2'}
jfs

itertools.izip_longest takes care of the lists of uneven length; then just use str.join to combine everything into a single string.

strings = {'S1' : ["string0", "string1", "string2"], 'S2' : ["string0", "string1"]}

Ns = {'S1' : ["N0", "N1"], 'S2' : ["N0"]}

from itertools import izip_longest as iz

strings_and_Ns = {k:["".join([a+b for a, b in iz(strings[k],v,fillvalue="")])] for k,v in Ns.items()}

print(strings_and_Ns)
{'S2': ['string0N0string1'], 'S1': ['string0N0string1N1string2']}

Which is the same as:

strings_and_Ns = {}
for k, v in Ns.items():
    strings_and_Ns[k] = ["".join([a + b for a, b in iz(strings[k], v, fillvalue="")])]

Using izip_longest means the code will work no matter which dict's values contain more elements.
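In Python 3 the function is named itertools.zip_longest. A small sketch of the claim above, with a made-up helper name join_alternating, showing that it works whichever list is longer:

```python
from itertools import zip_longest

def join_alternating(first, second):
    """Concatenate items of first and second alternately (hypothetical helper).

    Missing items on the shorter side are filled with the empty string.
    """
    return ''.join(a + b for a, b in zip_longest(first, second, fillvalue=''))

# First list longer, as in the question's data.
print(join_alternating(["string0", "string1", "string2"], ["N0", "N1"]))
# Second list longer: the leftover items are simply appended in order.
print(join_alternating(["a"], ["x", "y", "z"]))
```

Note that when the second list is longer, its extra items end up adjacent to each other, which may or may not be what is wanted for sequence fragments.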

Padraic Cunningham
  • Ok, you are editing all the time. I was confused. I think this version is faster but less readable. – wenzul Oct 14 '14 at 23:49

Similar to the other solutions posted, but I would move some of the logic into a function:

import itertools   

def alternate(*iters, **kwargs):
    return itertools.chain(*itertools.izip_longest(*iters, **kwargs))

result = {k: ''.join(alternate(strings[k], Ns[k] + [''])) for k in Ns}
print result

Gives:

{'S2': 'string0N0string1', 'S1': 'string0N0string1N1string2'}

The alternate function is from https://stackoverflow.com/a/2017923/66349. It takes iterables as arguments and chains together items from each one successively (using izip_longest as Padraic Cunningham did).

You can either specify fillvalue='' to handle the different length lists, or just manually pad out the shorter list as I have done above (which assumes Ns will always be one shorter than strings).
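To make the two options concrete, here is a Python 3 sketch of the fillvalue route (zip_longest replaces izip_longest there). It reuses the alternate function and the question's sample data; the try/except at the end is only an illustration of what happens when neither option is used.

```python
import itertools

strings = {'S1': ["string0", "string1", "string2"], 'S2': ["string0", "string1"]}
Ns = {'S1': ["N0", "N1"], 'S2': ["N0"]}

def alternate(*iters, **kwargs):
    # Keyword arguments such as fillvalue are forwarded to zip_longest.
    return itertools.chain.from_iterable(itertools.zip_longest(*iters, **kwargs))

# Option 1: let zip_longest fill the gap with an empty string.
result = {k: ''.join(alternate(strings[k], Ns[k], fillvalue='')) for k in Ns}
print(result)

# With neither fillvalue nor manual padding, the gap is filled with None,
# which str.join rejects.
try:
    ''.join(alternate(strings['S1'], Ns['S1']))
except TypeError as exc:
    print("join failed:", exc)
```

The fillvalue form makes no assumption about which list is shorter, while manual padding documents the expectation that Ns is exactly one item short.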

If you have an older Python version that doesn't support dict comprehensions, you could use this instead:

result = dict((k, ''.join(alternate(strings[k], Ns[k] + ['']))) for k in Ns)
Peter Gibson