Left join two lists of dictionaries on a single key

Question

Given n lists, each containing m dictionaries, I would like to produce a new list containing the left-joined dictionaries. Each dictionary is guaranteed to have a key called index, but may have an arbitrary set of other keys as well. For example, take the following two lists:

l1 = [{"index":1, "b":2}, 
      {"index":2, "b":3},
      {"index":3, "b":"10"},
      {"index":4, "c":"7"}]

l2 = [{"index":1, "c":4},
      {"index":2, "c":5},
      {"index":6, "c":8},
      {"index":7, "c":9}]

I would like to produce a joined list:

l3 = [{"index":1, "b":2, "c":4},
      {"index":2, "b":3, "c":5},
      {"index":3, "b":"10"},
      {"index":4, "c":"7"}]

What is the most efficient way to do this in Python?

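Currently I have this piece of code, but it does not give a left join: it also keeps the entries that appear only in the right list (indices 6 and 7 here), so it behaves like a full outer join. How can I modify it to give me a left join?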

def left_join(left, right, key):
    merged = {}
    for item in left+right:
        if item[key] in merged:
            merged[item[key]].update(item)
        else:
            merged[item[key]] = item
    return [val for (_, val) in merged.items()]

Are these data structures rigid, or do you have the choice to modify them a bit? — Subhash, Aug 10 '19 at 15:59
Unfortunately my problem has rigid data structures as shown in the original post. :( — Barry Lucky, Aug 10 '19 at 16:01

score 3 · Answer 1 · answered Aug 10 '19 at 16:23

The following snippet converts both lists to dictionaries keyed on index for a faster merge, then converts the merged dictionary back into a list to match your expected output. Note that update modifies the dictionaries of l1 in place.

l1_dict = {item['index']: item for item in l1}
l2_dict = {item['index']: item for item in l2}

for item in l1_dict:
    l1_dict[item].update(l2_dict.get(item, {}))

l3 = list(l1_dict.values())
print(l3)
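
Because update works on the very dictionaries stored in l1, the snippet above changes l1 as a side effect. If that matters, the same lookup-table idea can be sketched so that it builds new dictionaries instead. The function name left_join_pair and its key parameter are made up for illustration; the sketch assumes the index values are hashable and that, if the right list repeats an index, the last occurrence wins.

```python
def left_join_pair(left, right, key="index"):
    """Left join two lists of dicts on `key` without modifying either input."""
    # Lookup table: key value -> dict from the right list (last duplicate wins)
    lookup = {item[key]: item for item in right}
    # For every left row, build a new dict merged with its match (if any)
    return [{**row, **lookup.get(row[key], {})} for row in left]


l1 = [{"index": 1, "b": 2},
      {"index": 2, "b": 3},
      {"index": 3, "b": "10"},
      {"index": 4, "c": "7"}]

l2 = [{"index": 1, "c": 4},
      {"index": 2, "c": 5},
      {"index": 6, "c": 8},
      {"index": 7, "c": 9}]

l3 = left_join_pair(l1, l2)
print(l3)
```

On key collisions the value from the right list overwrites the one from the left, which matches what update does in the snippet above.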

Subhash

This works as well. Thanks – Barry Lucky Aug 10 '19 at 16:38

Thierry Lathuille · Accepted Answer · 2019-08-10T16:39:48.613

For efficiency, you can start by building a dict with the indices as keys, and the corresponding dicts of l2 as values, so that you don't have to go through l2 each time you look for a matching dict in it.

You can then build a new list of dicts: for each dict in l1, we make a copy of it in order to leave the original unchanged, and update it with the matching dict from l2.

l1 = [{"index":1, "b":2}, {"index":2, "b":3}, {"index":3, "b":"10"}, {"index":4, "c":"7"}]

l2 = [{"index":1, "c":4}, {"index":2, "c":5}, {"index":6, "c":8}, {"index":7, "c":9}]

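# Map each index to its dict in l2, so each lookup is O(1) instead of a scan of l2
dict2 = {dct['index']: dct for dct in l2}

out = []
for d1 in l1:
    d = dict(d1)  # shallow copy, so the dicts in l1 stay unchanged
    # Merge in the matching dict from l2, or nothing when there is no match
    d.update(dict2.get(d1['index'], {}))
    out.append(d)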

print(out)
# [{'index': 1, 'b': 2, 'c': 4}, {'index': 2, 'b': 3, 'c': 5}, {'index': 3, 'b': '10'}, {'index': 4, 'c': '7'}]
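
The question mentions n lists, so here is a rough sketch of how the same approach might extend to several right-hand lists. The function name left_join_many, its key parameter, and the third list l4 are hypothetical and only serve as illustration; the sketch assumes every dict contains the key and that later lists overwrite earlier ones when they share other keys.

```python
def left_join_many(left, *rights, key="index"):
    """Left join `left` against any number of right-hand lists of dicts."""
    # Shallow copies, so the dicts in `left` are left unchanged
    out = [dict(row) for row in left]
    for right in rights:
        # One lookup table per right-hand list: key value -> dict
        lookup = {item[key]: item for item in right}
        for row in out:
            # Merge the matching dict, or nothing when there is no match
            row.update(lookup.get(row[key], {}))
    return out


l1 = [{"index": 1, "b": 2}, {"index": 2, "b": 3},
      {"index": 3, "b": "10"}, {"index": 4, "c": "7"}]
l2 = [{"index": 1, "c": 4}, {"index": 2, "c": 5},
      {"index": 6, "c": 8}, {"index": 7, "c": 9}]
l4 = [{"index": 3, "d": 0}, {"index": 9, "d": 1}]  # hypothetical third list

joined = left_join_many(l1, l2, l4)
print(joined)
```

Each right-hand list is scanned once to build its lookup table, so the total work grows roughly with the combined length of all lists rather than with their product.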
