Given n lists with m dictionaries as their elements, I would like to produce a new list, with a joined (left join) set of dictionaries. Each dictionary is guaranteed to have a key called index, but could have an arbitrary set of keys beyond that. For example, imagine the following two lists:

l1 = [{"index":1, "b":2}, 
      {"index":2, "b":3},
      {"index":3, "b":"10"},
      {"index":4, "c":"7"}]

l2 = [{"index":1, "c":4},
      {"index":2, "c":5},
      {"index":6, "c":8},
      {"index":7, "c":9}]

I would like to produce a joined list:

l3 = [{"index":1, "b":2, "c":4}, 
      {"index":2, "b":3, "c":5}, 
      {"index":3, "b":"10"},
      {"index":4, "c":"7"}]

What is the most efficient way to do this in Python?

Currently I have the following code, but it produces a full outer join: entries that appear only in the second list are kept as well. How can I modify it to give me a left join?

def left_join(left, right, key):
    merged = {}
    for item in left+right:
        if item[key] in merged:
            merged[item[key]].update(item)
        else:
            merged[item[key]] = item
    return [val for (_, val) in merged.items()]
rsm
Barry Lucky

2 Answers

The following snippet converts both lists to dictionaries keyed by index for a faster merge, then converts the merged dictionary back into a list to match your expected output.

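# Index both lists by their "index" value for constant-time lookups.
l1_dict = {item['index']: item for item in l1}
l2_dict = {item['index']: item for item in l2}

# Merge the matching l2 entry (if any) into each l1 entry.
# Note: update() mutates the dictionaries that are also held in l1.
for index, item in l1_dict.items():
    item.update(l2_dict.get(index, {}))

l3 = list(l1_dict.values())
print(l3)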
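
The same idea can be wrapped in a function with the signature used in the question. This is only a sketch, not a definitive implementation: the name `left_join` and the `key` parameter follow the question's code, and it builds new dictionaries with `{**a, **b}` instead of calling `update`, so the input lists are left untouched. It assumes each key value occurs at most once in `right`.

```python
def left_join(left, right, key):
    """Left-join two lists of dicts on `key` (sketch; inputs are not modified)."""
    # Index the right-hand rows by key for constant-time lookups.
    # Assumption: key values are unique in `right`; with duplicates the last row wins.
    right_by_key = {row[key]: row for row in right}
    # Keep every left row, merged with its match from `right` if there is one.
    return [{**row, **right_by_key.get(row[key], {})} for row in left]


l1 = [{"index": 1, "b": 2}, {"index": 2, "b": 3},
      {"index": 3, "b": "10"}, {"index": 4, "c": "7"}]
l2 = [{"index": 1, "c": 4}, {"index": 2, "c": 5},
      {"index": 6, "c": 8}, {"index": 7, "c": 9}]

l3 = left_join(l1, l2, "index")
print(l3)
# [{'index': 1, 'b': 2, 'c': 4}, {'index': 2, 'b': 3, 'c': 5}, {'index': 3, 'b': '10'}, {'index': 4, 'c': '7'}]
```

Rows of `l2` whose index does not appear in `l1` (6 and 7 here) are dropped, which is what distinguishes a left join from the merge in the question.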
Subhash

For efficiency, you can start by building a dict with the indices as keys, and the corresponding dicts of l2 as values, so that you don't have to go through l2 each time you look for a matching dict in it.

You can then build a new list of dicts: for each dict in l1, we make a copy of it in order to leave the original unchanged, and update it with the matching dict from l2.

l1 = [{"index":1, "b":2}, {"index":2, "b":3}, {"index":3, "b":"10"}, {"index":4, "c":"7"}]

l2 = [{"index":1, "c":4}, {"index":2, "c":5}, {"index":6, "c":8}, {"index":7, "c":9}]

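# Index l2 by "index" so each lookup is O(1) instead of a scan over l2.
dict2 = {dct['index']: dct for dct in l2}

out = []
for d1 in l1:
    # Shallow copy, so the dictionaries in l1 stay unchanged.
    # dict(d1) also works when keys are not strings, unlike dict(**d1).
    d = dict(d1)
    d.update(dict2.get(d1['index'], {}))
    out.append(d)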

print(out)
# [{'index': 1, 'b': 2, 'c': 4}, {'index': 2, 'b': 3, 'c': 5}, {'index': 3, 'b': '10'}, {'index': 4, 'c': '7'}]
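
Since the question mentions n lists, the approach above could be extended along these lines. This is a rough sketch under a few assumptions: `left_join_all` is a hypothetical helper name, the first list is treated as the left side of every join, and when several lists share a non-index key, the value from the later list wins.

```python
def left_join_all(first, *others, key="index"):
    """Left-join `first` with each list in `others`, in order (sketch)."""
    # Copy the left rows so the caller's dictionaries stay unchanged.
    out = [dict(row) for row in first]
    for other in others:
        # One lookup table per right-hand list.
        lookup = {row[key]: row for row in other}
        for row in out:
            # Rows without a match are left as they are.
            row.update(lookup.get(row[key], {}))
    return out


l1 = [{"index": 1, "b": 2}, {"index": 3, "b": "10"}]
l2 = [{"index": 1, "c": 4}, {"index": 6, "c": 8}]
l4 = [{"index": 3, "d": 0}]  # hypothetical third list, not from the question

joined = left_join_all(l1, l2, l4)
print(joined)
# [{'index': 1, 'b': 2, 'c': 4}, {'index': 3, 'b': '10', 'd': 0}]
```

Each list is traversed once, so the total work grows linearly with the number of dictionaries across all lists.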
Thierry Lathuille