
Suppose we have 2 dictionaries:

a = {
    "key1": "value1",
    "key2": "value2",
    "key3": {
        "key3_1": "value3_1",
        "key3_2": "value3_2"
    }
}

b = {
    "key1": "not_key1",
    "key4": "something new",
    "key3": {
        "key3_1": "Definitely not value3_1",
        "key": "new key without index?"
    }
}

After merging them, I need to get the following dictionary:

{
    "key1": "not_key1",
    "key2": "value2",
    "key3": {
        "key3_1": "Definitely not value3_1",
        "key3_2": "value3_2",
        "key": "new key without index?"
    },
    "key4": "something new"
}

I have this kind of code:

def merge_2_dicts(dict1, dict2):
    for i in dict2:
        if not type(dict2[i]) == dict:
            dict1[i] = dict2[i]
        else:
            print(dict1[i], dict2[i], sep="\n")
            dict1[i] = merge_2_dicts(dict1[i], dict2[i])
    return dict1

It works and gives me the desired result, but I'm not sure if it can be done more simply. Is there an easier/shorter option?

  • Your solution seems good, and most importantly it works. My only concern is what happens if there's a key in dict1 that's not present in dict2. At the moment, you're missing it. – aaossa Mar 08 '22 at 14:01
  • @aaossa, well, I didn't seem to have that problem. In the example, `["key3"]["key3_2"]` is in `a` but not in `b`, and the output is exactly the dictionary I need. –  Mar 08 '22 at 14:05
  • Maybe you can use an external package like Numpy? – Stefan Mar 08 '22 at 14:05
  • @Stefan, to be honest, I wouldn't want to import an entire library for the sake of 1 small function. –  Mar 08 '22 at 14:06
  • One possible improvement is `isinstance(X, dict)` instead of `type(X) == dict`. See [this question](https://stackoverflow.com/a/2225066/5930169). – knia Mar 08 '22 at 14:08
  • @knia, thx, I completely forgot about it. –  Mar 08 '22 at 14:11
  • Your requirement is to *merge two dicts*, but the order of the dicts seems to affect your result. In general, *merging two dicts* shouldn't depend on their order. It would also be best to extend your code to *merge multiple dicts*. I'm not going to write any code unless you clarify how the order affects the expected result. – Lei Yang Mar 08 '22 at 14:11
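To illustrate the `isinstance` suggestion from the comments, here is a minimal sketch. It assumes a value that is a `dict` subclass (`OrderedDict` is used purely as an example; nothing in the question says such values occur):

```python
from collections import OrderedDict

value = OrderedDict(key3_1="value3_1")

# An exact type comparison rejects dict subclasses...
exact_match = type(value) == dict
# ...while isinstance accepts them, so they would still be merged recursively.
subclass_match = isinstance(value, dict)

print(exact_match, subclass_match)  # False True
```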

4 Answers

1

I think your code is almost good. The only issue I see: what if a key is missing from the target dictionary?

def merge_dicts(tgt, enhancer):
    for key, val in enhancer.items():
        if key not in tgt:
            tgt[key] = val
            continue

        if isinstance(val, dict):
            merge_dicts(tgt[key], val)
        else:
            tgt[key] = val
    return tgt

This code does mostly the same as what you have written:

  1. Check whether the key is present in the target dict; if not, add it regardless of type.
  2. If val is a dict, recurse.
  3. If val is not a dict, overwrite with the value from the enhancing dict.

But I still see one issue: what if the value in the target dict is a string and the value in the enhancer is a dict?

enhancer = {
    "key3": {
        "key3_1": "value3_1",
        "key3_2": "value3_2"
    }
}

tgt = {
    "key3": "string_val"
}
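As a quick check of why this case matters, the sketch below runs the first `merge_dicts` version on exactly these two dicts. The `failed` flag is only an illustrative helper, not part of the answer's code:

```python
def merge_dicts(tgt, enhancer):
    # First version from this answer, reproduced unchanged.
    for key, val in enhancer.items():
        if key not in tgt:
            tgt[key] = val
            continue
        if isinstance(val, dict):
            merge_dicts(tgt[key], val)
        else:
            tgt[key] = val
    return tgt


enhancer = {"key3": {"key3_1": "value3_1", "key3_2": "value3_2"}}
tgt = {"key3": "string_val"}

try:
    merge_dicts(tgt, enhancer)
    failed = False
except TypeError as exc:
    # The recursion receives the string "string_val" as its target and
    # tries to assign an item on it, which strings do not support.
    failed = True
    print(f"TypeError: {exc}")
```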

Then it depends on what you prefer:

  1. Overwrite the string with the dict from the enhancer:
def merge_dicts(tgt, enhancer):
    for key, val in enhancer.items():
        if key not in tgt:
            tgt[key] = val
            continue

        if isinstance(val, dict):
            if not isinstance(tgt[key], dict):
                tgt[key] = dict()
            merge_dicts(tgt[key], val)
        else:
            tgt[key] = val
    return tgt
  2. Keep the string value from the target dict:
def merge_dicts(tgt, enhancer):
    for key, val in enhancer.items():
        if key not in tgt:
            tgt[key] = val
            continue

        if isinstance(val, dict):
            if not isinstance(tgt[key], dict):
                continue
            merge_dicts(tgt[key], val)
        else:
            tgt[key] = val
    return tgt
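If both behaviours are needed, the two variants could be folded into one function. This is only a sketch: the `overwrite` parameter is a hypothetical name introduced here, not something from the question.

```python
def merge_dicts(tgt, enhancer, overwrite=True):
    """Merge enhancer into tgt in place; `overwrite` is a hypothetical flag."""
    for key, val in enhancer.items():
        if key not in tgt:
            tgt[key] = val
        elif isinstance(val, dict):
            if not isinstance(tgt[key], dict):
                if not overwrite:
                    continue  # keep the existing non-dict value
                tgt[key] = {}  # replace it with a fresh dict to merge into
            merge_dicts(tgt[key], val, overwrite)
        else:
            tgt[key] = val
    return tgt


enhancer = {"key3": {"key3_1": "value3_1", "key3_2": "value3_2"}}

replaced = merge_dicts({"key3": "string_val"}, enhancer, overwrite=True)
kept = merge_dicts({"key3": "string_val"}, enhancer, overwrite=False)
print(replaced)
print(kept)
```

With `overwrite=True` the string is replaced by the merged dict; with `overwrite=False` the target keeps `"string_val"`.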
Peter Trcka
0

Another solution:

from copy import deepcopy
from typing import Any


def is_all_dict(a1: Any, a2: Any) -> bool:
    return isinstance(a1, dict) and isinstance(a2, dict)


def recursively_merge(d1: dict, d2: dict) -> dict:
    d = deepcopy(d1)
    for k, v2 in d2.items():
        if (v := d.get(k)) and is_all_dict(v, v2):
            sub_dicts = []
            for sk, sv2 in v2.items():
                if (sv := v.get(sk)) and is_all_dict(sv, sv2):
                    sub_dicts.append((sv, sv2))
                else:
                    v[sk] = sv2
            while sub_dicts:
                sds = []
                for v, v2 in sub_dicts:
                    for sk, sv2 in v2.items():
                        if (sv := v.get(sk)) and is_all_dict(sv, sv2):
                            sds.append((sv, sv2))
                        else:
                            v[sk] = sv2
                sub_dicts = sds
        else:
            d[k] = v2
    return d

Output:

In [26]: import pprint

In [27]: pprint.pprint(recursively_merge(a, b))
{'key1': 'not_key1',
 'key2': 'value2',
 'key3': {'key': 'new key without index?',
          'key3_1': 'Definitely not value3_1',
          'key3_2': 'value3_2'},
 'key4': 'something new'}
Waket Zheng
0

If you want something really compact using dictionary comprehensions, you could use the approach below.

NB: By using `.get(k)` in the check, we avoid having to test whether `k` is in the dictionary.

def merge_dicts(d1, d2):
    check = lambda k, v: isinstance(d1.get(k), dict) and isinstance(v, dict)
    return {**d1, **{k: merge_dicts(d1[k], d2[k]) if check(k, v) else v for k, v in d2.items()}}

Output:

>>> from pprint import pprint
>>> pprint(merge_dicts(a,b))
{'key1': 'not_key1',
 'key2': 'value2',
 'key3': {'key': 'new key without index?',
          'key3_1': 'Definitely not value3_1',
          'key3_2': 'value3_2'},
 'key4': 'something new'}
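One property of this version that may matter: it builds a new dictionary instead of updating `d1` in place. A small sketch to check this, using shortened stand-ins for the question's `a` and `b` (the `a_before`, `b_before` and `inputs_unchanged` names are illustrative only):

```python
from copy import deepcopy


def merge_dicts(d1, d2):
    # Comprehension-based version from this answer, reproduced unchanged.
    check = lambda k, v: isinstance(d1.get(k), dict) and isinstance(v, dict)
    return {**d1, **{k: merge_dicts(d1[k], d2[k]) if check(k, v) else v for k, v in d2.items()}}


a = {"key1": "value1", "key3": {"key3_1": "value3_1", "key3_2": "value3_2"}}
b = {"key1": "not_key1", "key3": {"key3_1": "Definitely not value3_1"}}
a_before, b_before = deepcopy(a), deepcopy(b)

merged = merge_dicts(a, b)

# Neither input was modified by the merge.
inputs_unchanged = (a == a_before) and (b == b_before)
print(merged)
print(inputs_unchanged)
```

Nested values that are not themselves merged are shared with the inputs by reference rather than copied, so mutating them later through the result would still affect `a` or `b`.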
oskros
  • Yes, this is definitely fewer lines of code, but it is not efficient. You are iterating through a list of items created from both dicts, which means you reassign values that are already there. If the target dict has 1000 items and the enhancing dict just one, this will do 1001 iterations instead of 1. – Peter Trcka Mar 08 '22 at 16:26
  • Can you try to improve it? It's definitely a good idea. – Peter Trcka Mar 08 '22 at 16:32
  • @PeterTrcka that's a great point - I have modified my solution to unpack `d1` and only loop over values in `d2` – oskros Mar 08 '22 at 18:39
  • According to the author, there is no need to update d1 if a key is present in both dicts; we just want to enhance d1 with new keys from d2. Some extra time can be saved if you pass `v` to the lambda instead of `d2`. When I implemented these adjustments on top of your existing enhancement, performance measured with timeit improved: 3 s (original) -> 1.5 s (your first refactoring) -> 0.6 s (after my proposal). The first row becomes `check = lambda k, v: isinstance(d1.get(k), dict) and isinstance(v, dict)` and the second row: `return {**d1, **{k: merge_dicts(d1[k], d2[k]) if check(k, v) else v for k, v in d2.items() if k not in d1}}`. – Peter Trcka Mar 08 '22 at 19:44
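A quick check of the filtered variant from the last comment, as a sketch only. The function name `merge_new_keys_only` is hypothetical, and `a` and `b` are copied from the question:

```python
def merge_new_keys_only(d1, d2):
    # The trailing `if k not in d1` filter skips every key that d1 already
    # has, so `check` can never be true and no nested merge takes place.
    check = lambda k, v: isinstance(d1.get(k), dict) and isinstance(v, dict)
    return {**d1, **{k: merge_new_keys_only(d1[k], d2[k]) if check(k, v) else v
                     for k, v in d2.items() if k not in d1}}


a = {
    "key1": "value1",
    "key2": "value2",
    "key3": {"key3_1": "value3_1", "key3_2": "value3_2"},
}
b = {
    "key1": "not_key1",
    "key4": "something new",
    "key3": {"key3_1": "Definitely not value3_1", "key": "new key without index?"},
}

result = merge_new_keys_only(a, b)
print(result)
```

Here only `key4` is added; `key1` and `key3` keep their values from `a`. That may be fine if the goal is just to add new top-level keys, but it differs from the expected output shown in the question.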
-2

This is a simpler alternative to what you have. Dict unpacking of this kind needs Python 3.5+.

c = {**a, **b}
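Note that this unpacking is a shallow merge. A short sketch with the question's `a` and `b` shows how the nested `key3` dict behaves:

```python
a = {
    "key1": "value1",
    "key2": "value2",
    "key3": {"key3_1": "value3_1", "key3_2": "value3_2"},
}
b = {
    "key1": "not_key1",
    "key4": "something new",
    "key3": {"key3_1": "Definitely not value3_1", "key": "new key without index?"},
}

# Top-level keys from b win; nested dicts are replaced, not merged.
c = {**a, **b}
print(c["key3"])
```

The whole `key3` value comes from `b`, so `key3_2` from `a` is lost and the result does not match the expected output in the question.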
John Stud