-1

What is the best way to create a dict from two other dicts (a very big one and a small one)?

We have:

    big_dict = {
    'key1':325,
    'key2':326,
    'key3':327,
    ...
    }

    small_dict = {
    325:0.698,
    326:0.684,
    327:0.668
    }

We need a dict that holds the values from small_dict but uses the keys from big_dict:

    comb_dict = {
    'key1':0.698,
    'key2':0.684,
    'key3':0.668
    }
Dmitri K.
  • 1
    What should happen if the value from `big_dict` isn't a key in `small_dict`? – Patrick Haugh Oct 21 '17 at 19:28
  • It is a good point. But in this practical task both dicts came from one function (TfidfVectorizer()). So in this case there are values in big_dict for each key of small_dict. – Dmitri K. Oct 22 '17 at 08:23

4 Answers

2

The following code handles all cases, including a value in big_dict that is missing from small_dict (see the driver values below), using an EAFP-oriented approach.

>>> d = {}
>>> for key, val in big_dict.items():
...     try:
...         d[key] = small_dict[val]
...     except KeyError:
...         continue  # val is not a key in small_dict, so skip this entry
...
>>> d

=> {'key1': 0.698, 'key2': 0.684, 'key3': 0.668}

#driver values :

IN : big_dict = {
        'key1':325,
        'key2':326,
        'key3':327,
        'key4':330        #note that small_dict[330] will give KeyError
     }

IN : small_dict = {
          325:0.698,
          326:0.684,
          327:0.668
      }

Or, using a dictionary comprehension:

>>> {key:small_dict[val] for key,val in big_dict.items() if val in small_dict}

=> {'key1': 0.698, 'key2': 0.684, 'key3': 0.668}
Kaushik NP
  • `d[key] = small_dict[val]` should probably be `d[key] = deepcopy(small_dict[val])`. If the value in small_dict is not an immutable object (string, int, etc.) and `d[key]` is later changed with an append() (if it is a list) or any other object-specific method, this will have the side effect of changing `small_dict`. Also, if deepcopy() is called on an immutable object there is almost no performance loss: Python will still keep the same memory reference, just like a normal `=` would. – PeterH Oct 21 '17 at 20:04
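
The aliasing concern raised in this comment can be illustrated with a minimal sketch. The list values below are hypothetical (the question's data holds plain floats, where the issue does not arise), and the names `shared` and `copied` are chosen only for illustration:

```python
from copy import deepcopy

big_dict = {'key1': 1, 'key2': 2}
# Hypothetical mutable values; the original question uses floats.
small_dict = {1: [0.698], 2: [0.684]}

# Plain assignment: the new dict refers to the very same list objects.
shared = {k: small_dict[v] for k, v in big_dict.items() if v in small_dict}

# deepcopy: the new dict owns independent copies of the lists.
copied = {k: deepcopy(small_dict[v]) for k, v in big_dict.items() if v in small_dict}

shared['key1'].append(1.0)  # also mutates small_dict[1]
copied['key2'].append(1.0)  # leaves small_dict[2] untouched

print(small_dict)
```

For atomic immutable values such as floats, `deepcopy` hands back the same object, so the copy is only worth considering when the values may be lists, dicts, or other mutable objects. Each call still carries some function-call overhead, which may matter for a very large big_dict.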
1

You could use a dictionary comprehension:

comb_dict = {k: small_dict[v] for k, v in big_dict.iteritems()}

If big_dict may contain values that are not keys in small_dict you could just ignore them:

comb_dict = {k: small_dict[v] for k, v in big_dict.iteritems() if v in small_dict}

or use the original value:

{k: (small_dict[v] if v in small_dict else v) for k, v in big_dict.iteritems()}

(In Python 3, use items() instead of iteritems().)
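
As a possible alternative to the ternary expression, the "use the original value" variant can be sketched with `dict.get`, passing the value itself as the fallback. This is a Python 3 sketch, and the extra entry `'key4': 330` is added only to show the fallback:

```python
big_dict = {'key1': 325, 'key2': 326, 'key3': 327, 'key4': 330}
small_dict = {325: 0.698, 326: 0.684, 327: 0.668}

# small_dict.get(v, v) returns small_dict[v] when v is a key,
# and falls back to v itself otherwise (here: 330 for 'key4').
comb_dict = {k: small_dict.get(v, v) for k, v in big_dict.items()}

print(comb_dict)
```

This does a single lookup per entry and avoids the nested conditional, which some readers may find easier to follow.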

Neo
  • Comprehensions are cool, but they're a little hard to read sometimes. Especially the last example here with the ternary expression--you probably wouldn't want to do that in production code. –  Oct 21 '17 at 20:01
  • @Wyatt I agree about the ternary expression, I added it for completeness. I added parentheses to make it more readable. – Neo Oct 21 '17 at 20:11
1

If there are values in big_dict that may not be present as keys in small_dict, this will work:

combined_dict = {}
for big_key, small_key in big_dict.items():
    combined_dict[big_key] = small_dict.get(small_key)

Or you might want to use a different default value instead with:

    # dict.get() accepts the default only as a positional argument
    combined_dict[big_key] = small_dict.get(small_key, 'XXX')

Or you might want to raise a KeyError to indicate a problem with your data:

    combined_dict[big_key] = small_dict[small_key]

Or you might want to skip missing keys:

    if small_key in small_dict:
        combined_dict[big_key] = small_dict[small_key]
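
The variants above can be pulled together into one runnable sketch. The helper name `combine` and its `on_missing` modes are hypothetical, introduced here only to show the strategies side by side:

```python
_MISSING = object()  # sentinel, so that None can still be a real value


def combine(big_dict, small_dict, on_missing="default", default=None):
    """Map big_dict's keys to small_dict's values (illustrative helper).

    on_missing: "default" stores `default`, "skip" drops the entry,
    "raise" raises KeyError for the missing key.
    """
    if on_missing not in ("default", "skip", "raise"):
        raise ValueError("unknown on_missing mode: %r" % (on_missing,))
    combined = {}
    for big_key, small_key in big_dict.items():
        value = small_dict.get(small_key, _MISSING)
        if value is _MISSING:
            if on_missing == "skip":
                continue
            if on_missing == "raise":
                raise KeyError(small_key)
            value = default
        combined[big_key] = value
    return combined


big_dict = {'key1': 325, 'key2': 326, 'key3': 327, 'key4': 330}
small_dict = {325: 0.698, 326: 0.684, 327: 0.668}

with_none = combine(big_dict, small_dict)                    # 'key4' -> None
with_xxx = combine(big_dict, small_dict, default='XXX')      # 'key4' -> 'XXX'
skipped = combine(big_dict, small_dict, on_missing="skip")   # 'key4' dropped
```

Which mode is appropriate depends on the data: if every value in big_dict is guaranteed to be a key in small_dict, raising is probably the safest choice because it surfaces bad data early.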
  • 1
    The one problem with that is if `small_dict[small_key]` has a value that is a complex object (dict, list, etc.): when you do `combined_dict[big_key] = small_dict[small_key]` you will be giving combined_dict a reference to small_dict's data. So if you ever change the combined_dict data, it will change in small_dict as well. A simple `deepcopy()` on the value being set would fix it. – PeterH Oct 21 '17 at 19:46
  • @PeterH That's a good point, but I couldn't address all possible scenarios ;) –  Oct 21 '17 at 19:53
0
keys = small_dict.keys()
combined_dict = {k:small_dict[v] for k,v in big_dict.items() if v in keys}
>>> combined_dict
{'key3': 0.668, 'key2': 0.684, 'key1': 0.698}
jabargas