5

I am trying to create a dictionary that summarizes another. I would like each entry in summary_dict to be updated only when its key matches the "parent" value of an entry in detail_dict. What I have written does not seem to access summary_dict as I would expect. The print statements show that it keeps appending to the same lists on every iteration instead of grabbing the right value from summary_dict.

detail_dict = {'a123': {"data": [1, 2, 3, 4], "parent": "a"},
               'a124': {"data": [1, 2, 3, 4, 5], "parent": "a"},
               'b123': {"data": [1, 2, 3], "parent": "b"},
               'b124': {"data": [1], "parent": "b"}}

summary_dict = dict.fromkeys(["a", "b"], {"data": [],
                                          "data_len": []})

for k, v in detail_dict.iteritems():
    summary_dict[v['parent']]["data"].append(v["data"])
    summary_dict[v['parent']]["data_len"].append(len(v["data"]))
    print "\nMy value is "
    print v
    print "\nMy summary dictionary now looks like:"
    print summary_dict[v['parent']]

The resultant dictionary I would like is:

{"a": {"data": [[1, 2, 3, 4], [1, 2, 3, 4, 5]],
       "data_len": [4, 5]},
 "b": {"data": [[1, 2, 3], [1]],
       "data_len": [3, 1]}}
LoveMeow

2 Answers

7

You're passing a mutable object to dict.fromkeys, so every key ends up holding a reference to the same inner dictionary.

Create your dict like this instead, so that each key gets its own separate inner dictionary:

summary_dict = {x : {"data": [],"data_len": []} for x in ["a","b"]}

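Note that because the iteration order of a dict is not guaranteed (in Python 2, and in Python 3 before 3.7), the items may be appended in a different order, so you can get something like: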

{'a': {'data_len': [5, 4], 'data': [[1, 2, 3, 4, 5], [1, 2, 3, 4]]}, 'b': {'data_len': [3, 1], 'data': [[1, 2, 3], [1]]}}

You could use

for k, v in sorted(detail_dict.items()):

to iterate over the items sorted by key, so that the order within each list is deterministic.
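
Putting the two suggestions together, here is a minimal sketch of the whole loop. It reuses the data from the question; the names `name`, `detail` and `parent` are only illustrative, and it is written with `items()` so that it also runs on Python 3.

```python
detail_dict = {
    "a123": {"data": [1, 2, 3, 4], "parent": "a"},
    "a124": {"data": [1, 2, 3, 4, 5], "parent": "a"},
    "b123": {"data": [1, 2, 3], "parent": "b"},
    "b124": {"data": [1], "parent": "b"},
}

# The comprehension evaluates the inner dict literal once per key,
# so "a" and "b" each get their own dict and their own lists.
summary_dict = {key: {"data": [], "data_len": []} for key in ["a", "b"]}

# Sorting by key makes the order of the appended items deterministic.
for name, detail in sorted(detail_dict.items()):
    parent = summary_dict[detail["parent"]]
    parent["data"].append(detail["data"])
    parent["data_len"].append(len(detail["data"]))
```

With the keys sorted, 'a123' is processed before 'a124' and 'b123' before 'b124', which gives the layout asked for in the question.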

Jean-François Fabre
  • Thanks so much! I never would have worked out that trap on my own. I thought for sure it was the way I was calling it in the loop – LoveMeow Aug 17 '17 at 16:20
5

The dict.fromkeys method uses the same instance as the value for every key.

summary_dict = dict.fromkeys(["a", "b"], {"data": [], "data_len": []})

Modifying the dictionary for one key will affect all keys. It's best to use dict.fromkeys only with an immutable object for the default value, to avoid this trap.
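
A small sketch that makes the sharing visible. The variable names (`shared`, `same_object`, `counts`) are made up for illustration only.

```python
# Every key receives a reference to the one dict passed as the default.
shared = dict.fromkeys(["a", "b"], {"data": [], "data_len": []})
same_object = shared["a"] is shared["b"]

# Appending through one key is therefore visible through the other.
shared["a"]["data"].append(1)
seen_from_b = shared["b"]["data"]

# With an immutable default there is nothing to mutate in place:
# += rebinds the value for "a" only and leaves "b" untouched.
counts = dict.fromkeys(["a", "b"], 0)
counts["a"] += 1
```

Here `same_object` is True and `seen_from_b` is `[1]`, while `counts` ends up as `{"a": 1, "b": 0}`.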

Try this dict comprehension instead:

summary_dict = {k: {"data": [], "data_len": []} for k in 'ab'}
Eliran Malka
wim