Handle duplicate keys in Python dictionary

Question

Sorry in advance if this question has been answered before but I can't seem to find it.

I have a pandas DataFrame like so:

id | value1 | value2 | ... | valueN
1  | 321    | 44     | ... | 7766
2  | 5678   | 7638   | ... | 987423
2  | 0971   | 7638   | ... | 1
and so on...

It loads correctly. What I want to achieve is an OrderedDict that collapses rows sharing the same id by joining their values. For the example above, the output dictionary should be:

{1: ['321', '44', ..., '7766'], 2:['5678,0971', '7638', ..., '987423,1']}

Notice that the values of the dictionary are lists and the elements of each list are strings.

My code so far is:

import collections

od = collections.OrderedDict()
for k in df.id:
    if k in od:
        # This key already exists in the dictionary, so we have to append values.
        # What should I do here?
        pass
    else:
        # New key: insert its values.
        od[k] = unordered_dict.get(k)

Any ideas?

Load the collection with the key, append the values. Just like you said. Keep writing, you are on the right track. — DejaVuSansMono, Dec 06 '16 at 14:26
If the key already exists in the dictionary, you should append the list to the existing one using `.extend()`: `od[k].extend(unordered_dict.get(k))` — Ozgur Vatansever, Dec 06 '16 at 14:28
@dejavusansmono i got stuck in this part for over an hour, that's why I posted it here :P — Mixalis, Dec 06 '16 at 14:40
@DejaVuSansMono I'm not an expert in sarcasm like Sheldon Cooper, but I think that was one... — Mixalis, Dec 06 '16 at 14:44
@Mixalis Sorry, ozgur's comment should point you in the right direction. — DejaVuSansMono, Dec 06 '16 at 14:51

score 0 · Accepted Answer · answered Dec 06 '16 at 15:00

I think this is what you need; at least it worked on my dummy data:

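all_data = {}
# Skip the first column ('id') and collapse every other column per id.
for column in df.columns[1:]:
    # Join the values that share an id into one comma-separated string.
    data = df.groupby('id')[column].apply(lambda x: ','.join(x.astype(str))).to_dict()
    for key in data:
        if key in all_data:
            all_data[key].append(data[key])
        else:
            all_data[key] = [data[key]]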
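
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea on dummy data shaped like the question's table. The helper name `collapse_by_id` and its `dedupe` flag are my own, not part of pandas, and I am assuming the values are stored as strings (otherwise '0971' would lose its leading zero). It uses `groupby(..., sort=False)` so ids keep their order of first appearance, which seems to be what the OrderedDict is for.

```python
from collections import OrderedDict

import pandas as pd


def collapse_by_id(df, key="id", dedupe=False):
    """Hypothetical helper: join the values of rows sharing `key` per column."""

    def join(values):
        strings = values.astype(str)
        if dedupe:
            # Keep only the first occurrence of a repeated value within an id.
            strings = strings.drop_duplicates()
        return ",".join(strings)

    # sort=False keeps the ids in order of first appearance.
    joined = df.groupby(key, sort=False).agg(join)
    # One list of joined strings per id, in column order.
    return OrderedDict((k, list(row)) for k, row in joined.iterrows())


# Dummy data mirroring the question; values are strings by assumption.
df = pd.DataFrame({
    "id": [1, 2, 2],
    "value1": ["321", "5678", "0971"],
    "value2": ["44", "7638", "7638"],
    "valueN": ["7766", "987423", "1"],
})

od = collapse_by_id(df)
od_unique = collapse_by_id(df, dedupe=True)
```

Note that without `dedupe`, a value repeated under the same id is repeated in the output ('7638,7638'), which is also what the loop above produces. The question's expected output shows a single '7638' there, so `dedupe=True` may be closer to what was asked.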
