Let's say I have a dictionary with the following contents:
old_dict = {'a':[0,1,2], 'b':[1,2,3]}
and I want to obtain a new dictionary where the keys are the values in the old dictionary, and the new values are the keys from the old dictionary, i.e.:
new_dict = {0:['a'], 1:['a','b'], 2:['a','b'], 3:['b']}
To perform this task, I'm currently using the following example code:
import numpy as np

# get all the keys for the new dictionary
new_keys = np.unique(np.hstack([old_dict[key] for key in old_dict]))
# initialize new dictionary
new_dict = {key: [] for key in new_keys}
# step through every new key
for new_key in new_keys:
    # step through every old key and check if the new key is in its list of values
    for old_key in old_dict:
        if new_key in old_dict[old_key]:
            new_dict[new_key].append(old_key)
In this example I'm showing 2 old keys and 4 new keys, but in my actual problem I have ~10,000 old keys and ~100,000 new keys, so the nested loop above performs on the order of 10^9 membership checks against lists. Is there a smarter way to perform this task, maybe with some tree-based algorithm? I used dictionaries because they make the problem easier for me to visualize, but dictionaries are not necessary if there are better data types for this exercise.
In the meantime, I'm looking into the documentation on reverse lookup of dictionaries, and trying to approach this using sindex from geopandas.
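One direction I'm considering is to walk the old dictionary once and append each old key under every value in its list, instead of testing every new key against every old key. Below is a minimal sketch of that idea, assuming the values are hashable; the function name `invert_dict_of_lists` is just a placeholder I made up for illustration. Unlike the NumPy version, the new keys stay as the original Python values and come out in first-seen order rather than sorted.

```python
from collections import defaultdict


def invert_dict_of_lists(old_dict):
    """Map each value found in the lists of old_dict to the old keys whose list contains it."""
    inverted = defaultdict(list)
    for old_key, values in old_dict.items():
        # dict.fromkeys(values) drops duplicates within one list while keeping order,
        # so an old key is appended at most once per value (same as the `in` check above)
        for value in dict.fromkeys(values):
            inverted[value].append(old_key)
    # convert back to a plain dict so missing keys raise KeyError as usual
    return dict(inverted)


old_dict = {'a': [0, 1, 2], 'b': [1, 2, 3]}
new_dict = invert_dict_of_lists(old_dict)
print(new_dict)  # {0: ['a'], 1: ['a', 'b'], 2: ['a', 'b'], 3: ['b']}
```

If this is right, the work should scale with the total number of entries across all the lists rather than with (number of new keys) x (number of old keys), but I haven't timed it on the full data set.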