0

Consider the following dictionary.

dict = {
    'key_1': ['name1', 'name2', 'name3'],
    'key_2': ['name4', 'name5', 'name6']
}

Given a string called "name6", how to easy figure that this is belongs to "key_2" key with lesser looping. I have the following code, how can I optimize for time constraint of this. Above code is just an example, there are several such keys present in the dictionary.

dict = {
    'key_1': ['name1', 'name2', 'name3'],
    'key_2': ['name4', 'name5', 'name6']
}

output_key = None

for key in dict:
    if 'name6' in dict[key]:
        output_key = key
        break
mkrieger1
  • 19,194
  • 5
  • 54
  • 65
Krish V
  • 486
  • 1
  • 5
  • 19
  • Are the values unique per key, or can the same value appear for multiple keys? Do you need to optimise a single lookup, or do you need to invert many values? – MisterMiyagi Jan 24 '22 at 16:49
  • The length of the values list is not constant, it would vary key to key, same value within a list won't be present as value in the other key. I'm looking for optimized way that I can go with. Update: Adding break after the conditional match. – Krish V Jan 24 '22 at 16:53
  • Does this answer your question? [Inverse dictionary lookup in Python](https://stackoverflow.com/questions/2568673/inverse-dictionary-lookup-in-python) – mkrieger1 Jan 25 '22 at 10:36

2 Answers2

1

In the current form of this dict, you cannot find the key that stores "name6" without iterating over the keys.

You need a different data structure, you can create another dict where the keys are each unique string from the original dict and values are list of keys that hold that string in the original dict, that way you can check for that string in the new dict in O1, and get a list of keys that all contain that string in the old dict.

The creation of that new dict is O(n) (where n is the number of strings the original dict holds), but since its also the cost of the search you performed, that is still a win, if you need to lookup m values, you has to do O(m*n) before, but after its O(max(m,n))

dictionary = {
'key_1': ['name1', 'name2', 'name3'],
'key_2': ['name4', 'name5', 'name6']
}

new_dict = dict()
for key in dictionary:
    for value in dictionary[key]:
        if value in new_dict:
            new_dict[value].append(key)
        else:
            new_dict[value] = [key]

If you know from the start that each value can only be in one key (in the original dict) there is no need for the lists, but it's nice to have just in case ;)

0

This removes the loop and according to other answers seems to be faster, specially for small dictionaries, but (disclaimer) I did not benchmark or perform any performance testing.

Use next to get the first value that matches, or a default value (None):

mydict = {
    'key_1': ['name1', 'name2', 'name3'],
    'key_2': ['name4', 'name5', 'name6']
}

output_key = next(
    (mykey for mykey, myval in mydict.items() if 'name6' in myval),
    None
)

print(output_key) # key_2
mkrieger1
  • 19,194
  • 5
  • 54
  • 65
evilmandarine
  • 4,241
  • 4
  • 17
  • 40