
I have a list of terms and a dictionary, and I want to use the list elements as keys to look up values in the dictionary. Let's say:

d = {'400cm': 'John Doe', 'Go - Pro': 'Mary Smith'}
l = ['400 cm', 'Go-Pro']

I am wondering how I could account for cases where a key has a slightly different form from the corresponding list element. For example:

key = '400cm' 
list_term = '400 cm'

or

key = 'Go - Pro'
list_term = 'Go-Pro'

Basically, how can I match a list term to a key when the two are the same at the token level but differ in spacing?

Paschalis
  • You can't. You will need to sanitize your terms first. – drum Jun 15 '21 at 22:05
  • You cannot. I mean, you could come up with some way to match two terms, and then iterate through your dictionary and try to match the keys, but that defeats the whole point of a dictionary. Probably, you want to take a step back and re-think your approach, try to normalize this input somehow – juanpa.arrivillaga Jun 15 '21 at 22:06
  • Here’s some discussion of how to make dictionary lookup case-insensitive. Not trivial in the general case but you may be able to get something usable for specific/narrower use-cases. https://stackoverflow.com/questions/2082152/case-insensitive-dictionary#32888599 – DisappointedByUnaccountableMod Jun 15 '21 at 22:24

1 Answer


You need to work with a canonical form for your keys. For example, you could always remove whitespace from a key before storing or looking up an element in the dictionary. You could also convert every key to lowercase, and apply any other normalization your data needs.
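
As a minimal sketch of that idea, using the `d` and `l` from the question: the helper name `canonical` and the exact rules (strip all whitespace, then lowercase) are assumptions for illustration, not the only reasonable choice.

```python
import re

def canonical(term: str) -> str:
    """Hypothetical helper: remove all whitespace, then lowercase."""
    return re.sub(r"\s+", "", term).lower()

d = {'400cm': 'John Doe', 'Go - Pro': 'Mary Smith'}
l = ['400 cm', 'Go-Pro']

# Re-key the dictionary once, so every stored key is in canonical form.
normalized = {canonical(key): value for key, value in d.items()}

# Look up each list term through the same canonical form.
matches = {term: normalized.get(canonical(term)) for term in l}
print(matches)
```

Note that if two original keys normalize to the same string, the later one silently overwrites the earlier one in `normalized`, so it is worth checking for collisions on real data.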

But once a key has been stored, a dictionary has no intrinsic method of finding "similar" keys (however you would define that); a lookup only succeeds on an exactly equal key.
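
If normalization is not possible, the only remaining option is the one mentioned in the comments: scan the keys yourself with some similarity measure, which gives up the constant-time lookup of a dictionary. A rough sketch using the standard library's `difflib.get_close_matches` follows; the helper name `closest_key` and the `0.8` cutoff are assumptions chosen for this example and would need tuning on real data.

```python
import difflib

d = {'400cm': 'John Doe', 'Go - Pro': 'Mary Smith'}

def closest_key(term, mapping, cutoff=0.8):
    """Hypothetical helper: return the most similar key, or None.

    This compares the term against every key, so each lookup is linear
    in the number of keys rather than constant time.
    """
    found = difflib.get_close_matches(term, list(mapping), n=1, cutoff=cutoff)
    return found[0] if found else None

for term in ['400 cm', 'Go-Pro', 'tripod']:
    key = closest_key(term, d)
    # Fall back to None when no key is similar enough.
    print(term, '->', d[key] if key is not None else None)
```

Fuzzy matching of this kind can return false positives for short or near-identical keys, so a canonical form is usually the safer choice when you control how keys are created.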

ypnos