19

I have a project of converting one database to another. One of the original database columns defines the row's category. This column should be mapped to a new category in the new database.

For example, let's assume the original categories are: parrot, spam, cheese_shop, Cleese, Gilliam, Palin.

Now that's a little verbose for me, and I want to have these rows categorized as sketch or actor; that is, define all the sketches and all the actors as two equivalence classes.

>>> monty={'parrot':'sketch', 'spam':'sketch', 'cheese_shop':'sketch', 
'Cleese':'actor', 'Gilliam':'actor', 'Palin':'actor'}
>>> monty
{'Gilliam': 'actor', 'Cleese': 'actor', 'parrot': 'sketch', 'spam': 'sketch', 
'Palin': 'actor', 'cheese_shop': 'sketch'}

That's quite awkward. I would prefer having something like:

monty={ ('parrot','spam','cheese_shop'): 'sketch', 
        ('Cleese', 'Gilliam', 'Palin') : 'actors'}

But this, of course, sets the entire tuple as a key:

>>> monty['parrot']

Traceback (most recent call last):
  File "<pyshell#29>", line 1, in <module>
    monty['parrot']
KeyError: 'parrot'

Any ideas how to create an elegant many-to-one dictionary in Python?

martineau
Adam Matan

4 Answers

18

It seems to me that you have two concerns. First, how do you express your mapping originally, that is, how do you type the mapping into your new_mapping.py file. Second, how does the mapping work during the re-mapping process. There's no reason for these two representations to be the same.

Start with the mapping you like:

monty = { 
    ('parrot','spam','cheese_shop'): 'sketch', 
    ('Cleese', 'Gilliam', 'Palin') : 'actors',
}

then convert it into the mapping you need:

working_monty = {}
for k, v in monty.items():
    for key in k:
        working_monty[key] = v

producing:

{'Gilliam': 'actors', 'Cleese': 'actors', 'parrot': 'sketch', 'spam': 'sketch', 'Palin': 'actors', 'cheese_shop': 'sketch'}

then use working_monty to do the work.
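If the conversion is needed in more than one place, it could be wrapped in a small helper. This is only a sketch: the name `expand_groups` and the duplicate-key check are my own additions, not part of the answer above. The check assumes that a key appearing in two groups is a mistake you would want to hear about rather than have silently overwritten.

```python
def expand_groups(grouped):
    """Flatten {(key1, key2, ...): value} into {key1: value, key2: value, ...}."""
    flat = {}
    for keys, value in grouped.items():
        for key in keys:
            # Guard against a key listed in two groups, which would
            # otherwise be silently overwritten by the later group.
            if key in flat:
                raise ValueError(f"duplicate key: {key!r}")
            flat[key] = value
    return flat


monty = {
    ('parrot', 'spam', 'cheese_shop'): 'sketch',
    ('Cleese', 'Gilliam', 'Palin'): 'actors',
}
working_monty = expand_groups(monty)
print(working_monty['parrot'])  # sketch
```

The grouped form stays the one you edit by hand, and the flat form is rebuilt from it whenever the script starts.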

Ned Batchelder
  • +1 Thanks a lot. I assume there's no python native type for this job; Do you think there should be one? – Adam Matan Dec 17 '09 at 11:38
  • can't we have some reference as the value in the (key, value) pair rather than storing the actual string? Since the no. of keys are significantly larger than the no. of values, this would save a lot of space. Is there a way to do this? – ishan3243 Mar 25 '14 at 18:29
  • Old question, but regarding @ishan3243's observation, I am pretty sure Python will intern these strings, since they are being defined explicitly as constants. Furthermore, even if the values are read in at run-time, because of how this code loops and assigns the same variable to each index, it should cause string interning. – Spencer D Dec 30 '18 at 07:11
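Regarding the space question in the comments above, a quick check suggests the values are already shared. Whether or not interning is involved, the conversion loop assigns the same value object to every key in a group, so each key holds a reference to it rather than a copy. A minimal sketch, reusing the names from the answer:

```python
monty = {
    ('parrot', 'spam', 'cheese_shop'): 'sketch',
    ('Cleese', 'Gilliam', 'Palin'): 'actors',
}

working_monty = {}
for k, v in monty.items():
    for key in k:
        working_monty[key] = v

# Keys from the same group point at the very same string object.
print(working_monty['parrot'] is working_monty['spam'])  # True
```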
5

You could override dict's indexer, but perhaps the following simpler solution would be better:

>>> assoc_list = ( (('parrot','spam','cheese_shop'), 'sketch'), (('Cleese', 'Gilliam', 'Palin'), 'actors') )
>>> equiv_dict = dict()
>>> for keys, value in assoc_list:
    for key in keys:
        equiv_dict[key] = value


>>> equiv_dict['parrot']
'sketch'
>>> equiv_dict['spam']
'sketch'

(Perhaps the nested for loop can be compressed into an impressive one-liner, but this works and is readable.)
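For what it's worth, one way such a one-liner might look is a dict comprehension with two `for` clauses (this needs Python 2.7 or later). It is a sketch of the same loop, not a different technique:

```python
assoc_list = (
    (('parrot', 'spam', 'cheese_shop'), 'sketch'),
    (('Cleese', 'Gilliam', 'Palin'), 'actors'),
)

# The outer clause walks the (keys, value) pairs, the inner one walks each key.
equiv_dict = {key: value for keys, value in assoc_list for key in keys}

print(equiv_dict['parrot'])  # sketch
```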

Vladimir Gritsenko
3
>>> monty={ ('parrot','spam','cheese_shop'): 'sketch', 
        ('Cleese', 'Gilliam', 'Palin') : 'actors'}

>>> item=lambda x:[z for y,z in monty.items() if x in y][0]
>>>
>>> item("parrot")
'sketch'
>>> item("Cleese")
'actors'

But let me tell you, it will be slower than a normal one-to-one dictionary, since every lookup scans all the tuple keys.
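A variant of the same idea, sketched as a plain function with a generator expression so that the scan stops at the first matching group instead of building a whole list. Raising `KeyError` for an unknown name is my own choice here (the lambda above would raise `IndexError` instead); it is still a linear scan over the groups.

```python
monty = {
    ('parrot', 'spam', 'cheese_shop'): 'sketch',
    ('Cleese', 'Gilliam', 'Palin'): 'actors',
}


def item(x):
    """Return the value of the first group that contains x."""
    try:
        # next() pulls only the first match, so later groups are never scanned.
        return next(value for keys, value in monty.items() if x in keys)
    except StopIteration:
        # Assumption: mimic dict behaviour for a missing key.
        raise KeyError(x) from None


print(item('parrot'))  # sketch
print(item('Cleese'))  # actors
```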

YOU
  • Slow-ish, but on the plus side doesn't require a persistent secondary data structure. Could be sped up to a certain degree by not being written as a lambda and using a list comprehension. – martineau Jun 24 '12 at 01:52
2

If you want to have multiple keys pointing to the same value, i.e.

m_dictionary = {('k1', 'k2', 'k3', 'k4'): 1, ('k5', 'k6'): 2}

and access them as

`print(m_dictionary['k1'])` ==> `1`,

check out the multi_key_dict Python module. Install and import it: https://pypi.python.org/pypi/multi_key_dict

psun