Data structure to implement a dictionary with multiple indexes?

Question

I am looking for a data structure that holds the same values under two different indexes, where I can access the data by either one.

Example:

x = mysticalDataStructure()
x.add(1,'karl', dog)
x.add(2,'lisa', cat)

$ x[1].age
2
$ x['karl'].age
2
$ x[1].age = 4
$ x['karl'].age
4

Is there anything prerolled, or what is the best approach to roll my own (I need access via an index (number going from 0 to n in increments of 1), and via a string).

collections.ordereddict does not seem to have fast random access via the position, as far as I see I can only walk it with the iterator until I reach element i (I can insert in the right order).

If you change a value through one key, do you expect the new value to be retrieved with the other key? — martineau, Jun 19 '12 at 16:39
See this answer / module: http://stackoverflow.com/questions/11449232/multiple-keys-per-value/16966988#16966988 — formiaczek, Sep 09 '13 at 18:56

Trevor · Answer 1 · 2012-06-19T16:28:52.913

12

Is there a particular reason you can't just use a dictionary:

x = {}
x[1] = x['karl'] = dog
x[2] = x['lisa'] = cat

Then you can access it by either.

If you really don't want to repeat your self you do this:

class MysticalDataStructure(dict):
    def add(self, key1, key2, value):
        return self[key1] = self[key2] = value

x = MysticalDataStructure()
x.add(1, 'karl', dog)
x.add(2, 'lisa', cat)

edited Jun 19 '12 at 16:28

answered Jun 19 '12 at 16:22

Trevor

9,518
2
25
26

This is also nice, using a shared index. – Has QUIT--Anony-Mousse Jun 19 '12 at 16:23
wouldn't modifying x[1] not modify x['karl']? – Hans Z Jun 19 '12 at 16:24
@Hans of course it would be modified, it is the same object - objects are passed as references in python (by value of the object reference - see http://stackoverflow.com/a/986145/1176601) – Aprillion Jun 19 '12 at 16:26
2

x[1] = x['karl'] = 3, x[1] = 2 does not change x['karl'] – Hans Z Jun 19 '12 at 16:28
@Hans i guess this needs to be clarified - any answer including this one will only ever works for **mutable types** of course (3 is an immutable integer literal) - if you need to update 2 immutables of the same value at the same time, the best way is to put the immutable into an object and have 2 references to the 1 mutable object – Aprillion Jun 19 '12 at 16:38
+1 for simpleness (kiss) but astynax solution is better in my particular case (otherwise I could go with yours and wrap all my immutables, but I have mure immutables than mutables so that is not to practical – ted Jun 20 '12 at 09:09
Python uses a Dictionary as a Map. You don't want to do that in real life though :) – NoBugs Jan 28 '14 at 06:26
Brilliant and simple! – Gill Bates Feb 13 '14 at 21:46
Removing a value for all keys from the dict poses a problem. – ed22 Jul 28 '20 at 07:52

Aleksei astynax Pirogov · Accepted Answer · 2012-06-22T11:28:18.473

12

class MultiKeyDict(object):

    def __init__(self, **kwargs):
        self._keys = {}
        self._data = {}
        for k, v in kwargs.iteritems():
            self[k] = v

    def __getitem__(self, key):
        try:
            return self._data[key]
        except KeyError:
            return self._data[self._keys[key]]

    def __setitem__(self, key, val):
        try:
            self._data[self._keys[key]] = val
        except KeyError:
            if isinstance(key, tuple):
               if not key:
                  raise ValueError(u'Empty tuple cannot be used as a key')
               key, other_keys = key[0], key[1:]
            else:
               other_keys = []
            self._data[key] = val
            for k in other_keys:
                self._keys[k] = key

    def add_keys(self, to_key, new_keys):
        if to_key not in self._data:
            to_key = self._keys[to_key]
        for key in new_keys:
            self._keys[key] = to_key


    @classmethod
    def from_dict(cls, dic):
        result = cls()
        for key, val in dic.items():
            result[key] = val
        return result

Usage:

>>> d = MultiKeyDict(a=1, b=2)
>>> d['c', 'd'] = 3 # two keys for one value
>>> print d['c'], d['d']
3 3
>>> d['c'] = 4
>>> print d['d']
4
>>> d.add_keys('d', ('e',))
>>> d['e']
4
>>> d2 = MultiKeyDict.from_dict({ ('a', 'b'): 1 })
>>> d2['a'] = 2
>>> d2['b']
2

edited Jun 22 '12 at 11:28

answered Jun 19 '12 at 17:12

Aleksei astynax Pirogov

2,483
15
19

1

Probably ought to have a `__delitem__()` too. – martineau Jun 19 '12 at 17:27
@martineau, this is only sketch to explain the concept – Aleksei astynax Pirogov Jun 19 '12 at 17:51
perfect, this is pure awesome, while i need to do two list accesses I can also store instances of immutable types. – ted Jun 20 '12 at 09:07
@martineau considering completness, yes, but I won't need that in my case – ted Jun 20 '12 at 09:07
@ted, i slightly modified the class - now changes by one key reflects to the other (value if the same) – Aleksei astynax Pirogov Jun 20 '12 at 10:14
@astynax I did so myself already, I also added a delete function. By the way I do the indexing by keeping an class internal counter, that way I only get collisions when I overflow the integer, and not by chance. For deletion iterate over the keys dict and delete all entries which map to the same key value. (one could keep backreferences but this would be only usefull in case one deletes frequently form huge sets. Thanks for the update. I wont post my code here, its not pythonic enough i look before i leap,... and you already updated it, I came back in the first place to show my modifications. – ted Jun 20 '12 at 14:16
FWIW, here's a couple of optimizations. It's not necessary to call `super(MultiKeyDict, self).__init__()` since it does nothing. Using `uuid4` is overkill and having a minimal `class unique: pass` and calling `unique()` instead works fine since instances of user-defined classes are hashable by default and they all compare unequal -- which is all that's needed for this. – martineau Jun 20 '12 at 16:39
@martineau, thanks for your comment! `super(...).__init__` was removed. But the empty class, as unique value is overkill too, and quite implicit. I just replaced `uuid4()` by `uuid4().hex` - for now the unique key is just a string (with a guaranteed uniqueness) – Aleksei astynax Pirogov Jun 21 '12 at 01:46
@astynax: The empty class instance's `id()` is used as it's hash value, so each one will be unique, too. If you insist on using `uuid4`, it's hash value _is_ its unique value, so there's no advantage to using `uuid4().hex` (or `uuid4().int` for that matter). In any case, using `uuid4()` instances requires more cycles and memory that just using an empty class's hash/id which is just its memory address. `uuid4()` might be need for a persistent class, where long-term uniqueness would be required instead of just run-time. Regardless, nice-answer! – martineau Jun 21 '12 at 02:27
@astynax: Actually one doesn't even need to define an empty class -- just using `object()` instead of `uuid4()` works perfectly. To make it less implicit, it could be aliased by using a different name, like `UniqueID = object` or something along those lines. Of course some comments in the code explaining what was happening would also address the issue...although that's probably considered old-fashioned. – martineau Jun 21 '12 at 11:30
@martineau, uuid removed. The first key is be used a as key for data, and then additional keys will be linked to the first one. `add_keys` method was implemented. – Aleksei astynax Pirogov Jun 22 '12 at 07:27
@astynax: Definitely a favorable change. Makes it more efficient, use less memory, and renders the whole `uuid` discussion irrelevant in the process -- so my original upvote stands ;-). BTW changing the signature of your new method to `def add_keys(self, to_key, *args):` might make using it easier. In addition, allowing one or more arguments along [these lines](http://stackoverflow.com/a/1921146/355230) to be passed to `__init__()` might also be a worthwhile addition. – martineau Jun 22 '12 at 09:45
@martineau, `add_keys` not use the `*args` to visually differentiate the primary key and the additionals. I think, the closure `d.add_keys('a')('b', 'c')` can be used. – Aleksei astynax Pirogov Jun 22 '12 at 11:20
@martineau, `from_dict` class-method can used to construct object from sketch – Aleksei astynax Pirogov Jun 22 '12 at 11:30
@astynax: OK, I now understand why you did it without `*args`. However, am not clear about what you said regarding using a closure -- no big deal. – martineau Jun 22 '12 at 11:32
@astynax: `from_dict()` looks like a good addition. Sorry I can't upvote your answer more than once. – martineau Jun 22 '12 at 11:44

score 1 · Answer 3 · answered Jun 19 '12 at 16:20

Just use three maps.

maps = [dict(), dict(), dict()]

def insert(rec):
   maps[0][rec[0]] = rec
   maps[1][rec[1]] = rec
   maps[2][rec[2]] = rec

Changes to key attributes of the rec object will require reinsertion though. Just like any other map, when you change the key of an object.

The maps just map key -> object, after all. They don't actually store copies of the object (it just isn't garbage collected). So a map is an index, nothing more. If you want three indexes, use three maps. Write a couple of glue code functions to manage them.

As mentioned by Trevor, you can also use a shared dictionary:

index = dict()

def insert(rec):
    index[rec[0]] = rec
    index[rec[1]] = rec
    index[rec[2]] = rec

then you can access it by either.

Beware of key collisions though!

+1 answers my badly formulated example, but you forgot for example that I also have indices wich dont come from the object, e.g. `1` like an id for the `dog` — ted, Jun 20 '12 at 09:11

Data structure to implement a dictionary with multiple indexes?

3 Answers3

Linked

Related