130

Python dict is a very useful data-structure:

d = {'a': 1, 'b': 2}

d['a'] # get 1

Sometimes you'd also like to index by values.

d[1] # get 'a'

Which is the most efficient way to implement this data-structure? Any official recommend way to do it?

martineau
  • 119,623
  • 25
  • 170
  • 301
Juanjo Conti
  • 28,823
  • 42
  • 111
  • 133
  • If you prefer, we can assume that values are immutable as well as keys are. – Juanjo Conti Jul 23 '10 at 13:39
  • 6
    What would you return for this dict: {'a' : 1, 'b': 2, 'A' : 1 } – PaulMcG Jul 23 '10 at 16:49
  • 5
    @PaulMcGuire : I would return `{1: ['a', 'A'], 2: 'b'}`. See my answer for such a way to do it. – Basj Feb 19 '14 at 22:41
  • 4
    Note to moderator: this is **not** a duplicate of https://stackoverflow.com/questions/1456373/two-way-reverse-map. The latter has 1) very vague wording 2) no MCVE 3) only deals with the case of the bijective map (see first comment in this question), which is a lot more restrictive than this actual question, which is more general. So I think marking it as duplicate is here, in this particular case, misleading. If really one should be a duplicate of another, it should be the contrary as this one here covers the general case whereas the other (see answers) doesn't cover the non-bijective case. – Basj Jun 19 '18 at 09:13
  • @Basj It should return `{1: {'a', 'A'}, 2: 'b'}` – endolith Feb 16 '21 at 01:41
  • 2
    This question is more than ten years old, but I am reading it for the first time now. You might find inspiration in the Java library Google Guava. They have a class `HashBiMap` that is worth reading. (I assume a similar structure could be implemented in Python.) The documentation clearly outlines edge cases and how they are handled. Ref: https://github.com/google/guava/blob/master/guava/src/com/google/common/collect/HashBiMap.java – kevinarpe Apr 27 '21 at 05:59

8 Answers8

100

Here is a class for a bidirectional dict, inspired by Finding key from value in Python dictionary and modified to allow the following 2) and 3).

Note that :

    1. The inverse directory bd.inverse auto-updates itself when the standard dict bd is modified.
    1. The inverse directory bd.inverse[value] is always a list of key such that bd[key] == value.
    1. Unlike the bidict module from https://pypi.python.org/pypi/bidict, here we can have 2 keys having same value, this is very important.

Code:

class bidict(dict):
    def __init__(self, *args, **kwargs):
        super(bidict, self).__init__(*args, **kwargs)
        self.inverse = {}
        for key, value in self.items():
            self.inverse.setdefault(value, []).append(key) 

    def __setitem__(self, key, value):
        if key in self:
            self.inverse[self[key]].remove(key) 
        super(bidict, self).__setitem__(key, value)
        self.inverse.setdefault(value, []).append(key)        

    def __delitem__(self, key):
        self.inverse.setdefault(self[key], []).remove(key)
        if self[key] in self.inverse and not self.inverse[self[key]]: 
            del self.inverse[self[key]]
        super(bidict, self).__delitem__(key)

Usage example:

bd = bidict({'a': 1, 'b': 2})  
print(bd)                     # {'a': 1, 'b': 2}                 
print(bd.inverse)             # {1: ['a'], 2: ['b']}
bd['c'] = 1                   # Now two keys have the same value (= 1)
print(bd)                     # {'a': 1, 'c': 1, 'b': 2}
print(bd.inverse)             # {1: ['a', 'c'], 2: ['b']}
del bd['c']
print(bd)                     # {'a': 1, 'b': 2}
print(bd.inverse)             # {1: ['a'], 2: ['b']}
del bd['a']
print(bd)                     # {'b': 2}
print(bd.inverse)             # {2: ['b']}
bd['b'] = 3
print(bd)                     # {'b': 3}
print(bd.inverse)             # {2: [], 3: ['b']}
Basj
  • 41,386
  • 99
  • 383
  • 673
  • 3
    Very neat solution of the ambiguous case! – Tobias Kienzler Feb 20 '14 at 11:47
  • 2
    I think this data structure is very useful in many practical problems. – 0xc0de Jul 23 '15 at 16:08
  • 9
    **This is phenomenal.** It's succinct; it's self-documenting; it's reasonably efficient; it just works. My only quibble would be to optimize the repeated lookups of `self[key]` in `__delitem__()` with a single `value = self[key]` assignment reused for such lookups. But... _yeah._ That's negligible. Thanks for the pure awesome, [Basj](https://stackoverflow.com/users/1422096/basj)! – Cecil Curry Jun 28 '16 at 02:18
  • 1
    How about a Python 3 version? – zelusp Sep 14 '16 at 20:01
  • Have you tried this code in Py3 yet, @zelusp ? I didn't spot any issues on a quick scan through it. – The Nate Oct 19 '16 at 22:53
  • I did @TheNate. I get error `AttributeError: 'bidict' object has no attribute 'iteritems'` when running `bd = bidict({'a': 1, 'b': 2})` – zelusp Oct 20 '16 at 01:13
  • 2
    Ah. Right. Try it without "iter" and it should work. – The Nate Oct 20 '16 at 01:40
  • 1
    I like this answer for the example. The accepted answer is correct still and I think the accepted answer should stay as the accepted answer, but this is a little bit more explicit for defining it yourself, merely because it clearly lays out that in order to reverse the dictionary you must place the reverse's values into a list since there cannot be a one-to-one mapping because a dictionary has a one-to-many relationship with key-to-values. – searchengine27 Feb 17 '17 at 19:20
  • Can this be packaged into bidict? – CMCDragonkai May 14 '19 at 06:52
  • 1
    this doesn't support many to many in both directions. bd = bidict({'a': [1,2,3], 'b': [2,3,4]}) errors about unhashable type. – Stephen Eckels Dec 17 '19 at 20:47
  • This breaks when using `bd.pop(some_key, None)` – rwols Mar 14 '20 at 14:34
  • I wrote my own BiDict based on what i've seen here and without {key -> key} issue. Also my implementation has same methods and behavior, that you expect from a dict. Please ping me here if you find some uncovered cases: https://gist.github.com/titovanton/2225805da5ccdd788349bbf7f1eef1a8 – AntonTitov Sep 29 '20 at 10:36
  • Will this get inefficient if we have a large number of keys with the same value then set some of these to different values? Would replacing the lists in inverse with sets solve this? – Sideshow Bob Nov 17 '20 at 12:36
  • Nice solution! I was just wondering if it is really desiderable to possibly have an empty list as value in the inverse dict after a key in the original dict changed its value ( I am referring to the last three lines in the usage example here). Would'nt it be better to check for that in __setitem__? `if self.inverse[self[key]] == []: del self.inverse[self[key]]` – Ch L Nov 18 '20 at 13:16
  • I would use sets instead of lists in the inverse dictionary. – nnov Mar 19 '22 at 00:48
55

You can use the same dict itself by adding key,value pair in reverse order.

d={'a':1,'b':2}
revd=dict([reversed(i) for i in d.items()])
d.update(revd)
Emil
  • 13,577
  • 18
  • 69
  • 108
  • 6
    +1 A nice, practical solution. Another way to write it: `d.update( dict((d[k], k) for k in d) )`. – FMc Jul 23 '10 at 20:04
  • 4
    +1 For neat use of reversed(). I'm undecided if it's more readable than the explicit `dict((v, k) for (k, v) in d.items())`. In any case, you can pass pairs directly to .update: `d.update(reversed(i) for i in d.items())`. – Beni Cherniavsky-Paskin Aug 14 '12 at 10:50
  • 29
    Note this fails e.g. for `d={'a':1, 'b':2, 1: 'b'}` – Tobias Kienzler May 29 '13 at 07:38
  • No, it doesn't for me. – lifebalance Mar 29 '15 at 15:24
  • 4
    Slight modification: `dict(map(reversed, a_dict.items()))`. – 0xc0de Jul 23 '15 at 15:55
  • 2
    This fails for `d={'a':1,'b':2, 'c': 1}`. – Basj Jan 27 '16 at 14:44
  • @0xc0de Nope, afterwards `d[1] = 'a'` overwrites the original `d[1] = 'b'` – Tobias Kienzler Mar 10 '16 at 06:48
  • 20
    **Adding reverse mappings to the original dictionary is an awful idea.** As the above comments demonstrate, doing so is _not_ safe in the general case. Just maintain two separate dictionaries. Since the first two lines of this answer ignoring the trailing `d.update(revd)` are great, however, I'm still contemplating an upvote. _Let's give this some thought._ – Cecil Curry Jun 28 '16 at 02:10
  • 2
    the fact that it fails in the above cases is not related to the implementation. It is implicit that in a bidirectional structure as described in question, you cannot have duplicate values/keys (as they are accessed in same way). Keeping two dictionaries (in any form) corresponds to the limitation of not having duplicate values or duplicate keys (but you can have same values in both keys and values). In this case `={'a':1, 'b':2, 1: 'b'}` succeeds, while `d={'a':1,'b':2, 'c': 1}` still fails, because not a matter of implementation. – Vincenzooo Jan 08 '20 at 11:12
  • In many cases, keys and values reside in different namespaces. Then, this is a good solution. One just has to be aware of that. – Torsten Bronger Jun 03 '20 at 05:15
  • 1
    Idea is awful ONLY IF your dictionary has such cases. If one is looking for such data structure s/he already knows the limitations and the data probably doesn't have such cases. Searching for perfection will only waste your time if such simple solutions suffice. – alercelik Feb 11 '21 at 11:55
  • 1
    A clearer way of writing this would use a dictionary comprehension: `revd = {j: i for i, j in d.items()}` – Pradhyum R Jun 08 '21 at 01:56
44

A poor man's bidirectional hash table would be to use just two dictionaries (these are highly tuned datastructures already).

There is also a bidict package on the index:

The source for bidict can be found on github:

mlissner
  • 17,359
  • 18
  • 106
  • 169
miku
  • 181,842
  • 47
  • 306
  • 310
  • 1
    2 dicts requires double inserts and deletes. – Juanjo Conti Jul 23 '10 at 13:46
  • @Robuts, didn't understand you. – Juanjo Conti Jul 23 '10 at 13:46
  • 15
    @Juanjo: nearly any bidirectional/reversible hash table will involve "double inserts and deletes", either as part of implementing the structure, or as part of using it. Keeping two indexes is really the only fast way to do it, AFAIK. – Walter Mundt Jul 23 '10 at 13:53
  • 8
    Of course; I meant that take care of the 2 index by hand is the problem. – Juanjo Conti Jul 23 '10 at 14:21
  • No, `bidict` is not really usable : Try : `from bidict import bidict` `mybidict = bidict({'a': 'value1', 'b': 'value1'})` `print mybidict['a']` => this leads to a `KeyError` because duplicate values are not accepted :( – Basj Feb 19 '14 at 22:06
  • 2
    @Basj I think it's correct that it's not accepted since having more than one value means it's not a bijection anymore and is ambiguous for the reverse lookup. – user193130 Dec 11 '14 at 05:26
  • I don't really agree @user193130 : I think limiting the usage of bidict to injections and bijections is a bit too restrictive... Here in my previous example, using math notations, we could see `f^{-1}('value1')` as the set `{a, b}`. – Basj Dec 11 '14 at 08:00
  • 1
    @Basj Well, I can understand that there would be use cases which would be useful to have more than one value per key, so maybe this type of data structure should exist as a subclass of bidict. However, since a normal dict maps to a single object, I think it makes much more sense for the reverse to be the same as well. (Just to clarify, although the value can be a collection too, I meant that the key of the first dict should be of the same type as the value of the reverse dict) – user193130 Dec 11 '14 at 16:48
  • the link is outdated :( – tscizzle Feb 05 '15 at 21:53
  • @tscizzle https://pypi.python.org/pypi/bidict should work as should `pip install bidict`, https://github.com/jab/bidict, and https://bidict.readthedocs.org. – user161642 Mar 04 '15 at 06:09
  • A one to one datastructure or (bimap) would be as described here: https://en.wikipedia.org/wiki/Bidirectional_map Hence a 1 value maps to a key. And 1 key maps to a value. Making this less strict, one can extend this to allow multiple keys pointing to 1 value, but 1 value can only point to 1 key. And then loosening up even more would allow multikeys to multivalues. An example of the latter is to use 2 dicts mapping keys/values to sets of values/keys. – CMCDragonkai Dec 13 '17 at 06:18
  • This is what I did. I have no inserts or deletes, so it's easy enough to make two. – Roko Mijic Apr 17 '20 at 09:38
8

The below snippet of code implements an invertible (bijective) map:

class BijectionError(Exception):
    """Must set a unique value in a BijectiveMap."""

    def __init__(self, value):
        self.value = value
        msg = 'The value "{}" is already in the mapping.'
        super().__init__(msg.format(value))


class BijectiveMap(dict):
    """Invertible map."""

    def __init__(self, inverse=None):
        if inverse is None:
            inverse = self.__class__(inverse=self)
        self.inverse = inverse

    def __setitem__(self, key, value):
        if value in self.inverse:
            raise BijectionError(value)

        self.inverse._set_item(value, key)
        self._set_item(key, value)

    def __delitem__(self, key):
        self.inverse._del_item(self[key])
        self._del_item(key)

    def _del_item(self, key):
        super().__delitem__(key)

    def _set_item(self, key, value):
        super().__setitem__(key, value)

The advantage of this implementation is that the inverse attribute of a BijectiveMap is again a BijectiveMap. Therefore you can do things like:

>>> foo = BijectiveMap()
>>> foo['steve'] = 42
>>> foo.inverse
{42: 'steve'}
>>> foo.inverse.inverse
{'steve': 42}
>>> foo.inverse.inverse is foo
True
jme
  • 19,895
  • 6
  • 41
  • 39
1

Something like this, maybe:

import itertools

class BidirDict(dict):
    def __init__(self, iterable=(), **kwargs):
        self.update(iterable, **kwargs)
    def update(self, iterable=(), **kwargs):
        if hasattr(iterable, 'iteritems'):
            iterable = iterable.iteritems()
        for (key, value) in itertools.chain(iterable, kwargs.iteritems()):
            self[key] = value
    def __setitem__(self, key, value):
        if key in self:
            del self[key]
        if value in self:
            del self[value]
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)
    def __delitem__(self, key):
        value = self[key]
        dict.__delitem__(self, key)
        dict.__delitem__(self, value)
    def __repr__(self):
        return '%s(%s)' % (type(self).__name__, dict.__repr__(self))

You have to decide what you want to happen if more than one key has a given value; the bidirectionality of a given pair could easily be clobbered by some later pair you inserted. I implemented one possible choice.


Example :

bd = BidirDict({'a': 'myvalue1', 'b': 'myvalue2', 'c': 'myvalue2'})
print bd['myvalue1']   # a
print bd['myvalue2']   # b        
Basj
  • 41,386
  • 99
  • 383
  • 673
Matt Anderson
  • 19,311
  • 11
  • 41
  • 57
  • 1
    I'm not sure if this is a problem, but using the above implementation, wouldn't there be issues if the keys and values overlapped? So `dict([('a', 'b'), ('b', 'c')]); dict['b']` -> `'c'` instead of the key `'a'`. – tgray Jul 23 '10 at 15:14
  • 1
    It's not an issue for the OP's example, but might be a good disclaimer to include. – tgray Jul 23 '10 at 15:17
  • How can we do that `print bd['myvalue2']` answers `b, c` (or `[b, c]`, or `(b, c)`, or anything else) ? – Basj Feb 19 '14 at 17:39
1

First, you have to make sure the key to value mapping is one to one, otherwise, it is not possible to build a bidirectional map.

Second, how large is the dataset? If there is not much data, just use 2 separate maps, and update both of them when updating. Or better, use an existing solution like Bidict, which is just a wrapper of 2 dicts, with updating/deletion built in.

But if the dataset is large, and maintaining 2 dicts is not desirable:

  • If both key and value are numeric, consider the possibility of using Interpolation to approximate the mapping. If the vast majority of the key-value pairs can be covered by the mapping function (and its
    reverse function), then you only need to record the outliers in maps.

  • If most of access is uni-directional (key->value), then it is totally ok to build the reverse map incrementally, to trade time for
    space.

Code:

d = {1: "one", 2: "two" }
reverse = {}

def get_key_by_value(v):
    if v not in reverse:
        for _k, _v in d.items():
           if _v == v:
               reverse[_v] = _k
               break
    return reverse[v]
NeoWang
  • 17,361
  • 24
  • 78
  • 126
1

a better way is convert the dictionary to a list of tuples then sort on a specific tuple field

def convert_to_list(dictionary):
    list_of_tuples = []
    for key, value in dictionary.items():
        list_of_tuples.append((key, value))
    return list_of_tuples

def sort_list(list_of_tuples, field):
     return sorted(list_of_tuples, key=lambda x: x[field])

dictionary = {'a': 9, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
list_of_tuples = convert_to_list(dictionary)
print(sort_list(list_of_tuples, 1))

output

[('b', 2), ('c', 3), ('d', 4), ('e', 5), ('a', 9)]
Golden Lion
  • 3,840
  • 2
  • 26
  • 35
0

Unfortunately, the highest rated answer, bidict does not work.

There are three options:

  1. Subclass dict: You can create a subclass of dict, but beware. You need to write custom implementations ofupdate, pop, initializer, setdefault. The dict implementations do not call __setitem__. This is why the highest rated answer has issues.

  2. Inherit from UserDict: This is just like a dict, except all the routines are made to call correctly. It uses a dict under the hood, in an item called data. You can read the Python Documentation, or use a simple implementation of a by directional list that works in Python 3. Sorry for not including it verbatim: I'm unsure of its copyright.

  3. Inherit from Abstract Base Classes: Inheriting from collections.abc will help you get all the correct protocols and implementations for a new class. This is overkill for a bidirectional dictionary, unless it can also encrypt and cache to a database.

TL;DR -- Use this for your code. Read Trey Hunner's article for details.

Charles Merriam
  • 19,908
  • 6
  • 73
  • 83
  • 1
    Nice article. Still, what exactly does not work with `bidict`? – tejasvi88 Jul 11 '21 at 12:54
  • What did not work two years ago may or may not work now. – Charles Merriam Jul 13 '21 at 17:45
  • 2
    It did work 2 years ago, it worked with Python 2.x, and it works with Python 3.10 now. Rather than distract with a mistake from whatever testing you did in 2020, it'd be better to focus on what your advice offers which is different since the mention of `UserDict`, etc. is useful and might point someone with more advanced needs in the right direction. – Chris Adams Apr 26 '22 at 15:00