0

I'd like to use instances of any type as a key in a single dict.

def add_to_dict(my_object, d, arbitrary_val = '123'):
    d[ id(my_object) ] = arbitrary_val

d = {}
add_to_dict('my_str', arbitrary_val)
add_to_dict(my_list, arbitrary_val)
add_to_dict(my_int, arbirtray_val)

my_object = myclass()
my_object.__hash__ = None
add_to_dict(my_object, arbitrary_val)

The above won't work because my_list and my_object can't be hashed.

My first thought was to just pass in the id value of the object using the id() function.

def add_to_dict(my_object, d, arbitrary_val = '123'):
    d[ id(my_object) ] = arbitrary_val

However, that won't work because id('some string') == id('some string') is not guaranteed to always be True.

My second thought was to test if the object has the __hash__ attribute. If it does, use the object, otherwise, use the id() value.

def add_to_dict(my_object, d, arbitrary_val = '123'):
    d[ my_object if my_object.__hash__ else id(my_object) ] = arbitrary_val

However, since hash() and id() both return int's, I believe I will eventually get a collision.

How can I write add_to_dict(obj, d) above to ensure that no matter what obj is (list, int, str, object, dict), it will correctly set the item in the dictionary and do so without collision?

martineau
  • 119,623
  • 25
  • 170
  • 301
Jesse Hogan
  • 3,075
  • 2
  • 16
  • 17
  • I'm not sure that it's a good idea. mutable objects _could_ be hashable, but the hash doesn't change, whereas the object _can_ change. for lists, it's very easy: convert to `tuple` and you're good to go. – Jean-François Fabre Sep 02 '17 at 20:01
  • read: https://stackoverflow.com/questions/24217647/why-must-dictionary-keys-be-immutable – Jean-François Fabre Sep 02 '17 at 20:03
  • also: https://stackoverflow.com/questions/4418741/im-able-to-use-a-mutable-object-as-a-dictionary-key-in-python-is-this-not-disa – Jean-François Fabre Sep 02 '17 at 20:04
  • one implementation here: http://code.activestate.com/recipes/578096-a-mutablemapping-that-can-use-unhashable-objects-a/ , but as predicted, it's very dangerous – Jean-François Fabre Sep 02 '17 at 20:06
  • 2
    Any hashable type can be a dict key already ... but if it's not hashable, then a dict makes absolutely no sense. Even if it "works" (air quotes), it won't work like a dict should. – Kenny Ostrom Sep 02 '17 at 20:07

2 Answers2

1

We could make some kind of dictionary that allows us to insert mutable objects as well:

class DictionaryMutable:

    nullobject = object()

    def __init__(self):
        self._inner_dic = {}
        self._inner_list = []

    def __getitem__(self, name):
        try:
            return self._inner_dic[name]
        except TypeError:
            for key, val in self._inner_list:
                if name == key:
                    return val
            raise KeyError(name)

    def __setitem__(self, name, value):
        try:
            self._inner_dic[name] = value
        except TypeError:
            for elm in self._inner_list:
                if name == elm[0]:
                    elm[1] = value
                    break
            else:
                self._inner_list.append([name,value])

    # ...

This works as follows: the DictionaryMutable consists out of a dictionary and a list. The dictionary contains the hashable immutable keys, the list contains sublists where each sublist contains two elements: a key and a value.

For each lookup we first attempt to perform a lookup on the dictionary, in case the key name is unhashable, a TypeError will be thrown. In that case we iterate through the list, check if one of the keys matches and return the corresponding value if it does. If no such element exists, we raise a KeyError.

Setting elements works approximately the same way: first we attempt to set the element in the dictionary. If it turns out the key is unhashable, we search linearly through the list and aim to add the element. If that fails, we add it at the end of the list.

This implementation has some major disadvantages:

  • if the dictionary lookup fails due to the key being unhashable, we will perform linear lookup, this can siginificantly slow down the lookup; and
  • if you alter an object that is in the dictionary, then the key will be updated, and thus a search for that object will fail. It thus can result in some unpredicted behavior.

This is only a basic implementation. For instance __iter__, etc. need to be implemented as well.

Willem Van Onsem
  • 443,496
  • 30
  • 428
  • 555
1

Instead of the id() of the object, you could use the pickled byte stream representation of the object pickle.dumps() returns for it. pickle works with most built-in types, and there are ways to extend it to work with most values it doesn't know how to do automatically.

Note: I used the repr() of the object as its "arbitrary value" in an effort to make it easier to identify them in the output displayed.

try:
    import cpickle as pickle
except ModuleNotFoundError:
    import pickle
from pprint import pprint

def add_to_dict(d, obj, arbitrary_val='123'):
    d[pickle.dumps(obj)] = arbitrary_val

class MyClass: pass

my_string = 'spam'
my_list = [13, 'a']
my_int = 42
my_instance = MyClass()

d = {}
add_to_dict(d, my_string, repr(my_string))
add_to_dict(d, my_list, repr(my_list))
add_to_dict(d, my_int, repr(my_int))
add_to_dict(d, my_instance, repr(my_instance))

pprint(d)

Output:

{b'\x80\x03K*.': '42',
 b'\x80\x03X\x04\x00\x00\x00spamq\x00.': "'spam'",
 b'\x80\x03]q\x00(K\rX\x01\x00\x00\x00aq\x01e.': "[13, 'a']",
 b'\x80\x03c__main__\nMyClass\nq\x00)\x81q\x01.': '<__main__.MyClass object at '
                                                  '0x021C1630>'}
martineau
  • 119,623
  • 25
  • 170
  • 301