I am writing a class that represents a hashable dictionary in Python, following the suggestion given here: hashable dict. I need the dictionary to be hashable because other parts of my code rely on it.

So I basically created my nice HashableDict class:

class HashableBinnerDict(dict[KEY, VALUE]):
    """
    Class that represents a hashable dict.
    """

    def __hash__(self) -> int:
        return hash(frozenset(self))

It inherits from dict; KEY and VALUE are two type variables used to parametrize the typing of my HashableDict.

The code that uses this class works perfectly. However, mypy complains with this error:

error: Signature of "__hash__" incompatible with supertype "dict"  [override]

I guess this is because the built-in dict class does not implement __hash__. In fact, we have (extracted from the dict class in the Python codebase):

class dict(object):
    ...
    ... # skipping non relevant code

    def __sizeof__(self): # real signature unknown; restored from __doc__
        """ D.__sizeof__() -> size of D in memory, in bytes """
        pass

    __hash__ = None

The __hash__ attribute is defined as None, so I guess mypy is complaining because of that. Any idea how to solve this other than simply ignoring the error?

Mattia Surricchio
  • May I ask, why `def __hash__(self) -> Any`? `hash` is supposed to return an integer. Answer to your main question - no, you cannot, because parent class has `__hash__ = None` explicitly declared, and you assign a callable instead. It is definitely not a type problem, because nobody, I hope, will access `dict.__hash__` to get `None` (or will be punished for this weird thing) - so you can safely use an ignore comment. – STerliakov Jul 22 '22 at 19:52
  • (You're aware of bad consequences of hashable mutable containers, right?) – STerliakov Jul 22 '22 at 19:53
  • @SUTerliakov I am somewhat aware that this might not be the best practice, but if you can show me which could be the bad consequence, I would be more than happy :) – Mattia Surricchio Jul 22 '22 at 21:41
  • @SUTerliakov Yeah, the return type is a copy-paste mistake, I'm going to fix it – Mattia Surricchio Jul 22 '22 at 21:42
  • The simplest bad consequence: https://gist.github.com/sterliakov/042bc5c3598217add267a5de497ba2b4 – STerliakov Jul 23 '22 at 00:14
  • Understood! However, the hashable dict type I defined is going to be an attribute of a frozen dataclass (so I assumed it would be effectively immutable). Does this limit the problem? Is it OK in this case to use such a hashing "trick" on the dictionary? – Mattia Surricchio Jul 23 '22 at 08:17
  • If it is supposed to be immutable, I'd suggest overriding `__setitem__`, `__delitem__`, `pop` and other modifying methods to raise an exception - this will help to avoid accidental mistakes. But yes, an *immutable* hashable mapping is absolutely fine, like `tuple` vs `list`. – STerliakov Jul 23 '22 at 16:47
  • Interesting suggestions! If you want, you can add your comments as answer and I will accept it – Mattia Surricchio Jul 23 '22 at 19:52
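
The pitfall raised in the comments above can be sketched as follows. This is only a minimal illustration, assuming a hypothetical `HashableDict` that uses the same key-based hash as the class in the question. Once such an object is stored in a set (or used as a dict key) and then mutated, its hash no longer matches the one it was stored under, so the lookup is expected to fail:

```python
class HashableDict(dict):
    """Hypothetical stand-in for the class in the question."""

    def __hash__(self) -> int:  # type: ignore[override]
        # Same idea as in the question: hash the set of keys.
        return hash(frozenset(self))


d = HashableDict(a=1)
seen = {d}  # d is stored under the hash of its current keys

d["b"] = 2  # mutating d changes the value returned by hash(d)

# The set still holds d, but the lookup now uses the new hash,
# so membership is expected to fail (barring a hash collision).
found_after_mutation = d in seen
print(found_after_mutation, len(seen))
```

The object is still inside the set, but it can no longer be found through normal lookups, which is why hashable containers are normally expected to be immutable.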

2 Answers

In the end I found a better solution to my problem (thanks to my colleagues for the suggestion :D).

Implementing a hashable frozen dictionary is not a good idea; it goes against the "core" of Python.

There are some workarounds based on MappingProxyType, which lets you create a read-only view of a dictionary. However, MappingProxyType is not hashable, and it is not possible to override or add a __hash__ method because the class is marked as final.

Overriding by hand all the relevant methods of a built-in Python class is tricky and error-prone: forgetting to override a single method that mutates the dictionary is enough to break immutability.
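
As a rough illustration of how easily a method can be missed, here is a sketch with a hypothetical `NaiveFrozenDict` (not from the original code) that blocks item assignment and deletion but forgets `update`. In CPython, `dict.update` does not go through an overridden `__setitem__`, so the mutation slips through:

```python
class NaiveFrozenDict(dict):
    """Hypothetical, deliberately incomplete 'read-only' dict."""

    def __setitem__(self, key, value):
        raise TypeError("NaiveFrozenDict is read-only")

    def __delitem__(self, key):
        raise TypeError("NaiveFrozenDict is read-only")


d = NaiveFrozenDict(a=1)

# Direct item assignment is blocked as intended.
assignment_blocked = False
try:
    d["b"] = 2
except TypeError:
    assignment_blocked = True

# update() was not overridden, so it still mutates the dictionary.
d.update(b=2)
print(assignment_blocked, dict(d))
```

The same applies to `pop`, `popitem`, `clear`, `setdefault` and `|=`, each of which would need its own override.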

I found a workaround to ensure the hashability and immutability of my attribute while preserving the "usability" of a dictionary.

I declared my attribute as tuple[tuple[str, MyObject], ...] and added a cached_property that converts the tuple into a dictionary that I can use.

So, long story short: try to avoid weird overrides on Python types to force a "non-intended" behaviour; change approach and stay compliant with the Python "philosophy".
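
A minimal sketch of this approach could look like the following. The names `Config`, `entries` and `as_dict` are hypothetical, and plain `int` values stand in for `MyObject`; it also assumes a regular (non-slots) frozen dataclass, since `cached_property` stores its result in the instance `__dict__`:

```python
from dataclasses import dataclass
from functools import cached_property


@dataclass(frozen=True)
class Config:
    """Hypothetical frozen dataclass holding the data as a tuple of pairs."""

    # Hashable and immutable as long as keys and values are hashable.
    entries: tuple[tuple[str, int], ...]

    @cached_property
    def as_dict(self) -> dict[str, int]:
        # Built once on first access, then cached on the instance.
        # It is not a dataclass field, so it does not affect __hash__ or __eq__.
        return dict(self.entries)


c = Config(entries=(("a", 1), ("b", 2)))
print(c.as_dict["a"])
print(hash(c) == hash(Config(entries=(("a", 1), ("b", 2)))))
```

Note that the cached dictionary itself is still a regular mutable dict; changing it does not alter `entries` or the hash, but callers should treat it as read-only.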

Mattia Surricchio

This answer provides an excellent HashableDict implementation:

from collections.abc import Mapping  # collections.Mapping was removed in Python 3.10

class FrozenDict(Mapping):
    """Don't forget the docstrings!!"""
    
    def __init__(self, *args, **kwargs):
        self._d = dict(*args, **kwargs)
        self._hash = None

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

    def __getitem__(self, key):
        return self._d[key]

    def __hash__(self):
        if self._hash is None:
            hash_ = 0
            for pair in self.items():
                hash_ ^= hash(pair)
            self._hash = hash_
        return self._hash
Paweł Rubin