First, I am a complete newbie when it comes to contributing to public projects and forums, and I know that this question is relatively "opinion-based", but I do not know where else I should post the issue and I feel that it might be useful to other people.

While writing tests for some other code in Python 3.7.3, I ran into a situation in which a key of a Counter was reported as being neither in the Counter nor in its keys.

The following code reproduces and points towards the problem:

class MyHashable(object):
    def __init__(self, label, key='something'):
        self.label = label
        self.key = key

    def __hash__(self):
        return hash((self.label, self.key))

    def __eq__(self, other):
        return (self.label, self.key) == (other.label, other.key)

TestDict = dict()

A = MyHashable('label1')
B = MyHashable('label2') 

B.key = 'something else' # Changing the hash of B

TestDict[A] = 12
TestDict[B] = 'ASDR'

A.key = 24               # Changing the hash of A 

Case1 = A in TestDict                  # False
Case2 = B in TestDict                  # True
Case3 = A in TestDict.keys()           # False
Case4 = B in TestDict.keys()           # True
Case5 = A in tuple(TestDict.keys())    # True
Case6 = B in tuple(TestDict.keys())    # True

I think that Case3 is not the root cause of Case1, as Case3 evaluates to True in Python 2.7.17 (remember that in Python 2.7, Cases 5 and 6 should use "==" instead of "in"). I would guess that the root cause is related to the underlying C code of the standard library, but that is irrelevant unless the answer to the question in the title is yes.
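As a rough sketch of why Case3 and Case5 can disagree in Python 3: a membership test on a dict or on its keys view is a hash-based lookup, while a membership test on a tuple or list is a linear scan that checks identity and then equality. In Python 2, keys() returns a list, which would explain the different Case3 result there. The class Item below is a hypothetical stand-in for MyHashable; it hashes a small integer only so that the run is reproducible (string hashes vary between processes), and the results assume CPython.

```python
class Item:
    """Hypothetical stand-in for MyHashable with a reproducible hash."""

    def __init__(self, label, key):
        self.label = label
        self.key = key  # small int, so hash(self) is the int itself

    def __hash__(self):
        return hash(self.key)

    def __eq__(self, other):
        return (self.label, self.key) == (other.label, other.key)


a = Item('label1', 1)
table = {a: 12}

a.key = 2  # hash(a) changes from 1 to 2 after insertion

in_dict = a in table                 # hash-based lookup
in_keys_view = a in table.keys()     # also hash-based in Python 3
in_tuple = a in tuple(table.keys())  # linear scan: identity, then ==
in_list = a in list(table.keys())    # what Python 2's keys() amounts to

print(in_dict, in_keys_view, in_tuple, in_list)
```

The two hash-based tests look in the wrong place because the dict stored the entry under the old hash, whereas the scans simply walk over the stored keys and find the very same object.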

I believe that the error is clearly mine, for modifying an attribute that affects the hash after the object was added to the dictionary, yet I feel the need to point it out somewhere. I think it is definitely useful for this to appear somewhere so that people who run into this situation can find it relatively quickly. I also think it is a nice example of how not to create a hashable class, or at least a nice example of bad usage of a hashable class in Python.

Yes, I know that this kind of situation in a dictionary is unlikely and smells entirely like the programmer's fault. But with other mappings such as collections.Counter, the programmer is likely to be interested only in counting things. The Mapping Types documentation of Python 3.9.1 states:

A mapping object maps hashable values to arbitrary objects

Therefore the programmer will make the class of the objects to be counted hashable (no more than 4 lines in this example) and move on to more important problems. In the end, the programmer will encounter a weird-looking error in which an item that was added to the Counter is not in the Counter.
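A minimal sketch of how this can surface with collections.Counter. Item is a hypothetical class, not the one from my real code, and it hashes a small integer only to keep the run reproducible; the results assume CPython.

```python
from collections import Counter


class Item:
    """Hypothetical countable object whose hash depends on a mutable attribute."""

    def __init__(self, label, key):
        self.label = label
        self.key = key  # small int, so hash(self) is the int itself

    def __hash__(self):
        return hash(self.key)

    def __eq__(self, other):
        return (self.label, self.key) == (other.label, other.key)


a = Item('label1', 1)
counts = Counter([a])   # a is counted once under its original hash

a.key = 2               # mutate the object after it has been counted

count_after_mutation = counts[a]  # Counter reports 0 for keys it cannot find
counts.update([a])                # counting a again creates a second entry

entries = len(counts)
total = sum(counts.values())

print(count_after_mutation, entries, total)
```

The Counter ends up holding the same object twice, once under each hash, so the per-item count is wrong even though the total is right.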

Getting to the point: is this kind of behavior a bug? Should it be reported as a bug to the Python core developers, or somewhere else?

*Edit: fixed bug in the example code

rperezsoto

1 Answer

Good implementation and a nice finding, but it is neither a problem nor a bug in Python; it is expected behaviour.
When you change key, the value of the hash changes, and that is why the object can no longer be found in the dict.
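Here is a small sketch of that, assuming CPython. Item is a hypothetical simplification of MyHashable that hashes a small integer so the result is the same on every run. The entry is still in the dict, just filed under the old hash, so it becomes reachable again if the attribute is restored or if the dict is rebuilt with freshly computed hashes.

```python
class Item:
    """Hypothetical simplification of MyHashable with a reproducible hash."""

    def __init__(self, label, key):
        self.label = label
        self.key = key  # small int, so hash(self) is the int itself

    def __hash__(self):
        return hash(self.key)

    def __eq__(self, other):
        return (self.label, self.key) == (other.label, other.key)


a = Item('label1', 1)
table = {a: 12}
hash_before = hash(a)

a.key = 2
hash_after = hash(a)
found_after_mutation = a in table   # looked up under the new hash

a.key = 1
found_after_restore = a in table    # old hash again, so the entry is found

a.key = 2
rebuilt = {k: v for k, v in table.items()}  # re-inserts keys with current hashes
found_in_rebuilt = a in rebuilt

print(hash_before, hash_after, found_after_mutation,
      found_after_restore, found_in_rebuilt)
```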

Have a look at this SO answer https://stackoverflow.com/questions/2671376/hashable-immutable#:~:text=In%20Python%20they're%20mostly,unusable%20as%20a%20dict%20key.

One of the comments there also mentions that the same applies in other languages such as Java, where a HashMap becomes broken if you modify an object used as a key in it: neither the old nor the new key can be found, even though the entry is visible if you print the map.

A == tuple(TestDict.keys())[0] # True

returns True because both sides refer to the same object, so __eq__ compares the object's current (label, key) with itself. No hash lookup is involved in this comparison.

I got something like this

A : <__main__.MyHashable at 0x7fedfe499950>
tuple(TestDict.keys())[0]: <__main__.MyHashable at 0x7fedfe499950>

Have a look at the thread "Create a dictionary in python which is indexed by lists" too, for more information on mutable objects as keys.
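One common way to avoid the problem is to make the key class immutable. The following is only a sketch, assuming Python 3.7+ and the standard dataclasses module; FrozenKey is a hypothetical name, and this is not the only possible design.

```python
import dataclasses
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class FrozenKey:
    """Hypothetical immutable key; eq and hash are generated from the fields."""
    label: str
    key: str = 'something'


a = FrozenKey('label1')
table = {a: 12}

try:
    a.key = 'something else'   # assignment is rejected on a frozen dataclass
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

still_found = a in table

# To "change" a key, build a new object instead of mutating the old one.
updated = dataclasses.replace(a, key='something else')
updated_in_table = updated in table

print(mutation_blocked, still_found, updated_in_table)
```

Because the fields cannot be reassigned, the hash stored in the dict stays valid for the lifetime of the object.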

  • Thanks for the kind answer. The discussions in the links are really interesting regarding good practices. Thanks to your comment I saw that Cases 5 and 6 missed what I consider the inconsistent behavior, and I just edited them. The point I wanted to make is that Case 1, Case 3 and Case 5 are in theory testing the same thing, or at least Case 3 and Case 5 are. That Case 3 and Case 5 behave differently is what I consider buggy. – rperezsoto Jan 20 '21 at 16:55