1

I am using Python2.7 to write Set Data Struction. But I got confused to when using Dict.get. Here is my code:

#!/usr/bin/env python
# -*- coding:UTF-8
__author__ = 'shenshijun'


class Node(object):
    def __init__(self, key, weight):
        """Constructor for """
        self.key = key
        self.weight = weight

    def __cmp__(self, other):
        return cmp(self.key, other.key)

    def __str__(self):
        return "".join(['Node(key=', str(self.key), ',weight=', str(self.weight), ")"])


__dict = {}
a1 = Node('A', 1)
a2 = Node('A', 2)
__dict[a1] = a1
print a1 == a2
print(__dict.get(a2))

the output of this code is the following:

True None

So I guess that Python use is operator to compare the keys when searching. Can someone one find the truth?

The Python I used:

2.7.5 (default, Mar  9 2014, 22:15:05) 
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)]

I have just solve my problem, but any suggestions about how to implements __hash__ in this situation?

ssj
  • 1,737
  • 2
  • 16
  • 28
  • 2
    Not 100% sure, but `dict` is a hash map, so it uses the hash code for storing the items into "buckets", and since you do not implement `__hash__`, it uses the default hashcode, i.e. the memory address. And since the hashed do not match, there's no reason to check for equality. – tobias_k Dec 19 '14 at 14:54
  • 2
    dicts use `__hash__`, `==` uses `__eq__` – acushner Dec 19 '14 at 14:55

2 Answers2

2

you have an equality on a1 and a2, but they are not the same object, indeed the reference has changed, so then

print a1 is a2  # False
print(__dict.get(a1)) # Node(key=A,weight=1)
print(__dict.get(a2)) # None

a1 is not the same object as a2, the key to a1 references the node, but not a2.

EDIT

As @tobias_k says, you can override the __hash__ function to do a different way to store hash references instead the memory address, for example :

class Node(object):
    def __init__(self, key, weight):
        """Constructor for """
        self.key = key
        self.weight = weight

    def __cmp__(self, other):
        return cmp(self.key, other.key)

    def __str__(self):
        return "".join(['Node(key=', str(self.key), ',weight=', str(self.weight), ")"])

    def __hash__(self):
        return ord(self.key)

__dict = {}
a1 = Node('A', 1)
a2 = Node('A', 2)
__dict[a1] = a1
print a1 == a2 # True
print a1 is a2 # False 
print(__dict.get(a1)) # Node(key=A,weight=1)
print(__dict.get(a2)) # Node(key=A,weight=1)
markcial
  • 9,041
  • 4
  • 31
  • 41
  • the `ord` function used in your version of `__hash__` is nowhere to find in Python builtins. Where did you get it? – ssj Dec 19 '14 at 15:06
  • "you can override the `__hash__` function to do a different way to store hash references" I think this is an understatement. This is not a different, but _the right_ way to use a hashmap. Also, whenever you implement `__eq__` or `__cmp__` you _should_ also implement `__hash__` or all sorts of strange things will happen -- as in OP's question. – tobias_k Dec 19 '14 at 15:14
  • I am sorry, maybe it is because english is not my native language. But I am not very sure if i got this point right, i can't see anything wrong about my sentence, or perhaps were you doing some kind of precission i am missing, if it is so, then thanks :) – markcial Dec 19 '14 at 15:22
2

When you look up a key in a dictionary, it does not right away test for equality. At first, it checks the hash code, to determine the "bucket" where the item should be if it were in the dictionary, and only then it checks for equality.

And since you did not implement a __hash__ method, it uses the default hash code, i.e. the memory address IIRC, so it never checks for equality in the first place. Add this to your class, then it will work

def __hash__(self):
    return hash(self.key)

Output:

True
Node(key=A,weight=1)

You can also add some print statements to your __hash__ and __cmp__ functions to see when and in what order those are called.


Addendum: It might be worth noting that, while not mandated by the programming language, you should always implement __hash__ when you implement __eq__ or __cmp__, and in such a way that whenever two objects are equal, they also have the same hash code (the converse is not necessary).

For more information, see this related question (for Java, but just as valid for Python).

Community
  • 1
  • 1
tobias_k
  • 81,265
  • 12
  • 120
  • 179