I have a Person class like this:

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return '<Person {}>'.format(self.name)

I want to add some instances of this class to a set, like this:

tom = Person('tom', 18)
mary = Person('mary', 22)
mary2 = Person('mary2', 22)

person_set = {tom, mary, mary2}
print(person_set)
# output: {<Person tom>, <Person mary>, <Person mary2>}

As you can see, there are 2 Marys in the set. How can I make it so that Person instances with the same age are considered the same person, and only added to the set once?

In other words, how can I get a result of {<Person tom>, <Person mary>}?

Aran-Fey
sashimi

1 Answer

When a new object is added to a Python set, its hash code is computed first. If one or more objects with the same hash code are already in the set, those objects are then tested for equality with the new object.
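To make that order of operations visible, here is a minimal sketch. The `Traced` class, its `key` field, and the `log` list are hypothetical names made up for this illustration; they are not part of the question's code. It records each call to `__hash__` and `__eq__` while objects are added to a set.

```python
class Traced:
    """Hypothetical helper that logs calls to __hash__ and __eq__."""

    def __init__(self, label, key, log):
        self.label = label
        self.key = key      # the value used for hashing and equality
        self.log = log      # shared list that collects the call trace

    def __hash__(self):
        self.log.append("hash({})".format(self.label))
        return hash(self.key)

    def __eq__(self, other):
        if not isinstance(other, Traced):
            return NotImplemented
        self.log.append("eq({}, {})".format(self.label, other.label))
        return self.key == other.key


log = []
a = Traced("a", 22, log)
b = Traced("b", 22, log)  # same key as a, so same hash
c = Traced("c", 18, log)  # different key, so different hash

items = set()
items.add(a)
items.add(b)  # hash matches a, so the two are compared with __eq__
items.add(c)  # hash differs from the stored ones, so no __eq__ call

print(len(items))  # 2
print(log)
```

The trace should show a hash call for every `add`, and an equality call only for the pair whose hashes match. Which of the two objects ends up on the left side of that comparison is an implementation detail, so it is best not to rely on it.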

In practice, this means you need to implement both the __hash__(...) and __eq__(...) methods on your class. For example:

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

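    def __eq__(self, other):
        # Let Python fall back to its default handling for non-Person objects
        if not isinstance(other, Person):
            return NotImplemented
        return self.age == other.age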

    def __hash__(self):
        return hash(self.age)

    def __repr__(self):
        return '<Person {}>'.format(self.name)

tom = Person('tom', 18)
mary = Person('mary', 22)
mary2 = Person('mary2', 22)

person_set = {tom, mary, mary2}
print(person_set)
# output: {<Person tom>, <Person mary>}

However, you should think very carefully about what the correct implementation of __hash__ and __eq__ should be for your class. The above example works, but it is nonsensical: both __hash__ and __eq__ are defined only in terms of age, so any two people of the same age count as the same person.
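For comparison, here is a sketch of what a less surprising definition might look like. It assumes that a person is identified by both name and age, which is only an assumption for illustration and not what the question asks for. The `_key` helper is a hypothetical name; it just keeps `__eq__` and `__hash__` consistent with each other.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def _key(self):
        # Hypothetical helper: the fields assumed to define identity
        return (self.name, self.age)

    def __eq__(self, other):
        if not isinstance(other, Person):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        # Hash the same fields that __eq__ compares
        return hash(self._key())

    def __repr__(self):
        return '<Person {}>'.format(self.name)


people = {
    Person('tom', 18),
    Person('mary', 22),
    Person('mary', 22),   # duplicate of the previous entry
    Person('mary2', 22),  # same age, different name: kept
}
print(len(people))  # 3
```

Whichever fields you choose, avoid changing them while the object is stored in a set, because the set looks the object up by the hash it had when it was added.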

Aran-Fey
srgerg
  • Thanks, this is what I want. I know it is weird. The class like **Person** that I will actually use is very simple, so it doesn't matter whether I override these methods or not. I know it's undesirable ^_^, so I will use it carefully. – sashimi May 11 '12 at 08:18
  • Why does Python even require an `__eq__` method if it is already comparing the value returned by `__hash__`? – wesinat0r Jun 16 '20 at 15:48
  • I answered my own comment: https://docs.python.org/3/glossary.html#term-hashable says `__hash__` gives a value that never changes during the object's lifetime, while `__eq__` compares the object with other objects. You still need both to do the comparison. – wesinat0r Jun 16 '20 at 18:02
  • https://hynek.me/articles/hashes-and-equality/ is a good article for more background. – wesinat0r Jun 17 '20 at 18:48