6

I have recently started to learn Python and have encountered something a little strange when playing with sets. The following code sample doesn't produce the expected results.

a_set = {True,2,3,4}
a_set.add(1)

I expected a_set to have the values {True, 1, 2, 3, 4} but instead this code produced {True, 2, 3, 4}.

Trying variations on this also produced the same results:

a_set = {1,2,3,4}
a_set.add(True)

Expected {True, 1, 2, 3, 4} Actual {1, 2, 3, 4}

Trying this with False and 0 obtained the same results:

a_set = {False,2,3,4}
a_set.add(0)

Expected {False, 0, 2, 3, 4} Actual {False, 2, 3, 4}

a_set = {0,2,3,4}
a_set.add(False)

Expected {False, 0, 2, 3, 4} Actual {0, 2, 3, 4}

I understand that the bool type is inherited from int and that True == 1 and False == 0 but was still a little surprised by the above results.

Does anybody know if this behaviour is by design? Also is it possible to have a set which contains both True, False, 0, and 1?

I did perform quite a bit of googling but was not able to find an answer to my questions.

Thanks in advance

UPDATE

In response to the comments below I agree that the following question partially answers my question.

Is False == 0 and True == 1 in Python an implementation detail or is it guaranteed by the language?

But I feel that it doesn't answer the query I have regarding the behaviour of sets and whether it is possible to have a set containing both True and 1. Even though bool is inherited from int, they are different types, so I found the fact that a set cannot distinguish between True and 1 to be a little confusing. So really this is a question about the behaviour of sets in Python not just about True == 1.

Community
  • 1
  • 1
mickfold
  • 2,003
  • 1
  • 14
  • 20
  • 1
    http://stackoverflow.com/a/6865824/846892 – Ashwini Chaudhary Aug 15 '13 at 17:20
  • @AshwiniChaudhary I saw that question that you referred to. While it did talk about `True == 1` and `False == 0`. It didn't answer why sets behaved the way they do. This is why I put in this separate question. – mickfold Aug 15 '13 at 17:35
  • @AshwiniChaudhary That's an answer from the creator of Python himself. I feel like I'm somehow obligated to make that explicit. – arshajii Aug 15 '13 at 17:39
  • Similary, `{1, 1.0}` will give you `{1}` -- even though the integer and the float are obviously different objects. – Gabe Aug 15 '13 at 17:42
  • 2
    @AshwiniChaudhary I disagree that's a duplicate. OP clearly knows that they're equal, and is asking specifically about sets – Wooble Aug 15 '13 at 23:27
  • I have updated this question to highlight the differences between my question and the [question](http://stackoverflow.com/questions/2764017/is-false-0-and-true-1-in-python-an-implementation-detail-or-is-it-guarante) referred to by @AshwiniChaudhary – mickfold Aug 16 '13 at 17:34
  • 2
    @mickfold Voted for re-open. – Ashwini Chaudhary Aug 16 '13 at 17:49
  • @AshwiniChaudhary Thanks for the re-open vote. I appreciate it. – mickfold Aug 17 '13 at 21:57

1 Answers1

8

For historical (hysterical?) reasons Python's bool type is a subclass of int, and True is equal to 1 and False is equal to 0.

They hash to the same location as well:

>>> True == 1
True
>>> hash(True) == hash(1)
True
>>> False == 0
True
>>> hash(False) == hash(0)
True

Since both True and 1 are considered equal and they hash to the same slot, both set and dict treat them as the same thing.

You'll see the same with a dictionary:

>>> True in {1: 'foo'}
True
>>> 1 in {True: 'foo'}
True

This behaviour extends to other numbers too; floating point values equal to integer values will show the same behaviour:

>>> {1.0, 1, 2.0, 2}
{1, 2}

but at least the reasons why that happens are a little more.. obvious.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • Okay, so when adding to a dictionary or set the type of the object isn't checked. It's just the hash value that is important. Is what I am saying correct? – mickfold Aug 15 '13 at 17:27
  • @mickfold: Equality and hash are checked. The hash value is not meant to be unique, clashes can happen. The hash is used to pick an initial slot in the hash table, then is perturbed if there is an item that is *not* equal already there. – Martijn Pieters Aug 15 '13 at 17:28
  • @arshajii From what I understand, `set` uses the same hashmap pattern as `dict`. It follows from that that value is only checked if there's a key collision, so if the objects hash differently, value is completely irrelevant. Further, if this weren't the case, `i in set` would have the O(n) average performance of `i in list` rather than the O(1) that it does. – Silas Ray Aug 15 '13 at 17:36
  • @SilasRay Yes, that seems to correct. Hash does play a role here unlike I initially thought. – arshajii Aug 15 '13 at 17:37
  • Has anybody ever proposed an overloaded `__hash__` for `bool`? Seems like it would be a simple fix to the problem. – Mark Ransom Aug 16 '13 at 17:41
  • 1
    That would not work; quoting from the [`dict.__hash__()` docs](http://docs.python.org/2/reference/datamodel.html#object.__hash__): *The only required property is that objects which compare equal have the same hash value*. You'd have to break the equality too. – Martijn Pieters Aug 16 '13 at 18:14
  • If the hash values for `False` and `True` would differ from those for 0 and 1, but the objects were still equal, you'd get unexpected result when hash collisions force a boolean and equal int into the same slot. – Martijn Pieters Aug 16 '13 at 18:16
  • @MartijnPieters, at first I thought that was a silly requirement, but after a bit of thought I can see how different hashes could end up colliding in the same bucket with only equality to tell them apart. I wonder then why Python 3 didn't take the opportunity to split bool and int into separate types? Or maybe it did? – Mark Ransom Aug 16 '13 at 20:54
  • They didn't, but I don't know if it was considered and if so, what objections were found. – Martijn Pieters Aug 16 '13 at 21:02