0

I don't understand the behaviour of this code:

a = hash((10,)), hash((10,))
# a is (3430012387537, 3430012387537). The value may differ for you, but both elements are the same integer.

a[0] is a[1]
# Gives False


3430012387537 is 3430012387537
# Gives True

Why does the first test give False while the second gives True?

It seems to be caused by large integers, because 10 works as expected:

a = 10, 10
a[0] is a[1]
# Gives True as expected

3 Answers

7

You are observing implementation details of the Python interpreter, which is why some people might not be able to reproduce this.

The interpreter might decide to

  • create a new instance of int each time it sees an integer
  • keep some instances for common values (like -1, 0, .., 255)
  • create a new instance each time it sees an integer and then keep it (forever or some time)
  • any combination of the above

You should never rely on interpreter implementation details, so you should never rely on x is y for integers.
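As a minimal sketch of that advice, the snippet below compares the two hash results from the question by value and by identity. The helper name `describe` is made up for illustration; only the `==` result is something you can count on.

```python
def describe(x, y):
    """Return (equal, identical) for two objects (hypothetical helper)."""
    return x == y, x is y

a = hash((10,)), hash((10,))
equal, identical = describe(a[0], a[1])

print(equal)      # value comparison: this is the check to rely on
print(identical)  # identity comparison: result is an implementation detail
```

The second line of output may be True or False depending on the interpreter and version, which is exactly why it should not be used for integers.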

Bartosz Marcinkowski
5

You are testing the hash values for tuples. If two objects are a) equal and b) support hashing then their hash value must be the same.

This doesn't mean that they are the same object, just that they have the same hash value. Nothing more, nothing less. The hash value doesn't even have to be unique, and two objects with the same hash value don't have to be equal either; the implication only goes one way, from equality to equal hashes.

From the __hash__ method documentation:

The only required property is that objects which compare equal have the same hash value; it is advised to somehow mix together (e.g. using exclusive or) the hash values for the components of the object that also play a part in comparison of objects.
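A small sketch may make the one-way nature of that rule concrete. The `Key` class is hypothetical and deliberately uses a constant hash, which is legal but a poor choice in real code:

```python
class Key:
    """Hypothetical class whose instances compare by name."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Key) and self.name == other.name

    def __hash__(self):
        # Constant hash: satisfies the required property, but every Key collides.
        return 42

a, b, c = Key("x"), Key("x"), Key("y")

print(a == b, hash(a) == hash(b))  # equal objects share a hash value
print(a == c, hash(a) == hash(c))  # same hash value, yet not equal
print(a is b)                      # equal, but two distinct objects
print(len({a, b, c}))              # the set still tells a/b apart from c
```

Equality, hash equality and identity are three separate questions here; only the first implies the second.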

Next, you are doing something entirely different: you are testing whether two literals in the same expression are the same object. Sure they are; the compiler in this case doesn't waste memory by creating two separate objects:

>>> import dis
>>> dis.dis(compile('3430012387537 is 3430012387537', '<stdin>', 'exec'))
  1           0 LOAD_CONST               0 (3430012387537)
              3 LOAD_CONST               0 (3430012387537)
              6 COMPARE_OP               8 (is)
              9 POP_TOP
             10 LOAD_CONST               1 (None)
             13 RETURN_VALUE

The LOAD_CONST bytecode loads the same object (the constant at index 0).

That doesn't mean that every expression in the Python interpreter will reuse constants; only code that is compiled as one block is treated this way. Literals in the same function, for example, could also end up sharing one constant object.
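One way to look at this without reading bytecode is to inspect the constants stored on a compiled code object. This is a sketch that assumes CPython; the `<demo>` filename and the variable names are arbitrary, and other implementations may store constants differently.

```python
BIG = 3430012387537

# Both literals are compiled together as one block.
code = compile("a = 3430012387537; b = 3430012387537", "<demo>", "exec")

# How many times the value appears among the block's constants.
# CPython typically stores it once and loads it twice.
print(code.co_consts.count(BIG))

namespace = {}
exec(code, namespace)
print(namespace["a"] == namespace["b"])  # equal by value
print(namespace["a"] is namespace["b"])  # identity: implementation detail
```

If the two assignments were compiled separately, for instance typed on separate lines at the interactive prompt, there would be no shared constant table and the identity result could differ.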

Bottom line is: don't use is when you really wanted to use ==; Python may or may not optimise memory and execution speed by reusing the same object, but not everything that is equal is going to be the same object at all times. Reserve is for singletons only (such as None, type objects, and explicitly created single instances).

Martijn Pieters
  • I'm not sure I understand completely, `type(hash((10,)))` is `int`, and it's the same `int` so why are they different objects. They may not be literals, but if I define a function `def f(a): return a * 3` and then set `a = f(3), f(3)`, I will still see `a[0] is a[1]`, and `id(a[0]) == id(a[1])`. – Mike Oct 30 '14 at 13:34
  • @Mike: it is the same **value**, just not the same instance of the `int` type. – Martijn Pieters Oct 30 '14 at 13:35
  • @Mike: if Python went around to make every possible integer interned, you'd run out of memory the moment you process enough different integers in your calculations. Even if you don't store those integers in variables. – Martijn Pieters Oct 30 '14 at 13:36
  • @Mike: so outside of a small number of integers that are almost guaranteed to be reused over and over again in Python programs, integer objects are separate instances, even if their values are the same. – Martijn Pieters Oct 30 '14 at 13:37
  • @Martijn Pieters Ah, right you are! `def f(a): return a * 123456789` with `a = f(3), f(3)` and we get `a[0] is a[1]` evaluating to `False`. – Mike Oct 30 '14 at 13:39
2

All that is happening here is that your particular Python interpreter detects at compile time when the same integer literal occurs more than once. When this happens it only creates one integer object and re-uses it. Any way of calculating those numbers (could be hash, could simply be 123456789123456788+1) doesn't get the same optimisation, so you see different objects.

That's just for your interpreter, of course; another interpreter might behave differently.

Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> 123456789123456789 is 123456789123456788+1
False
>>> hash((10,)) is hash((10,))
False

Python 2.7.2 (341e1e3821ff, Jun 07 2012, 15:43:00)
[PyPy 1.9.0 with MSC v.1500 32 bit] on win32
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``the future has just begun''
>>>> 123456789123456789 is 123456789123456788+1
True
>>>> hash((10,)) is hash((10,))
True
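To repeat the comparison on whatever interpreter you have at hand, something like the following sketch could be used. The helper `runtime_int` is hypothetical; it parses the numbers from strings so that the values are definitely produced at run time rather than by the compiler.

```python
import platform

def runtime_int(text):
    # Parsing at run time keeps the compiler from sharing or precomputing the value.
    return int(text)

x = runtime_int("123456789123456789")
y = runtime_int("123456789123456788") + 1

print(platform.python_implementation())  # which interpreter is running
print(x == y)  # equal by value on any implementation
print(x is y)  # identity: varies between implementations
```

Only the last line is expected to vary between interpreters, as the two sessions above show.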
Duncan
  • They aren't interned, but PyPy doesn't actually create Python objects for integers, so equal integers always compare as identical. The `id` of an integer in at least this version of PyPy is just `8*n+1`. – Duncan Oct 30 '14 at 13:40
  • Ah, I missed that it was PyPy! – Martijn Pieters Oct 30 '14 at 13:41