16

I found a strange behavior with the semicolon ";" in Python.

>>> x=20000;y=20000
>>> x is y
True
>>> x=20000
>>> y=20000
>>> x is y
False
>>> x=20000;
>>> y=20000
>>> x is y
False

Why does the first test return "True", and the others return "False"? My Python version is 3.6.5.

Boann
longshuai ma

2 Answers

20

In the interactive interpreter, each line you enter is compiled and executed as a single unit. When both assignments sit on one line, joined by a semicolon, the compiler sees that 20000 is the same immutable int value in each assignment, and so can (it doesn't have to, but does) make x and y references to the same object.

The important point is that this is simply an optimization that the interactive interpreter chooses to make; it's not something guaranteed by the language or some special property of the ; that joins two statements into one.

In the other two examples, by the time y=20000 is read and evaluated, x=20000 (with or without the semicolon) has already been compiled and executed on its own. Since 20000 isn't in the range of pre-allocated int values (-5 through 256, inclusive), CPython doesn't try to find another instance of 20000 already in memory; it just creates a new one for y.
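The difference between one compilation unit and two can be sketched with the built-in `compile` and `exec`. This is only an approximation of what the interactive prompt does; the helper name `run_units` is made up for illustration, and the identity results are CPython implementation details, not language guarantees.

```python
def run_units(*sources):
    """Compile and run each source string separately, sharing one namespace.

    Hypothetical helper: it roughly mimics entering each string as its own
    line at the interactive prompt.
    """
    namespace = {}
    for source in sources:
        code = compile(source, "<input>", "single")
        exec(code, namespace)
    return namespace


# One unit: both assignments end up in the same code object, and CPython
# stores the literal 20000 only once in that object's constants.
one_unit = compile("x = 20000; y = 20000", "<input>", "single")
shared_count = one_unit.co_consts.count(20000)

ns_one = run_units("x = 20000; y = 20000")
same_when_together = ns_one["x"] is ns_one["y"]

# Two units: each compilation builds its own int object for 20000.
ns_two = run_units("x = 20000", "y = 20000")
same_when_separate = ns_two["x"] is ns_two["y"]

print(shared_count, same_when_together, same_when_separate)
```

On CPython the first two values show the sharing within a single unit; the last one depends on the interpreter and should not be relied upon either way.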

kmario23
chepner
  • Thanks, I understand. I also tried `x,y=20000,20000`, and `x is y` returns True there too. So I think `x=20000;y=20000` may be optimized to `x,y=20000,20000`. – longshuai ma Sep 13 '18 at 18:26
  • Not directly, but again, because the commands are in one "batch" of input, the interpreter/compiler has more opportunity to look for optimizations. The tuple created on the right-hand side consists of the same literal repeated twice, so the tuple can be created with two references to the same object, rather than allocating two separate objects with the same value. – chepner Sep 13 '18 at 18:45
  • Also, note that the people that made the interpreter _could_ have chosen to make the interpreter reuse the `20000` object for the second statement; they just didn't. (And be sure you understand the difference between `is` and `==` - see Roberto Bonvallet's answer.) – Aasmund Eldhuset Sep 13 '18 at 19:34
  • To understand *why* 20000 isn't always reused, consider the difference between having 20000 appear twice in the same expression you are currently processing vs trying to determine if the literal 20000 was evaluated at some arbitrary time in the past. The first just requires looking at the expression currently being evaluated; the latter requires dedicated data structures in memory to track live objects. – chepner Sep 13 '18 at 19:43
2

The `is` operator checks whether two values are the same object in memory. It's not meant to be used for checking equality; use `==` for that. For what it's worth, you could treat the fact that it sometimes returns True and sometimes False as a matter of luck (even if it isn't).

For example, the results are different in an interactive session and in a standalone program:

$ cat test.py
x = 200000; y = 200000
print(x is y)

xx = 200000
yy = 200000
print(xx is yy)

$ python test.py
True
True

Or you have this other example:

>>> x = 50 + 50; y = 50 + 50
>>> x is y
True
>>> x = 5000 + 5000; y = 5000 + 5000
>>> x is y
False

This happens because CPython caches small integers (-5 through 256), so equal small values are always the same object. It does not cache larger integers, so each addition in the second case produces its own 10000 object. It has nothing to do with the semicolon.
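A minimal sketch of the small-integer cache, assuming CPython: the values are built at run time from strings so that the compiler cannot share a literal between them. The helper `make_int` is a hypothetical name used only for this illustration.

```python
def make_int(text):
    """Build an int at run time, bypassing any sharing of literals."""
    return int(text)


small_a, small_b = make_int("100"), make_int("100")
large_a, large_b = make_int("10000"), make_int("10000")

# Identity: an implementation detail. CPython hands out the cached object
# for 100; for 10000 it normally allocates a fresh object each time.
print(small_a is small_b)
print(large_a is large_b)

# Equality: guaranteed by the language, and the comparison to use in
# real code.
print(large_a == large_b)  # True
```

Whatever the two identity checks print on a given interpreter, the `==` comparison is the one whose result is defined by the language.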

Roberto Bonvallet
  • How can one consider it a matter of luck when it isn't at all? – Işık Kaplan Sep 13 '18 at 18:22
  • And when will it return False? – mad_ Sep 13 '18 at 18:26
  • 1
    @IşıkKaplan You're right that it isn't. What I mean is that it's not a behavior you can rely on. In some sense it's a matter of luck that the interpreter chooses to apply the optimization that [Chepner mentioned](https://stackoverflow.com/a/52319500/13169), and also that the example was tested in the interactive console (I tried writing a standalone program and the results are different). – Roberto Bonvallet Sep 13 '18 at 18:35
  • @mad_ I expanded my answer with an example where two expressions with equal result separated by semicolons make the `is` operator return `False`. – Roberto Bonvallet Sep 13 '18 at 18:43
  • There isn't an inconsistency, though. You may want to stay away from the immutable built-ins so that you don't have to think about interpreter optimizations that much, but for custom classes it is always accurate and something that can be relied upon. If not, I'd like to learn where it can return an incorrect result for a custom class. – Işık Kaplan Sep 13 '18 at 18:46