63

Why does "hello" is "hello" produce True in Python?

I read the following here:

If two string literals are equal, they have been put to same memory location. A string is an immutable entity. No harm can be done.

So there is one and only one place in memory for every Python string? Sounds pretty strange. What's going on here?

Martijn Pieters
Deniz Dogan
  • Also have a look at the `id` function for checking memory locations: `print id("hello")` – Blixt Sep 08 '09 at 07:22
  • bzlm, the pyref.infogami.com/intern link has gone dead, but archive.org has a copy here:
    http://web.archive.org/web/20090429040354/http://pyref.infogami.com/intern
    However, though it's often true, it's NOT ALWAYS true, as @bobince demonstrated very well below.
    – Dave Burton Aug 11 '12 at 20:15

7 Answers

95

Python (like Java, C, C++, .NET) uses string pooling / interning. The interpreter realises that "hello" is the same as "hello", so it optimizes and uses the same location in memory.

Another goodie: "hell" + "o" is "hello" ==> True
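A minimal sketch of the difference between equal strings and interned strings, assuming Python 3 (where `sys.intern` is the explicit interning call). The variable names are illustrative. The strings are built at run time so the compiler cannot fold them into one constant; whether `a is b` holds before interning is left to the interpreter and is deliberately not asserted.

```python
import sys

# Built at run time, so the compiler cannot merge them into one constant.
a = "".join(["hel", "lo"])
b = "".join(["he", "llo"])

print(a == b)  # equality compares characters, so this is True

# sys.intern returns the canonical copy from the interpreter's intern table,
# so both calls hand back the very same object.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # True
```

Only the explicitly interned references are guaranteed to be identical; the automatic sharing of literals is an optimisation.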

Ronan Boiteau
carl
  • 26
    Even C/C++ usually do this; "foo" == "foo" is often true in C. In both C and Python, this is an implementation detail; I don't think anything in Python *requires* that the interpreter do this, and in C/C++ this is an optimization that not all compilers perform and one which can be disabled. (By contrast, this property is *always* true in Lua; all strings are interned.) – Glenn Maynard Sep 08 '09 at 07:33
  • 2
    @Glenn, you're correct and I'm glad someone mentioned it. Certainly no one should RELY on this being true. – Kenan Banks Sep 09 '09 at 21:33
  • This optimization is the job of the specific interpreter (or, for languages like C/C++, the compiler): it makes strings that are determined at compile time the same object. – andy Dec 13 '14 at 07:33
  • 1
    In this specific case, the objects are the same because *the two literals in the same expression match and result in a single constant stored in the code*. If you used `a = 'hell' + 'o!'` and `b = 'hello!'` on separate lines in the interactive shell, `a is b` is going to be false. `a = 'hell' + 'o'` and `b = 'hello'` does trigger interning, so it'll be true. But put the two examples into a function, and you'll have identical objects again. There are *multiple paths to object reuse* and they invariably are the result of optimisations. Don't rely on implementation details like these. – Martijn Pieters Apr 26 '18 at 08:04
66

So there is one and only one place in memory for every Python string?

No, only ones the interpreter has decided to optimise, which is a decision based on a policy that isn't part of the language specification and which may change in different CPython versions.

E.g. on my install (2.6.2, Linux):

>>> 'X'*10 is 'X'*10
True
>>> 'X'*30 is 'X'*30
False

similarly for ints:

>>> 2**8 is 2**8
True
>>> 2**9 is 2**9
False

So don't rely on 'string' is 'string': even if you only consider the C implementation (CPython), it isn't safe.
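A short sketch of the safe pattern, assuming Python 3: compare values with `==` and treat the result of `is` as unspecified. The helper `describe` is hypothetical, and the values are built at run time so that constant folding does not hide the effect. No output is claimed for the identity half, since it varies between interpreters and versions.

```python
def describe(a, b):
    """Return (equal, identical) for two objects."""
    return a == b, a is b

# int("...") forces the values to be computed at run time.
s1 = "X" * int("30")
s2 = "X" * int("30")
n1 = int("512")
n2 = int("512")

# The first element is always True; the second depends on the interpreter.
print(describe(s1, s2))
print(describe(n1, n2))
```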

bobince
  • 15
    Thus, you should always use `==` for string equality comparisons. – SingleNegationElimination Sep 08 '09 at 17:27
  • The interpreter caches small integers (up to 256) in Python. So, `a = 50; b = 50; a is b` is True, `a = 500; b = 500; a is b` is False. – Darshan Chaudhary May 05 '16 at 17:26
  • @DarshanChaudhary: the latter expression is actually *True*, because you put all your assignments on one line. `500` is a literal that's stored as a constant in the code object, and both `a` and `b` are assigned that one constant... Again, implementation detail, don't count on it. – Martijn Pieters Apr 26 '18 at 08:05
13

Literal strings are probably grouped based on their hash or something similar. Two of the same literal strings will be stored in the same memory, and any references both refer to that.

 Memory        Code
-------
|          myLine = "hello"
|        /
|hello  <
|        \
|          myLine = "hello"
-------
Quantumplation
6

The `is` operator returns True if both arguments are the same object. Your result is a consequence of this, and of the quoted bit.

In the case of string literals, these are interned, meaning they are compared to known strings. If an identical string is already known, the literal takes that value, instead of an alternative one. Thus, they become the same object, and the expression is true.

SingleNegationElimination
2

The Python interpreter/compiler parses the string literals, i.e. the quoted list of characters. When it does this, it can detect "I've seen this string before", and use the same representation as last time. It can do this since it knows that strings defined in this way cannot be changed.
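One way to look at this from inside Python, assuming CPython 3 (the `__code__.co_consts` attribute is a CPython detail, and `two_literals` is a hypothetical function), is to inspect the constants the compiler stored for a function that repeats a literal. On the CPython versions I am aware of the repeated literal is stored once, but nothing in the language requires that, so the printed values are not promised here.

```python
def two_literals():
    a = "hello!"
    b = "hello!"
    return a is b

# Constants the compiler attached to the function's code object.
consts = two_literals.__code__.co_consts

# How many stored constants equal the literal, and whether the two names
# ended up bound to the same object. Both are implementation details.
print(consts.count("hello!"))
print(two_literals())
```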

unwind
1

Why is it strange? If the string is immutable it makes a lot of sense to only store it once. .NET has the same behavior.

Brian Rasmussen
  • 1
    How is string interning related to immutability? Many things in both Python and ".NET" are immutable without being interned. – bzlm Sep 08 '09 at 07:16
  • 2
    Because if it were possible for a string literal to change in memory, it couldn't be shared (or "interned"). – harto Sep 08 '09 at 07:19
  • True, but the fact that the object is immutable is what allows safe sharing of the reference to the instance. – Brian Rasmussen Sep 08 '09 at 07:21
0

I think if any two variables (not just strings) contain the same value, the value will be stored only once not twice and both the variables will point to the same location. This saves memory.

  • Not true! This applies only to strings and small integers. When you make a copy of a list or dictionary, for example, the copy has the same value (`==` equality) but is not the same object (`is` identity). That is why you can change the copy of the list while the original stays unchanged (or vice versa). A great explanation is provided in the Dynamic Typing chapter of Learning Python, published by O'Reilly – fanny Dec 15 '17 at 01:15
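The list case from the comment above can be checked directly. A minimal sketch with illustrative names; `list(...)` always builds a new list object, so the identity result here does not depend on any optimisation.

```python
original = [1, 2, 3]
duplicate = list(original)  # shallow copy: a new list with the same items

print(duplicate == original)  # True: same contents
print(duplicate is original)  # False: two distinct objects

# Mutating the copy leaves the original untouched.
duplicate.append(4)
print(original)   # [1, 2, 3]
print(duplicate)  # [1, 2, 3, 4]
```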