Python: Literals: Evaluation of Literals

Question

From the Python documentation:

6.2.2. Literals
...
Multiple evaluations of literals with the same value (either the same occurrence in the program text or a different occurrence) may obtain the same object or a different object with the same value.

I am trying to understand the passage above but find it difficult. I need help clarifying a few things:

>>> " Got any rivers you think are uncrossable ?" #1
' Got any rivers you think are uncrossable ?' #2
>>> " Got any rivers you think are uncrossable ?" #3
' Got any rivers you think are uncrossable ?' #4

#1 and #3 are two literals with the same value, and they produce the objects shown at #2 and #4 respectively. The value of these objects is the same. How could their evaluation obtain different objects with the same value? What does occurrence have to do with the result of evaluation here?

either the same occurrence in the program text or a different occurrence

How do I distinguish between the same occurrence and a different occurrence? And what does the place of occurrence have to do with the result of evaluation?

You can imagine WHY this might happen. It makes sense to consolidate small strings, but with a large string, it might be cheaper to create a new object than to scan through the set of existing string literals looking for a match. As I said, it would be irrelevant to your code. — Tim Roberts, Oct 21 '21 at 20:24
Does this answer your question? [Python: Why does ("hello" is "hello") evaluate as True?](https://stackoverflow.com/questions/1392433/python-why-does-hello-is-hello-evaluate-as-true) — General4077, Oct 21 '21 at 20:33
See [Under which circumstances do equal strings share the same reference?](https://stackoverflow.com/q/11611750/3890632) — khelwood, Oct 21 '21 at 20:40

score 2 · Answer 1 · edited Oct 21 '21 at 20:32

Both are string objects and may have different ids even though the value is the same. For example, in an interactive session:

>>> id("My string")
140140708077104
>>> id("My string")
140140708085936
>>> my_first_string = "Repeated value"
>>> my_second_string = "Repeated value"
>>> my_first_string == my_second_string
True
>>> id(my_first_string) == id(my_second_string)
False
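The transcript above comes from an interactive session, where each line is compiled on its own. As a rough sketch, the snippet below compares the same two assignments compiled together as one unit with the same literal compiled in two separate units. The source strings and the names `same_unit` and `separate_units` are illustrative, and the identity results are CPython implementation details rather than guarantees, so they are only printed.

```python
# Two assignments of the same literal, compiled together as one unit
# (roughly what happens when they appear in the same script).
source = 'x = "Repeated value"\ny = "Repeated value"\n'
namespace = {}
exec(compile(source, "<one-unit>", "exec"), namespace)
same_unit = namespace["x"] is namespace["y"]

# The same literal compiled in two separate units
# (roughly what happens line by line in an interactive session).
ns_a, ns_b = {}, {}
exec(compile('x = "Repeated value"', "<unit-a>", "exec"), ns_a)
exec(compile('x = "Repeated value"', "<unit-b>", "exec"), ns_b)
separate_units = ns_a["x"] is ns_b["x"]

# Equality always holds; identity is up to the implementation.
print("equal values:          ", namespace["x"] == namespace["y"])
print("same object (one unit):", same_unit)
print("same object (separate):", separate_units)
```

Only the equality check is something a program can rely on; the two identity lines may differ between Python versions and implementations.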

edited Oct 21 '21 at 20:32

khelwood

answered Oct 21 '21 at 20:26

Fausto Alonso

That's in an interactive session. Try it inside a script and see if you get the same result. – khelwood Oct 21 '21 at 20:26

This is called string pooling or string interning. See [this](https://stackoverflow.com/a/1392927/9224678) question/answer for some good discussion – General4077 Oct 21 '21 at 20:34

Totally right!! Thanks for the link with the explanation! – Fausto Alonso Oct 21 '21 at 20:39

score 1 · Answer 2 · Joshua Voskamp · 2021-10-21T20:40:24.490

Compare these two:

>>> a = 'a'
>>> b = 'a'
>>> id(a)
1799740566768
>>> id(b)
1799740566768

vs

>>> a = 'a b'
>>> b = 'a b'
>>> id(a)
1799740710960
>>> id(b)
1799751802672

Internally, in the first example, a and b both refer to the same object, while in the second, they refer to different objects.

This can have some unexpected results: in the first case, a is b -> True but in the second, a is b -> False.

Why does it matter?

As remarked in a comment, you likely shouldn't need to care. In the case of immutable literals, it should make no difference functionally whether your code refers to the same object, or different objects, when accessing 'identical' literals. But, in the case of mutable objects, it is definitely important to be aware of possible side-effects.
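A small sketch of why identity matters for mutable objects: two names bound to the same list see each other's changes, while two separately created lists with equal contents do not. The variable names are illustrative only.

```python
# Two names bound to the same list object: a change is visible through both.
shared = [1, 2, 3]
alias = shared
alias.append(4)
print(shared)            # [1, 2, 3, 4]
print(alias is shared)   # True

# Two separately created lists with equal contents: distinct objects.
first = [1, 2, 3]
second = [1, 2, 3]
second.append(4)
print(first)             # [1, 2, 3]
print(first is second)   # False
```

Unlike string or number literals, a list display such as [1, 2, 3] creates a new object every time it is evaluated, so the second half of the example does not depend on the implementation.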

For more info, see e.g. "Under which circumstances do equal strings share the same reference?" and "What does sys.intern() do and when should it be used?"
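As a minimal sketch of what sys.intern() does, the snippet below builds two equal strings at run time and then interns both. The variable names are illustrative. Whether the two original strings are the same object is an implementation detail, so that result is printed without a stated expectation.

```python
import sys

# Build two equal strings at run time so they are not compile-time literals.
a = "".join(["a b"] * 10)
b = "".join(["a b"] * 10)

print(a == b)   # True: same value
print(a is b)   # implementation-dependent, so no expected value is given

# sys.intern() returns the canonical object for a given string value,
# so interning two equal strings yields one and the same object.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
print(a_interned is b_interned)   # True
```

Interning is mainly useful when many equal strings are compared or used as dictionary keys; for ordinary code, comparing with == is enough.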

Edit: to test whether you get the same object or different objects, I used the script below:

if __name__=="__main__":
    for i in range(100):
        a = ''.join(['a']*i)
        b = ''.join(['a']*i)
        print(f'a: {a}\nb: {b}\na is b: {a is b}')

gives me the following results, running Python 3.9.7 in Windows PowerShell:

a:
b:
a is b: True
a: a
b: a
a is b: True
a: aa
b: aa
a is b: False
a: aaa
b: aaa
a is b: False
a: aaaa
b: aaaa
a is b: False
a: aaaaa
b: aaaaa
a is b: False
a: aaaaaa
b: aaaaaa
a is b: False
... etc.

edited Oct 21 '21 at 20:40

answered Oct 21 '21 at 20:29

Joshua Voskamp

Though the strings are separate objects inside an interactive session, that is not the typical behaviour when two identical string literals occur inside the same script. – khelwood Oct 21 '21 at 20:33
@khelwood source? – Joshua Voskamp Oct 21 '21 at 20:34
"if the same string literal occurs twice in the source code, they will end up pointing to the same string object" in [the answer](https://stackoverflow.com/a/11611774/3890632) to the first question you linked. Also, just try it. – khelwood Oct 21 '21 at 20:35
To follow up on the claim, in trying for myself, I'm not able to replicate 'same string literal pointing to same string object' using the test script in the question edit. – Joshua Voskamp Oct 21 '21 at 20:41
Your test is not testing the same thing: they're not string literals. – khelwood Oct 21 '21 at 20:42
In the first question I linked, I see "The details of when strings are cached and reused are implementation-dependent, can change from Python version to Python version and cannot be relied upon." -- i.e. results may vary, no? – Joshua Voskamp Oct 21 '21 at 20:42
Hence my saying "the *typical* behaviour". – khelwood Oct 21 '21 at 20:44
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/238414/discussion-between-joshua-voskamp-and-khelwood). – Joshua Voskamp Oct 21 '21 at 20:44

score 1 · Answer 3 · answered Oct 21 '21 at 20:49

The passage is referring to object identity vs. object value. The keyword is tests whether two objects are the same object, whereas the operator == tests whether they have equal values.

Two object instances can have the same value but be different objects.

Here is an example in CPython, which caches small integers (-5 through 256) as an internal optimization. This is not guaranteed, so your result could be different:

>>> a = 999
>>> b = 999
>>> a == b       # equal
True
>>> a is b       # not the same object!
False
>>> a = 5
>>> b = 5
>>> a == b       # also equal
True
>>> a is b       # and happen to be the same object
True

This demonstrates the phrase "may obtain the same object or a different object with the same value."
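To make the "same occurrence" versus "different occurrence" wording concrete, here is a rough sketch. The function names `greeting` and `other_greeting` are hypothetical. Calling one function twice evaluates the same occurrence of a literal twice, while the second function contains a different occurrence of a literal with the same value. The identity results are printed without expected values because the language leaves them to the implementation.

```python
def greeting():
    # One occurrence of the literal in the program text;
    # it is evaluated each time the function is called.
    return "Got any rivers you think are uncrossable?"


def other_greeting():
    # A different occurrence of a literal with the same value.
    return "Got any rivers you think are uncrossable?"


first = greeting()        # same occurrence, first evaluation
second = greeting()       # same occurrence, second evaluation
third = other_greeting()  # different occurrence

# Equality of value is guaranteed in every case.
print(first == second, first == third)   # True True

# Identity is not guaranteed either way.
print("same occurrence, same object?     ", first is second)
print("different occurrence, same object?", first is third)
```

Code that only relies on the == comparisons behaves the same on any implementation; code that relies on the is results does not.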
