Well, there is a reason why modifying a string isn't going to modify the second one.
Strings in python are immutable.
It's not exactly that strings are cached in Python; the point is that you can't change them. Because of that, the interpreter is free to optimize somewhat and bind two names to the same object (the same id).
In python, you're never actually editing a string directly. Look at this:
a = "fun"
a.capitalize()
print(a)
>> fun
The capitalize method creates a capitalized version of a but won't change a. Another example is str.replace. As you probably already noticed, to change a string using replace, you'll have to do something like this:
a = "fun"
a = a.replace("u", "a")
print(a)
>> fan
What you see here is that the name a is first bound to the string "fun". On the second line, we're binding a to a new object with a new id, and the old string may be reclaimed by the garbage collector once nothing else references it.
What you have to understand is that since strings are immutable, Python can safely let several names point to the same object, because the string will never be modified. A string can never be changed implicitly behind your back.
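To make this concrete, here is a minimal sketch (Python 3 syntax; the variable names are only for illustration) showing that two names can share one string object, that "changing" one of them only rebinds that name, and that in-place modification is rejected:

```python
a = "fun"
b = a                      # both names refer to the same string object
print(a is b)              # True

# replace() builds a new string; assigning it rebinds a and leaves b alone
a = a.replace("u", "a")
print(a)                   # fan
print(b)                   # fun

# Item assignment on a str is not allowed at all
mutation_failed = False
try:
    b[0] = "F"
except TypeError:
    mutation_failed = True
print(mutation_failed)     # True
```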
Also, you'll see that some other types, like numbers, are also immutable and show the same behaviour with ids. But don't be fooled by ids, because whether two equal values share an id is an implementation detail.
CPython caches small integers (-5 through 256), so any number bigger than 256 can receive different ids even though the values are equal. And if I'm not mistaken, with bigger strings the ids can be different too.
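Since identity of equal values can't be relied on, comparisons should use == rather than is or id(). The sketch below (Python 3; in Python 2 the interning function is the built-in intern rather than sys.intern) builds the values at run time so the compiler can't share them as constants. The is result for the integers is deliberately left without an expected value because it depends on the interpreter:

```python
import sys

# Built at run time to sidestep compile-time constant sharing
x = int("1000")
y = int("1000")
print(x == y)    # True: the values are equal
print(x is y)    # implementation-dependent, do not rely on it

s = "".join(["hello", " ", "world"])
t = "".join(["hello", " ", "world"])
print(s == t)    # True

# sys.intern returns one canonical object for equal strings
print(sys.intern(s) is sys.intern(t))    # True
```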
Note:
The ids might also differ depending on whether the code is evaluated in a REPL or run as a program. As I remember, constants are optimized per code block, which means that executing the statements on separate lines in the REPL might be enough to prevent the optimization.
Here's an example in the REPL:
>>> a = '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]'; b = '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]'
>>> id(a), id(b)
(4561897488, 4561897488)
>>> a = '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]'
>>> b = '[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]'
>>> id(a), id(b)
(4561897416, 4561897632)
With numbers:
>>> a = 100000
>>> b = 100000
>>> id(a), id(b)
(140533800516256, 140533800516304)
>>> a = 100000; b = 100000
>>> id(a), id(b)
(140533800516232, 140533800516232)
But executing the same code as a Python script will print identical ids within each pair, because it executes the lines in the same code block (as far as I understand):
4406456232 4406456232
4406456232 4406456232
140219722644160 140219722644160
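The "same code block" effect can be approximated without a separate file by compiling both assignments together, which is roughly what happens to a script. This is only a sketch: the source string and the "<example>" filename are made up for illustration, and the identity result is a CPython behaviour rather than a language guarantee, so it is not stated as a fixed output.

```python
source = "a = 100000\nb = 100000\n"

# Compile both assignments as a single code block, as for a script file
code = compile(source, "<example>", "exec")
namespace = {}
exec(code, namespace)

print(namespace["a"] == namespace["b"])   # True
print(namespace["a"] is namespace["b"])   # usually True on CPython, not guaranteed
```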