I've noticed that when you concatenate strings in Python, the id() of the variable sometimes does not change, which seems to mean that the str object itself was modified.

Let me explain:

>>> a = 'a'
>>> id(a)
2844964047152
>>> a+='b'
>>> id(a)
2844971374576
>>> a+='c'
>>> id(a)
2844971374576

At first a is (...152), and concatenating 'b' creates a new 'ab' object (...576), which a then references. This is the expected behavior.

But then, concatenating 'c' changes the same object ("...576") to 'abc'?
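
One way to probe this is to keep a second name bound to the 'ab' object before concatenating 'c'. The sketch below is only an illustration (the function name `concat_with_alias` and the variable `alias` are made up for it, not part of the original session). If the string really were mutated in a visible way, `alias` would also become 'abc'; it does not.

```python
def concat_with_alias():
    """Repeat the experiment while holding a second reference to 'ab'."""
    a = 'a'
    a += 'b'
    alias = a        # hypothetical second name bound to the same 'ab' object
    a += 'c'         # 'ab' is still referenced by alias, so it cannot be reused
    return a, alias, a is alias


result = concat_with_alias()
print(result)
```

With the extra reference in place, `a` and `alias` end up as two distinct objects, so any in-place reuse can only happen when nothing else can observe the old value.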

I'd much appreciate it if someone could explain this.

Edit:
This question was marked as a duplicate of "Aren't Python strings immutable? Then why does a + " " + b work?", although it asks something completely different.

Charles Duffy
itay zohar
  • It's a new object, it just happens to be stored at the same place as the other one. `id`s are only guaranteed to be different for all different objects at a given time, not over time. – Thierry Lathuille Nov 27 '21 at 16:52
  • `id()` doesn’t have to change, and it doesn’t have to not change. TBH I’ve never seen any reason to concern myself with `id()` and its behaviour: it’s completely implementation-specific. – DisappointedByUnaccountableMod Nov 27 '21 at 16:53
  • @ThierryLathuille: It's not actually a new object (depending on how you define "new"); CPython has a special optimization that allows a `str` with no other references to be expanded in place (with `realloc`, which often avoids the need to construct a truly new object) since it doesn't violate *apparent* immutability. The first `+=` doesn't do it because `'a'` is (as an implementation detail) a cached singleton, but the `'ab'` is a unique reference, and `+= 'c'` operates in place. – ShadowRanger Nov 27 '21 at 17:04
  • The ID refers to the memory address. I'm not sure about the inner workings of Python and memory allocation, but I believe it must have something to do with the fact that it can fit that much information at the same memory address, therefore only changing the object, but not its location in memory. – Luís Henrique Martins Nov 27 '21 at 17:06
  • The original duplicate was wrong, but this is a proper duplicate of [Does Python += operator make string mutable?](https://stackoverflow.com/q/48173141/364696) (which was also improperly duped to [Aren't Python strings immutable? Then why does a + " " + b work?](https://stackoverflow.com/q/9097994/364696)). – ShadowRanger Nov 27 '21 at 17:06
  • IDs have to be unique for objects that currently exist. They can be reused. In your code, the original `'a'` and `'ab'` string objects don't exist any more, since they have no names and are collected. So `'abc'` *may*, but *need not*, have the same ID that `'ab'` had. – Karl Knechtel Nov 27 '21 at 17:08
  • @KarlKnechtel: Technically, under normal circumstances, it would be impossible for this to happen; `a += b` is defined to be equivalent to `a = a.__iadd__(b)` (which becomes `a = a.__add__(b)` for immutable types like `str`), which means the new object constructed by the `__add__` exists just *before* `a` is rebound. CPython has an optimization that recognizes the specific case of `a += b` or `a = a + b` for all `str` operands, and "cheats" by clearing the destination *before* concatenating; concatenation then detects the otherwise unreferenced object and mutates it. – ShadowRanger Nov 27 '21 at 17:13
  • I'm a little surprised that this is actually faster on average, given that the text buffer needs to be reallocated anyway, and given the necessary reference-count book-keeping. Interesting, though. – Karl Knechtel Nov 27 '21 at 17:16
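
Two of the points raised in the comments can be checked directly. The following is a minimal sketch, not taken from the thread: the helper names `ids_of_live_objects` and `id_reused_after_free` are made up for illustration. It shows that `str` has no `__iadd__` (so `+=` falls back to `__add__`), that objects alive at the same time always have distinct ids, and that an id *may* be reused once an object is gone. The last result depends on the implementation and allocator, so no particular value should be relied on.

```python
# str defines no __iadd__, so `a += b` falls back to `a = a.__add__(b)`.
# list does define it, which is why `+=` mutates a list in place.
str_has_iadd = hasattr(str, '__iadd__')
list_has_iadd = hasattr(list, '__iadd__')


def ids_of_live_objects(n=1000):
    """Objects that exist at the same time must all have distinct ids."""
    objs = [object() for _ in range(n)]   # keep every object alive
    return len({id(o) for o in objs}) == n


def id_reused_after_free():
    """May return True or False: the id of a freed object can be reused."""
    first = id(object())    # the temporary object is discarded after id() returns
    second = id(object())   # a new object, possibly at the same address
    return first == second


print(str_has_iadd, list_has_iadd)
print(ids_of_live_objects())
print(id_reused_after_free())  # implementation-dependent
```

Comparing ids is therefore only meaningful for objects that are alive at the same moment; for anything else, `is` on two live references is the reliable identity test.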
