I was trying to figure out why the following 'x' and 'y' are different.

>>> x = 'a' 
>>> x += 'bc'
>>> x
'abc'
>>> y = 'abc'
>>> x is y
False
>>>

>>> id(x)
4537718624
>>> id(y)
4537059288
>>>

Why are the ids different? I am not asking about the 'is' operator. I am trying to figure out why the new object created by concatenation differs from 'y'.

Peter Mortensen
James Sapam
  • @sundarnatarajСундар its different, here i am trying to figure out why the id's are different. – James Sapam Sep 10 '14 at 04:46
  • https://www.codementor.io/python-tutorial/stack-overflow-martijn-pieters-python-optimization check this link also – sundar nataraj Sep 10 '14 at 04:48
  • The concatenation creates a *new/different* string object, albeit one with the same value – user2864740 Sep 10 '14 at 04:50
  • i am not looking about 'is' operator, i am trying to figure out why the new object created after concatenation is differ from 'y'. – James Sapam Sep 10 '14 at 04:52
  • @sapam Because it is a *different* string. The only way that the *same* object could result from the concatenation over arbitrary strings is if Python ([see `intern` here](http://stackoverflow.com/a/1504870/2864740)) performed a lookup-by-equality into cached objects (eg. the interned string pool). It does *not* do this interning in general because such an operation adds overhead (the lookup and keeping the previous values strongly referenced). If you *need* identity then perform a manual lookup into a maintained pool (although `intern` is on rare occasion appropriate), else use equality. – user2864740 Sep 10 '14 at 04:54
  • I am unable to post answer since the question is closed. I understood your query and this is the answer. `x = "123" y = "123"` In this case, both `x` and `y` will have the same `id` and hence `x is y` will return `True` . This is because python string is immutable and hence x and y points to the same object. However, if you modify the string using concatenation as you have done, you have created a different string object. Now `x` and `y` are no more identical and hence `x is y` will return `False` – cppcoder Sep 10 '14 at 05:04
  • I'm not sure this is a dup. I think the OP is asking why Python doesn't intern equal strings so they're identical, not why non-identical strings don't match with `is`. @sapam: If I'm right, please edit the question to clarify that, in which case we can reopen the question, and user2864740 or cppcoder can write an answer that explains better than a comment can. – abarnert Sep 10 '14 at 05:05
  • @abarnert - I don't see how this isn't a dup. The answer is the same as for the linked question: they're different objects. As for why, it'll be because the interpreter has access to the set of interned strings, but the concat magic method doesn't. – sapi Sep 10 '14 at 05:17
  • @cppcoder: I think the flaw in your reasoning here is that you don't *modify* the string- strings are immutable, so even though they pretend to allow modification through `+=`, you're always getting a new, separate string. – Marius Sep 10 '14 at 05:20
  • @sapi: Nonsense. `x = 0; x += 1; y = 1; x is y` will give you `True`. The `int.__add__` method doesn't have access to the interned `int` table any more than `str.__add__` has access to the interned `str` table. It's just that CPython chooses to auto-intern some things (small ints, the stock singleton constants) and not others (larger ints, strings). – abarnert Sep 10 '14 at 05:24
  • @Marius If you read my statement, I have mentioned that. `if you modify the string using concatenation as you have done, you have created a different string object` – cppcoder Sep 10 '14 at 05:34
  • Yes, this question may be a duplicate, but one thing is for sure: the comments on this question are much more detailed than the answer to the original question. So next time, please wait for some time before you mark a question as a duplicate. And thank you so much to zxq9, abarnert, cppcoder, sapi, Marius, user2864740 for your comments. – James Sapam Sep 10 '14 at 17:36

1 Answer

is tests identity: whether two names refer to the very same object. == tests equality: whether two objects (or one and the same object) have the same value.

If I change the value of x, the value of y does not change, because they are not the same object, though they have the same value. If, on the other hand, I do

x = [1, 2, 3]
y = x

and then change something about either, I will be changing the underlying object that x and y are pointing to. These are labels (references) to underlying objects, not the objects themselves, and identity is not the same thing as value.
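As a minimal sketch of the aliasing described above (the extra name z and the copy via list() are additions for contrast, not part of the original example):

```python
x = [1, 2, 3]
y = x            # y is a second label for the same list object
z = list(x)      # z is a new list that merely starts with equal contents

y.append(4)      # mutate the shared object through the label y

print(x)         # [1, 2, 3, 4]  (the change is visible through x too)
print(x is y)    # True          (same object)
print(z)         # [1, 2, 3]     (the copy is unaffected)
print(x is z)    # False         (different objects)
```

The copy z shows the other half of the point: equal contents at creation time do not make two objects the same object.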

Edit:

Imagine we make a class called Person:

class Person(object):
    def __init__(self, name):
        # self is the instance being initialised; name is stored on it
        self.name = name

There is more than one "Joe Smith" in the world. But they are not the same person. This is true in Python as well:

joe1 = Person("Joe Smith")
joe2 = Person("Joe Smith")

Their identities are different, because they are different objects, despite carrying the same name value. We could define a comparison operator (__eq__) on them that checks whether the name values are equal, so that joe1 == joe2 would be True, but joe1 is joe2 would still be False.
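A rough sketch of such a comparison operator, assuming equality should depend only on the name attribute (the __eq__ and __hash__ methods here are illustrative additions to the Person class, not something the original defines):

```python
class Person(object):
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Two Person objects compare equal when their names match.
        if not isinstance(other, Person):
            return NotImplemented
        return self.name == other.name

    def __hash__(self):
        # Keep hashing consistent with the equality defined above.
        return hash(self.name)


joe1 = Person("Joe Smith")
joe2 = Person("Joe Smith")

print(joe1 == joe2)   # True   (same value)
print(joe1 is joe2)   # False  (still two separate objects)
```

Defining __eq__ changes what == reports, but nothing a class defines can change the result of is.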

This feature of Python is useful when you need to know if changing the state of an object is going to have consequences elsewhere. For example, if I pass a dictionary into a function and change something about that dictionary, it has changed everywhere. This is particularly important because Python passes function arguments as references to objects and never copies them implicitly, so a mutation made inside a function is visible to the caller, and this can result in awkward-to-track-down bugs (especially if you're new to Python):

>>> foo = {'bar': 'baz'}
>>> def changeit(z):
...     z['spam'] = 'eggs'
... 
>>> changeit(foo)
>>> foo
{'bar': 'baz', 'spam': 'eggs'}
>>> def changeit2(z):
...     if z is foo:
...         return "We don't want to mess with this, it affects global state."
...     else:
...         z['cro'] = 'magnon'
... 
>>> changeit2(foo)
"We don't want to mess with this, it affects global state."
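To make the difference between mutating an argument and rebinding it more concrete, here is a small sketch (the function names mutate and rebind are hypothetical, chosen only for illustration):

```python
def mutate(d):
    # Changes the object itself; the caller sees this.
    d['spam'] = 'eggs'


def rebind(d):
    # Only points the local name d at a new dict; the caller's object is untouched.
    d = {'cro': 'magnon'}
    return d


foo = {'bar': 'baz'}
mutate(foo)
result = rebind(foo)

print(foo)             # {'bar': 'baz', 'spam': 'eggs'}
print(result)          # {'cro': 'magnon'}
print(result is foo)   # False
```

In both calls the function receives a reference to the same dict object; what differs is whether the function changes that object or just reassigns its own local name.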
zxq9
  • I think I understand 'is', but why are the ids different? – James Sapam Sep 10 '14 at 04:53
  • Because you created two independent objects when you assigned x to 'abc' and y to 'abc'. You did not assign x to y or vice versa. They have no connection to one another. The fact that they carry the same value is coincidence as far as the system is concerned. – zxq9 Sep 10 '14 at 04:56
  • I expanded on this a bit with the reason for this feature and gave a (contrived) example of why this can be important in Python. – zxq9 Sep 10 '14 at 05:06
  • @zxq9 You are wrong. Even if you assign `x = 'abc'` and `y = 'abc'` Still they have the same id and they are identical. Because string is immutable and they both point to the same place. But once the string is modified, it no longer points to the same object. – cppcoder Sep 10 '14 at 05:07
  • If I understand the OP, what he really needs to understand is that Python doesn't automatically intern string values the way some languages do (and the way Python itself does with small integers), and maybe _why_ it doesn't. – abarnert Sep 10 '14 at 05:08
  • @cppcoder: If you assign `x = 'abc'` and `y = 'abc'` in the same module, and compile the module in the normal way, then with some Python implementations they will be identical. This is because, e.g., the CPython compiler does constant folding, not because Python does any kind of runtime interning. – abarnert Sep 10 '14 at 05:09
  • @cppcoder >>> x = 'a' >>> x += 'bc' >>> y = 'abc' >>> id(x) 140480966908944 >>> id(y) 140480967534912. BUT, you're right that x = 'abc' and y = 'abc' result in the same id. – zxq9 Sep 10 '14 at 05:10
  • @zxq9 I meant this. `>>> x = "123" >>> y = "123" >>> x == y True >>> x is y True >>> id(x) 50986112 >>> id(y) 50986112` – cppcoder Sep 10 '14 at 05:11
  • @abarnert might be able to expand on this a bit? – zxq9 Sep 10 '14 at 05:12
  • @abarnert In C++ if you declare a const string literal, it will be stored in read only memory and how many ever variables you declare with the same string value, all will point to the same memory location. Similarly, in python string is immutable and the same theory applies. Is that right? – cppcoder Sep 10 '14 at 05:17
  • @cppcoder: Yes, all strings are immutable, so the compiler can safely fold compile-time-equal constants together, and it does, and the runtime could safely intern strings too, it just doesn't. I think it might be worth writing an answer rather than comments at this point… – abarnert Sep 10 '14 at 05:18
  • @abarnert But alas, this is closed. Here is a place you can answer fully: http://stackoverflow.com/questions/25757850/how-does-cpython-handle-string-literals-in-memory Please do. It is an important point of misunderstanding by a lot of people! – zxq9 Sep 10 '14 at 05:23
  • @sapam The reason the ids are different is that we've *generated* a new value that happens to be the same instead of assigned one directly. Reference the comments above by cppcoder and abarnert about how this works in memory. I imagine it would be more expensive to check every string const in memory for *every* mutation operation as opposed to checking it once per fresh assignment. – zxq9 Sep 10 '14 at 05:26
  • @zxq9: Since this wasn't explained on [the question this was originally closed as a dup of](http://stackoverflow.com/questions/13650293/understanding-python-is-operator/25758019#25758019), I added an answer there. But in the intervening time, this one has been reopened and reclosed as a dup of [a different question](http://stackoverflow.com/questions/15541404/python-string-interning), that I think does already explain it pretty well, so maybe that was a waste of time… – abarnert Sep 10 '14 at 05:47
  • @zxq9: Be careful about your words; there are no such things as mutation operations on strings. `a += b` for immutable objects is identical to `a = a + b`: creation of a new object, then assignment. Also, "assignment" is a bit misleading if you're thinking in, say, C++ terms; all that `a =` does is make `a` in the namespace's dictionary refer to the new object instead of the old one (and, in refcounted implementations, twiddle the refcounts); it doesn't check anything or copy anything. – abarnert Sep 10 '14 at 05:49
  • @abarnert Yes, you are correct. I should be more careful with the terms I use, as they have rather specific (and slightly different!) meanings in different languages. – zxq9 Sep 10 '14 at 06:09
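The interning point raised in the comments can be sketched as follows. This assumes Python 3, where the function lives at `sys.intern` (in Python 2 it was the builtin `intern`). Whether `x is y` is False before interning is a CPython implementation detail, so the sketch does not rely on it:

```python
import sys

x = 'a'
x += 'bc'      # built at runtime: a new string object, not looked up in any pool
y = 'abc'      # a literal: compiled into the code object as a constant

print(x == y)  # True (equal values, regardless of identity)

# sys.intern returns the one canonical object for a given string value,
# so equal strings passed through it come back as the same object.
x_interned = sys.intern(x)
y_interned = sys.intern(y)

print(x_interned is y_interned)   # True
```

This is the manual lookup-by-equality that concatenation does not perform on its own; for ordinary comparisons, == remains the right tool.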