
I can't create two strings that have the same value but different identities. I'm using PyCharm 2023.1 and I'm very new to Python. I've just learned about the difference between the `==` and `is` operators and I want to see it for myself, but no matter what I try, Python just interns my strings and `is` returns True!

I should say that I'm aware that this feature is there for optimization purposes and I shouldn't rely on it to always use is instead of ==. I really just want to see the difference for myself.

Here's my original code:

firststr = 'blah blah blah'
print(id(firststr))

secondstr = 'blah blah blah'
print(id(secondstr))

print(firststr is secondstr)

And here's the result I got:

2768763221232
2768763221232
True

First I googled the problem and found this: https://www.jetbrains.com/help/pycharm/variables-loading-policy.html I thought changing the variable loading policy might help. Since it was set to Asynchronous and creates objects one by one, I guessed it might be giving a new object the same id as a previously created object with the same value. I set it to Synchronous, but nothing changed!

I asked ChatGPT twice and got two different answers:

firststr = 'blah blah blah'
print(id(firststr))

secondstr = str('blah blah blah' + '')
print(id(secondstr))

print(firststr is secondstr)

and

firststr = 'blah blah blah'
print(id(firststr))

secondstr = 'blah blah blah'[:]
print(id(secondstr))


print(firststr is secondstr)

Neither of them worked and I still got the same results.

I found an old post on Stack Overflow that recommended using the `copy` module:

import copy

firststr = 'blah blah blah'
print(id(firststr))

secondstr = copy.copy(firststr)
print(id(secondstr))

print(firststr is secondstr)

and

import copy

firststr = 'blah blah blah'
print(id(firststr))

secondstr = copy.deepcopy(firststr)
print(id(secondstr))

print(firststr is secondstr)

But AGAIN, neither of them worked and I got the same results.

Does anyone know how I'm actually supposed to create two strings with the same value but different identities?

  • I don't think that it is possible because of the way python works. Why would you want to do that anyway? – ShadowCrafter_01 Jun 14 '23 at 07:08
  • It probably won't work with strings but with lists. `[0] is [0]` will be false although the lists are equal. – Michael Butscher Jun 14 '23 at 07:12
  • Don’t try identity experiments on `int` and `str` objects; small strings and integers may be subject to implementation-specific optimisations (see e.g. https://stackoverflow.com/questions/10622472/what-determines-which-strings-are-interned-and-when). `list`, `dict`, and `set` are better to learn with. – dROOOze Jun 14 '23 at 07:23
  • By the way: Publishing content generated by ChatGPT (not only answers) [isn't allowed currently](https://meta.stackoverflow.com/questions/421831/temporary-policy-chatgpt-is-banned) – Michael Butscher Jun 14 '23 at 07:27
  • Generally, string literals will be interned by the CPython interpreter if they contain alphanumeric characters and underscores only. A non-alphanumeric character (even whitespace) should prevent them being interned. So when I do `firststr = 'blah blah blah'` then `secondstr = 'blah blah blah'` then `print(firststr is secondstr)` I get False. In other words, I can't reproduce your original result. – slothrop Jun 14 '23 at 08:07
  • @slothrop you are running it in a REPL, so they are in separate code blocks. If you put it in a module and execute that module or a function (even one in the REPL) you'll see this behavior – juanpa.arrivillaga Jun 14 '23 at 08:17
  • @slothrop or in the REPL, force a single block: `x = "blah blah blah"; y = "blah blah blah"; print(x is y)` – juanpa.arrivillaga Jun 14 '23 at 08:19
  • @juanpa.arrivillaga aha, of course! – slothrop Jun 14 '23 at 08:33
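
The point made in the comments about code blocks can be sketched as follows. This is only an illustration: each `compile()` call stands in for one code block, the names `SOURCE_ONE`, `SOURCE_TWO`, `separate`, and `together` are made up for the demo, and the identity results are CPython implementation details that may differ between versions and builds.

```python
# Hypothetical demo: each compile() call stands in for one "code block".
SOURCE_ONE = "x = 'blah blah blah'"
SOURCE_TWO = "y = 'blah blah blah'"

# Case 1: two separate compilation units, like two separate REPL inputs.
separate = {}
exec(compile(SOURCE_ONE, "<first>", "exec"), separate)
exec(compile(SOURCE_TWO, "<second>", "exec"), separate)

# Case 2: one compilation unit, like a module or a single REPL line.
together = {}
exec(compile(SOURCE_ONE + "\n" + SOURCE_TWO, "<both>", "exec"), together)

# The values are equal in both cases.
print(separate["x"] == separate["y"])
print(together["x"] == together["y"])

# Identity is an implementation detail: CPython commonly reports different
# objects for the separate case and one shared object for the combined case.
print(separate["x"] is separate["y"])
print(together["x"] is together["y"])
```

Using `exec` on compiled source rather than `is` with a literal also avoids the `SyntaxWarning` shown in the answers.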

3 Answers


Does anyone know how I'm actually supposed to create two strings with the same value but different identities?

You aren't supposed to. Python reserves the right to optimize built-in types that it has deemed "immutable". The language doesn't give you any way that you are supposed to create two objects of an immutable type with the same value but different identities. The language assumes you wouldn't ever care about that distinction for these types.

You can rely on knowledge about implementation details to avoid certain places where you know it will get optimized - for example, CPython does constant folding for built-in literals, and if it encounters the same value, it just re-uses an object it has already created if it's in the same code block. It even optimizes expressions involving only literals, so you might at first think you'll be clever and construct a new string with the concatenation operator, but you will see:

>>> print("foo" + "bar" is "foobar")
<stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
True

Note that recent versions of Python (3.8 and later) emit a `SyntaxWarning` when `is` is used with a literal.

This happens for other built-in immutable types as well, even container types like tuples (of course, only for tuples that contain only other built-in immutable types):

>>> print((("foo",)+("bar",)) is ('foo', 'bar'))
<stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
True

What's actually happening is that Python is optimizing this in the bytecode to refer to a single object:

>>> import dis
>>> dis.dis('''print((("foo",)+("bar",)) is ('foo', 'bar'))''')
<dis>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
  1           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 (('foo', 'bar'))
              4 LOAD_CONST               0 (('foo', 'bar'))
              6 IS_OP                    0
              8 CALL_FUNCTION            1
             10 RETURN_VALUE
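
A similar check can be made without reading bytecode by looking at the constants stored on a compiled code object. The following is a minimal sketch, assuming CPython; the variable names `SOURCE`, `code`, and `namespace` are illustrative, and the exact contents of `co_consts` vary between versions.

```python
import dis

# Two assignments in ONE compilation unit: a literal-only expression and
# the equivalent plain literal.
SOURCE = "a = 'foo' + 'bar'\nb = 'foobar'"
code = compile(SOURCE, "<demo>", "exec")

# co_consts holds the constants the compiler kept for this code object.
print(code.co_consts)

# Disassemble to see which constant each assignment loads.
dis.dis(code)

namespace = {}
exec(code, namespace)
print(namespace["a"] == namespace["b"])  # equal values
print(namespace["a"] is namespace["b"])  # implementation detail, not guaranteed
```

If the compiler folded the concatenation, the disassembly shows both assignments loading a `'foobar'` constant rather than performing an addition at run time.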

However, if your expression doesn't consist only of literals, then, as one might expect, concatenation creates a new string at run time (since Python doesn't intern every string it creates):

>>> x = "foo"; y = "bar"; print(x+y is "foobar")
<stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
False

But the language makes no guarantees about any of this. You might still encounter cases where other implementation details unexpectedly prevent you from creating strings with the same value but different identities. For example, Python interns strings that are used as keys in instance dictionaries:

>>> class Foo:
...     def __init__(self):
...         self.foo = 0
...         self.bar = 0
...
>>> Foo()
<__main__.Foo object at 0x103207100>
>>> Foo().__dict__
{'foo': 0, 'bar': 0}
>>> next(iter(Foo().__dict__))
'foo'
>>> "foo" is next(iter(Foo().__dict__))
<stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
True

So to reiterate, Python gives you no way you are supposed to do this. And if you find a way, it is always an implementation detail that is free to change unannounced.
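
To see interning work in the opposite direction, here is a small sketch that uses the standard library function `sys.intern`, which this answer does not otherwise use. The variable names are illustrative, and the result of the plain `is` comparison is an implementation detail; only the last line relies on documented behavior.

```python
import sys

# Build two equal strings at run time so the compiler cannot merge them.
parts = ["blah", "blah", "blah"]
first = " ".join(parts)
second = " ".join(parts)

print(first == second)  # True: same value
print(first is second)  # implementation detail; typically False in CPython

# sys.intern returns the one canonical object stored for a given value,
# so interning two equal strings yields the same object.
print(sys.intern(first) is sys.intern(second))  # True
```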

>>> import sys; print(sys.version)
3.9.13 (main, May 24 2022, 21:28:44)
[Clang 13.0.0 (clang-1300.0.29.30)]
juanpa.arrivillaga

You can achieve this by converting the initial strings to lists:

First time:

string = 'blah blah blah'
lst = list(string)
print(lst)
print(id(lst))

Output:

['b', 'l', 'a', 'h', ' ', 'b', 'l', 'a', 'h', ' ', 'b', 'l', 'a', 'h']
139621090396352

Second time:

string = 'blah blah blah'
lst = list(string)
print(lst)
print(id(lst))

Output:

['b', 'l', 'a', 'h', ' ', 'b', 'l', 'a', 'h', ' ', 'b', 'l', 'a', 'h']
139621090349184

Why you need to do this is an entirely different question, because it looks like an anti-pattern use of `id()`.
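
The two runs above can also be combined into one script. This is a minimal sketch with illustrative names; it compares two lists built from the same string, so it shows distinct list objects rather than distinct string objects.

```python
text = "blah blah blah"

# Each list() call builds a new, independent list object.
first_list = list(text)
second_list = list(text)

print(first_list == second_list)  # True: same contents
print(first_list is second_list)  # False: two different objects

# Lists are mutable, so changing one must not affect the other.
first_list.append("!")
print(second_list[-1])  # h
```

Because lists are mutable, Python cannot share one object between the two names, which is why identity experiments are more predictable with lists than with strings.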

Geom

I am writing this on my phone so I can't check it, but you can try these two solutions:

First:

string1 = "blah blah"
string2 = string1[:]

Second:

string1 = "blah blah"
help_list = list(string1)
string2 = ''.join(help_list)
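
Since the two suggestions above were posted untested, here is a hedged sketch for checking them side by side. The names `sliced` and `joined` are illustrative, and both identity results are implementation details. In CPython a full slice of a string commonly returns the original object (consistent with the question's report that `[:]` did not help), while joining a list of characters commonly builds a new one.

```python
string1 = "blah blah"

# First suggestion: a full slice.
sliced = string1[:]

# Second suggestion: split into characters and join them back together.
joined = "".join(list(string1))

# Both results are equal to the original; whether they are the same
# object is up to the implementation.
print(sliced == string1, sliced is string1)
print(joined == string1, joined is string1)
```
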
KacperG