
Consider the following output from the Python interpreter:

>>> x=254
>>> y=254
>>> id(x)
2039624591696  --> same as that of y
>>> id(y)
2039624591696  --> same as that of x
>>> x=300
>>> y=300
>>> id(x)
2039667477936  --> different from that of y once the value exceeds 256
>>> id(y)
2039667477968  --> different from that of x
>>> str7='g'*4096
>>> id(str7)
2039639279632  ---> same as that of str8
>>> str8='g'*4096
>>> id(str8)
2039639279632 ---> same as that of str7
>>> str9='g'*4097
>>> id(str9)
2039639275392  --> same content as str10, but a different address
>>> str10='g'*4097
>>> id(str10)
2039639337008

Here, when I define str9 as 'g'*4097, it gets a different memory address than str10, so there seems to be some limit. My question is: how can I find out these limits for a particular Python release?

anuraag
  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. – MD. RAKIB HASAN Sep 05 '22 at 13:08
  • Hi @MD.RAKIBHASAN, I have updated the initial query, including an example of what I am trying to say. – anuraag Sep 07 '22 at 16:54

1 Answer


Which integers and strings get automatically interned in Python is implementation-specific and has changed between versions.

Here are some principles and limits that seem to hold at least for my current installation (CPython 3.10.7):

All integers in the range [-5, 256] are automatically interned:

>>> x = 256
>>> y = 256
>>> x is y
True
>>> x = 257
>>> y = 257
>>> x is y
False
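
To check the integer range on a particular interpreter rather than take it on trust, a small probe along these lines may help. It is only a sketch: the helper names `is_cached` and `small_int_cache_range` and the `search_limit` bound are made up for illustration, and it assumes the cached range is contiguous and includes 0.

```python
def is_cached(n):
    """Return True if two independently created ints equal to n are one object."""
    # Going through str() forces each int to be built at runtime, so the
    # compiler cannot merge the two values into a single shared constant.
    a = int(str(n))
    b = int(str(n))
    return a is b


def small_int_cache_range(search_limit=100_000):
    """Probe outward from 0 and return (lowest, highest) cached integer.

    search_limit is an arbitrary safety bound (an assumption of this sketch)
    for implementations where `is` on equal ints is always true.
    """
    if not is_cached(0):
        return None  # no observable integer cache on this interpreter
    high = 0
    while high < search_limit and is_cached(high + 1):
        high += 1
    low = 0
    while low > -search_limit and is_cached(low - 1):
        low -= 1
    return low, high


print(small_int_cache_range())
```

On the CPython 3.10.7 installation described here, the printed pair should agree with the range [-5, 256]; other implementations or versions may report something else, or hit the safety bound.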

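CPython (version >= 3.7) also automatically interns string constants if they are <= 4096 characters long and consist only of ASCII letters, digits, and underscores. (In CPython versions <= 3.6, the limit was 20 characters.) The length limit appears to apply to strings produced by a constant expression such as "A" * 4096, which the compiler evaluates ahead of time only when the result is short enough.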

>>> x = "foo"
>>> y = "foo"
>>> x is y
True
>>> x = "foo bar"
>>> y = "foo bar"
>>> x is y
False
>>> x = "A" * 4096
>>> y = "A" * 4096
>>> x is y
True
>>> x = "A" * 4097
>>> y = "A" * 4097
>>> x is y
False
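
One way to look for the string length limit on a given release is to ask the compiler directly. The sketch below assumes the limit comes from compile-time evaluation of constant expressions like "A" * n, so it compiles such an expression and checks whether the finished string shows up among the code object's constants. The names `folded` and `max_folded_length` and the `upper` search bound are hypothetical, chosen for this example.

```python
def folded(n):
    """Return True if the compiler evaluated "A" * n ahead of time."""
    code = compile(f'x = "A" * {n}', "<probe>", "exec")
    # If the multiplication was done at compile time, a string of length n
    # is stored in co_consts; otherwise only "A" and n are stored.
    return any(isinstance(c, str) and len(c) == n for c in code.co_consts)


def max_folded_length(upper=1 << 16):
    """Binary-search the largest n in [2, upper] for which folded(n) is true.

    Assumes folding is monotonic: once a length is too long to fold,
    every longer length is too. Returns None if even n = 2 is not folded.
    """
    if not folded(2):
        return None
    if folded(upper):
        return upper  # limit is at or beyond the search bound
    low, high = 2, upper  # invariant: folded(low) and not folded(high)
    while high - low > 1:
        mid = (low + high) // 2
        if folded(mid):
            low = mid
        else:
            high = mid
    return low


print(max_folded_length())
```

This only reports how long a folded constant can get. Whether that constant is then interned still depends on its characters, as the "foo bar" case shows.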

In some versions the rule was apparently to intern strings looking like valid identifiers (e.g., not strings starting with a digit), but that does not appear to be the rule in my installation:

>>> x = "5myvar"
>>> y = "5myvar"
>>> x is y
True
>>> 5myvar = 5
  File "<stdin>", line 1
    5myvar = 5
    ^
SyntaxError: invalid decimal literal

Additionally, strings are interned automatically only at compile time (literals and constant expressions), not when they are built at runtime:

>>> x = "bar"
>>> y = "".join(["b","a","r"])
>>> x
'bar'
>>> y
'bar'
>>> x is y
False

Relying on automatic string interning is risky, because it depends on the implementation, which may change. To ensure a string is interned, you can use the sys.intern() function:

>>> x = "a string which would not normally be interned!"
>>> y = "a string which would not normally be interned!"
>>> x is y
False
>>> import sys
>>> x = sys.intern(x)
>>> y = sys.intern(y)
>>> x is y
True