I was confused today by a string comparison: it seems Python reuses string objects (which is a sensible thing to do, since they are immutable). To check this, I did the following:
>>> a = 'xxx'
>>> b = 'xxx'
>>> a == b
True
>>> a is b
True
>>> id(a)
140141339783816
>>> id(b)
140141339783816
>>> c = 'x' * 3
>>> id(c)
140141339783816
>>> d = ''.join(['x', 'x', 'x'])
>>> id(d)
140141339704576
This is a bit surprising: `a`, `b` and `c` share the same id, but `d` does not. Some questions:
- Does Python check the whole content of its string table when defining new strings?
- Is there a limit on the size of the strings that get reused?
- How does this mechanism work? Does it compare the hashes of the strings?
- It does not seem to be used for all kinds of generated strings, though. What is the rule here?
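
For anyone who wants to reproduce this outside the REPL, here is a minimal sketch of the same experiment as a script. Two things in it are my own additions: `build_at_runtime` is a hypothetical helper name, and the use of `sys.intern` reflects my assumption that interning is the mechanism involved. The result of the `is` comparison on the un-interned strings is presumably an implementation detail, so I print it without claiming what it should be.

```python
import sys


def build_at_runtime(char, count):
    # Hypothetical helper: str.join assembles the string while the program
    # runs, so the result is not a literal written in the source code.
    return ''.join([char] * count)


a = 'xxx'                      # string literal
d = build_at_runtime('x', 3)   # same contents, built at run time

print(a == d)   # compares contents
print(a is d)   # compares identity; the result may depend on the interpreter

# sys.intern returns the canonical object for a given string value, so two
# equal strings map to the same object once both have been interned.
a_interned = sys.intern(a)
d_interned = sys.intern(d)
print(a_interned is d_interned)
```

If interning really is what is going on, the last comparison shows how to force it explicitly for strings built at run time, while `==` remains the right way to compare contents.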