
I understand that the following behavior is not guaranteed:

>>> x = "Hello, world!"
>>> y = "Hello, world!"
>>> assert x is y

But, is this behavior guaranteed:

>>> x = "Hello, world!"
>>> y = str(x)
>>> assert x is y

If so, what is the correct way to implement this behavior in my own (immutable) classes?


EDIT: By "this behavior" I mean "a constructor for a class that should reuse an existing instance":

>>> x = MyClass("abc", 123)
>>> y = MyClass(x)
>>> assert x is y
Jace Browning

3 Answers


x is y checks whether the two references point to the same object, which is equivalent to id(x) == id(y). In CPython, str returns its argument unchanged if it is already a string, so from that perspective the behaviour you describe:

y = str(x)
assert x is y

will work for any x whose type is exactly str (type(x) is str). An instance of a str subclass is converted to a new plain str instead. From the documentation:

For strings, [str] returns the string itself.

I'm not sure I would consider this a guarantee so much as an implementation detail, though, and I would avoid writing code that relies on it.
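As a rough illustration of the point above, the following sketch compares a plain string with an instance of a str subclass. The class name Tagged is hypothetical and exists only for this example, and the identity result for the plain string is what CPython does today rather than something the language promises.

```python
class Tagged(str):
    """Hypothetical str subclass, used only for this illustration."""


plain = "Hello, world!"
tagged = Tagged("Hello, world!")

# CPython hands back the very same object for an exact str.
same_for_plain = str(plain) is plain

# A subclass instance cannot be returned as-is: the result is a plain str.
converted = str(tagged)
same_for_subclass = converted is tagged

print(same_for_plain)
print(same_for_subclass)
print(type(converted) is str, converted == tagged)
```

Code that needs "the same string" should compare with == rather than is, which holds on any implementation.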

There are numerous resources on SO and elsewhere on implementing immutable custom classes in Python, so I will not repeat that information here.

jonrsharpe
  • Thanks! Can you link to one? I don't need to know how to implement the **whole** immutable class, just the proper way to implement the constructor. Should the type check occur in `__new__`? – Jace Browning May 06 '14 at 14:22
  • @JaceBrowning there are several in the **Related** block to the right of the page --> – jonrsharpe May 06 '14 at 14:27

It looks like the PyObject_Str function is responsible for conversion to a str object. If so, here is what it does when it receives a str object as its argument (v):

    if (PyUnicode_CheckExact(v)) {
#ifndef Py_DEBUG
        if (PyUnicode_READY(v) < 0)
            return NULL;
#endif
        Py_INCREF(v);
        return v;
    }

It simply increments the reference count and returns the object unchanged, which is why the object in the following example stays the same:

>>> x = 'long string' * 1000
>>> str(x) is x
True

This is indeed an implementation detail, so it may vary across Python versions and implementations.
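To see how a given interpreter behaves rather than assume it, a small check along these lines can help. The helper name returns_same_object is made up for this sketch. On CPython, several immutable built-ins besides str appear to take the same shortcut, while mutable ones such as list always copy, but none of the identity results for immutable types are promised by the language.

```python
def returns_same_object(constructor, value):
    """Report whether constructor(value) hands back value itself."""
    return constructor(value) is value


# Pairs of (constructor, an instance of exactly that type).
samples = [
    (str, "long string" * 1000),
    (tuple, (1, 2, 3)),
    (frozenset, frozenset({1, 2})),
    (list, [1, 2, 3]),  # mutable: list() always builds a new list
]

results = {
    ctor.__name__: returns_same_object(ctor, value)
    for ctor, value in samples
}

for name, same in results.items():
    print(name, same)
```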

You may also find the answers to my question "Python's int function performance" interesting.

vaultah

Override __new__() to return the passed-in object:

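class C(object):
    def __new__(cls, ref):
        # If we are handed an existing C, reuse it instead of allocating.
        if isinstance(ref, C):
            return ref
        else:
            return super(C, cls).__new__(cls)

    def __init__(self, ref):
        # __init__ still runs when __new__ returns an existing instance,
        # so skip initialization in that case.
        if self is not ref:
            self.x = ref

c1 = C(7)
c2 = C(13)
assert c1 is not c2   # distinct values give distinct objects
c3 = C(c1)
assert c1 is c3       # passing an instance returns that same instance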
Robᵩ
  • In `__new__`, can the spots where `C` is used be replaced with `cls`? Or will that mess up inheritance? – Jace Browning May 07 '14 at 01:30
  • I think the `super()` call has to stay the way it is. I think the `isinstance()` call can go either way, resulting in slightly different semantics in derived classes. Specifically, you'll need to choose how you want `derived_obj = DERIVED(base_obj)` to behave, and adjust `isinstance()` accordingly. – Robᵩ May 07 '14 at 01:33
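One possible way to settle the choice discussed in these comments is sketched below. It is an assumption-laden variant, not the answer's own code. The names MyClass and Sub and the fields name and number are hypothetical, taken loosely from the question's example. An instance is reused only when its type is exactly the requested class, so Sub(base_obj) builds a new Sub from the base object's fields. All initialization happens in `__new__`, which avoids the problem of `__init__` running again on a reused instance.

```python
class MyClass:
    """Hypothetical immutable class; names and fields are illustrative."""

    __slots__ = ("name", "number")

    def __new__(cls, *args):
        if len(args) == 1 and isinstance(args[0], MyClass):
            source = args[0]
            if type(source) is cls:
                # Exactly the requested class: reuse the existing object.
                return source
            # Related class: copy the fields into a new instance of cls.
            args = (source.name, source.number)
        name, number = args
        self = super().__new__(cls)
        # Bypass our own __setattr__, which rejects all assignments.
        object.__setattr__(self, "name", name)
        object.__setattr__(self, "number", number)
        return self

    def __setattr__(self, key, value):
        raise AttributeError("instances are immutable")


class Sub(MyClass):
    __slots__ = ()


x = MyClass("abc", 123)
y = MyClass(x)   # same class: the existing instance comes back
s = Sub(x)       # derived class: a new Sub carrying x's fields

print(y is x)
print(type(s).__name__, s.name, s.number)
```

Replacing the `type(source) is cls` test with `isinstance(source, cls)` would instead let MyClass(sub_obj) return the Sub instance unchanged. Which of the two is right depends on how derived classes are meant to behave.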