In PEP 616, the specification for removeprefix() includes this code block:

def removeprefix(self: str, prefix: str, /) -> str:
    if self.startswith(prefix):
        return self[len(prefix):]
    else:
        return self[:]

Why does the last line say return self[:], instead of just return self?

Karl Knechtel
Philip Massey
  • It is a common, but outdated idiom for copying a sequence. The string class lacks a copy method, since it isn't ordinarily necessary to copy a string; and this [shouldn't work anyway](https://stackoverflow.com/questions/24804453/how-can-i-copy-a-python-string). Or is the question "why does the specification say to copy the string?"? – Karl Knechtel Dec 24 '22 at 23:45
  • Probably just for symmetry with the `if` clause, but please understand, that is not the actual implementation. – juanpa.arrivillaga Dec 24 '22 at 23:46
  • ... in order to return a *new* string, `self` does not change. – Maurice Meyer Dec 24 '22 at 23:46
  • No, that's certainly it, which makes sense! It's a simpler idiom to copy the string as the return value. Thanks! – Philip Massey Dec 24 '22 at 23:46
  • @MauriceMeyer but it *doesn't actually return a new string*. – juanpa.arrivillaga Dec 24 '22 at 23:47
  • @PhilipMassey it doesn't actually copy the string. This is easy to verify, `s = 'foo'; print(s is s.removeprefix('bar'))`. – juanpa.arrivillaga Dec 24 '22 at 23:48
  • Ah, no, wait; I noticed a subtlety. – Karl Knechtel Dec 24 '22 at 23:48
  • Here's the actual implementation: https://github.com/python/cpython/blob/046cbc2080360b0b0bbe6ea7554045a6bbbd94bd/Objects/unicodeobject.c#L11933 I am not sure what [`unicode_result_unchanged`](https://github.com/python/cpython/blob/046cbc2080360b0b0bbe6ea7554045a6bbbd94bd/Objects/unicodeobject.c#L609) is doing... but there are a lot of subtleties going on underneath the hood for unicode objects. – juanpa.arrivillaga Dec 24 '22 at 23:50
  • @juanpa.arrivillaga perhaps for you it's easy to verify :) thanks for the reminder to update Python. – Karl Knechtel Dec 24 '22 at 23:58

1 Answer

[:] is an old idiom for copying sequences. Nowadays, lists have an idiomatic .copy() method; there isn't normally a good reason to copy strings, since they are immutable, so the str class doesn't provide such a method. Furthermore, CPython optimizes a full slice of a plain str to return the same object rather than a copy, so [:] may well return the same instance anyway.
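As a rough illustration, the snippet below compares the two list-copy spellings and then takes a full slice of a plain string. The variable names are made up for the example, and whether the sliced string is the very same object is an implementation detail rather than a language guarantee.

```python
# Two spellings of a shallow list copy: both give a new, equal list.
items = [1, 2, 3]
old_style = items[:]
new_style = items.copy()
print(old_style == items, old_style is items)  # True False
print(new_style == items, new_style is items)  # True False

# A full slice of a plain str is always equal to the original.
text = "hello world"
sliced = text[:]
print(sliced == text)  # True

# Identity is implementation-dependent; CPython typically reuses the object.
print(sliced is text)
```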

So, why include it in code like this?

Because str can be subclassed. The clue is in the text that follows the code block in the PEP:

When the arguments are instances of str subclasses, the methods should behave as though those arguments were first coerced to base str objects, and the return value should always be a base str.

Suppose we had a user-defined subclass:

class MyString(str):
    ...

Notice what happens when we slice an instance to copy it:

>>> type(MyString('xyz')[:])
<class 'str'>

In the example implementation, therefore, the [:] ensures that an instance of the base str type will be returned, conforming to the text specification.
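To make the difference concrete, here is a minimal sketch that runs the PEP's reference logic as a free function next to a variant that returns the argument unchanged. The names removeprefix_sketch and removeprefix_no_slice are hypothetical helpers for this comparison only, and the last line assumes Python 3.9 or later, where str.removeprefix exists.

```python
def removeprefix_sketch(s: str, prefix: str) -> str:
    """Mirror of the PEP 616 reference code, written as a plain function."""
    if s.startswith(prefix):
        return s[len(prefix):]
    else:
        return s[:]  # slicing a str subclass yields a base str


def removeprefix_no_slice(s: str, prefix: str) -> str:
    """Hypothetical variant that returns the argument itself."""
    if s.startswith(prefix):
        return s[len(prefix):]
    return s  # a subclass instance passes through unchanged


class MyString(str):
    pass


value = MyString("xyz")
with_slice = removeprefix_sketch(value, "abc")      # prefix does not match
without_slice = removeprefix_no_slice(value, "abc")

print(type(with_slice).__name__)     # str
print(type(without_slice).__name__)  # MyString

# The built-in method (Python 3.9+) also returns a base str here.
print(type(value.removeprefix("abc")).__name__)     # str
```

Without the slice, the subclass type leaks through whenever the prefix does not match, which is exactly the case the PEP's wording rules out.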

Karl Knechtel