25

I am just learning Python and I am going through the tutorial at https://developers.google.com/edu/python/strings

Under the String Slices section, it says:

s[:] is 'Hello' -- omitting both always gives us a copy of the whole thing (this is the pythonic way to copy a sequence like a string or list)

Out of curiosity, why wouldn't you just use the = operator?

s = 'hello'
bar = s[:]
foo = s

As far as I can tell both bar and foo have the same value.

Steven Smethurst
  • 6
    I would disagree with the sentence, this is not the pythonic way to copy a string. You are correct that you would just use an `=`. For lists it's a different story. – wim Jan 21 '13 at 07:08

4 Answers

42

= binds another name to the same object, whereas [:] creates a copy. For strings, which are immutable, this doesn't really matter, but for lists and other mutable sequences it is crucial.

>>> s = 'hello'
>>> t1 = s
>>> t2 = s[:]
>>> print s, t1, t2
hello hello hello
>>> s = 'good bye'
>>> print s, t1, t2
good bye hello hello

but:

>>> li = [1,2]
>>> li1 = li
>>> li2 = li[:]
>>> print li, li1, li2
[1, 2] [1, 2] [1, 2]
>>> li[0] = 0
>>> print li, li1, li2
[0, 2] [0, 2] [1, 2]
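
The same distinction can be checked directly with the `is` operator, which tests whether two names refer to the same object. This is a small sketch written for Python 3 (the transcript above is Python 2), reusing the names from the list example:

```python
li = [1, 2]
li1 = li       # another name for the same list object
li2 = li[:]    # a new list holding the same elements

print(li1 is li)    # True: same object
print(li2 is li)    # False: distinct object
print(li2 == li)    # True: equal contents

li.append(3)        # mutate through the original name
print(li1)          # [1, 2, 3]
print(li2)          # [1, 2]
```

For the built-in `str`, CPython may hand back the original object from `s[:]`. That is harmless because strings cannot be mutated, and it is an implementation detail rather than a language guarantee.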

So why use it when dealing with strings? The built-in strings are immutable, but whenever you write a library function expecting a string, a user might give you something that "looks like a string" and "behaves like a string", but is a custom type. This type might be mutable, so it's better to take care of that.

Such a type might look like:

class MutableString(object):
    def __init__(self, s):
        self._characters = [c for c in s]

    def __str__(self):
        return "".join(self._characters)

    def __repr__(self):
        return "MutableString(\"%s\")" % str(self)

    def __getattr__(self, name):
        return str(self).__getattribute__(name)

    def __len__(self):
        return len(self._characters)

    def __getitem__(self, index):
        return self._characters[index]

    def __setitem__(self, index, value):
        self._characters[index] = value

    def __getslice__(self, start, end=-1, stride=1):
        return str(self)[start:end:stride]


if __name__ == "__main__":
    m = MutableString("Hello")
    print m
    print len(m)
    print m.find("o")
    print m.find("x")
    print m.replace("e", "a") #translate to german ;-)
    print m
    print m[3]
    m[1] = "a"
    print m
    print m[:]

    copy1 = m
    copy2 = m[:]
    print m, copy1, copy2
    m[1] = "X"
    print m, copy1, copy2

Disclaimer: This is just a sample to show how it could work and to motivate the use of [:]. It is untested, incomplete, and probably performs horribly.
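
Note that `__getslice__` exists only in Python 2; in Python 3, slicing is routed through `__getitem__` with a `slice` object. Below is a minimal Python 3 sketch of the same idea. It assumes that a slice should return an independent `MutableString`, which is a design choice made here for illustration (the class above returns a plain `str` instead), and it leaves out the `__getattr__` delegation.

```python
class MutableString:
    """Hypothetical string-like type whose characters can be changed in place."""

    def __init__(self, s):
        self._characters = list(str(s))

    def __str__(self):
        return "".join(self._characters)

    def __repr__(self):
        return "MutableString(%r)" % str(self)

    def __len__(self):
        return len(self._characters)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice builds a new object with its own character list.
            return MutableString("".join(self._characters[index]))
        return self._characters[index]

    def __setitem__(self, index, value):
        self._characters[index] = value


m = MutableString("Hello")
alias = m       # same object as m
copy = m[:]     # independent copy
m[1] = "a"
print(m, alias, copy)   # Hallo Hallo Hello
```

With a mutable type like this, a function that keeps `alias` would see later changes made by the caller, while one that keeps `copy` would not.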

Thorsten Kranz
  • 2
    Please tell me you had such a class lying around already, and didn't put together that whole thing just for an answer like this :) – Karl Knechtel Jan 21 '13 at 09:28
  • Sorry, I have to disappoint you ;-) If you look at the commit-history of my answer, you can see it grow. – Thorsten Kranz Jan 21 '13 at 09:29
  • 2
    Using `[:]` on an arbitrary object does not guarantee that you get a copy. For example, a NumPy array returns a view on the original mutable data. A custom mutable string class might do the same. – Janne Karila Jan 21 '13 at 10:27
  • 1
    Sure, you have `copy`-module or specific methods for this, usually. I never stated that `[:]` creates a copy for any subscriptable object, only for the built-in types this is True. Nevertheless, if you ever come to write a custom string-like type and want to use it as parameter to a function expecting a string, you should definitely also make your type behave like a string. This includes behaviour on `[:]`. This is the principle of duck typing. If you don't stick to this, you'd better have good reasons to do so. – Thorsten Kranz Jan 21 '13 at 10:34
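
The point raised in the comments, that `[:]` on an arbitrary object need not produce a copy, can be reproduced with the standard library alone. The sketch below (Python 3) uses `memoryview` as a stand-in for the NumPy case, since slicing a `memoryview` yields another view of the same buffer, and contrasts it with an explicit `copy.copy`:

```python
import copy

buf = bytearray(b"hello")
view = memoryview(buf)

same_data = view[:]          # slicing a memoryview gives another view, not a copy
same_data[0] = ord("j")      # writes through to the underlying bytearray
print(buf)                   # bytearray(b'jello')

independent = copy.copy(buf)    # explicit shallow copy via the copy module
independent[0] = ord("c")
print(buf)                   # bytearray(b'jello'), unchanged by the line above
print(independent)           # bytearray(b'cello')
```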
1

They have the same value, but there is a fundamental difference when dealing with mutable objects.

Say foo = [1, 2, 3]. You assign bar = foo and baz = foo[:]. Now suppose you change bar with bar.append(4). You check the value of foo, and...

print foo
# [1, 2, 3, 4]

Now where did that extra 4 come from? bar was bound to the same object as foo, so a change made through one name is visible through the other. If you then change baz with baz.append(5), nothing happens to the other two, because baz was assigned a copy of foo.

Note, however, that because strings are immutable, the distinction doesn't matter for them.
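
The steps described above, written out as a runnable Python 3 sketch with the same names:

```python
foo = [1, 2, 3]
bar = foo       # bar and foo name the same list
baz = foo[:]    # baz is a separate list with the same elements

bar.append(4)
print(foo)      # [1, 2, 3, 4]

baz.append(5)
print(foo)      # [1, 2, 3, 4]
print(bar)      # [1, 2, 3, 4]
print(baz)      # [1, 2, 3, 5]
```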

Volatility
0

If you have a list the result is different:

l = [1,2,3]
l1 = l
l2 = l[:]

l2 is a copy of l (a different object), while l1 is an alias of l. This means that l1[0] = 7 will also modify l, while l2[1] = 7 will not.

Emanuele Paolini
0

Referencing an object and referencing a copy of it make no practical difference for an immutable object like a string, but they do for mutable objects (and their mutating methods), such as lists.

The same assignments on a mutable object:

a = [1,2,3,4]
b = a
c = a[:]
a[0] = -1
print a    # will print [-1,2,3,4]
print b    # will print [-1,2,3,4]
print c    # will print [1,2,3,4]

A visualization on pythontutor of the above example - http://goo.gl/Aswnl.

siddharthlatest