
As I only now noticed after commenting on this answer, slices in Python 3 return shallow copies of whatever they're slicing rather than views. Why is this still the case? Even leaving aside numpy's usage of views rather than copies for slicing, the fact that dict.keys, dict.values, and dict.items all return views in Python 3, and that there are many other aspects of Python 3 geared towards greater use of iterators, makes it seem that there would have been a movement towards slices becoming similar. itertools does have an islice function that makes iterative slices, but that's more limited than normal slicing and does not provide view functionality along the lines of dict.keys or dict.values.
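
To make the contrast concrete, here is a small sketch using only built-in types and `itertools.islice`; the variable names are illustrative. It shows a dict view tracking later changes, a list slice staying frozen, and an `islice` being exhausted after one pass:

```python
from itertools import islice

# A dict view is live: it reflects keys added after the view was created.
d = {"a": 1, "b": 2}
keys = d.keys()
d["c"] = 3
keys_after = list(keys)        # ['a', 'b', 'c']

# A list slice is an independent shallow copy.
a = [1, 2, 3, 4, 5]
b = a[::2]
a[0] = 0                       # b is unaffected and stays [1, 3, 5]

# islice is lazy but one-shot, and supports neither indexing nor assignment.
lazy = islice(a, 0, None, 2)
first_pass = list(lazy)        # [0, 3, 5]
second_pass = list(lazy)       # [] because the iterator is exhausted
```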

As well, the fact that you can use assignment to slices to modify the original list, but slices are themselves copies and not views, is a contradictory aspect of the language and seems like it violates several of the principles illustrated in the Zen of Python.

That is, the fact you can do

>>> a = [1, 2, 3, 4, 5]
>>> a[::2] = [0, 0, 0]
>>> a
[0, 2, 0, 4, 0]

But not (the result shown is what a view would give; in current Python, a is left unchanged)

>>> a = [1, 2, 3, 4, 5]
>>> a[::2][0] = 0
>>> a
[0, 2, 3, 4, 5]

or something like

>>> a = [1, 2, 3, 4, 5]
>>> b = a[::2]
>>> b
view(a[::2] -> [1, 3, 5])   # numpy doesn't explicitly state that its slices are views, but it would probably be a good idea to do it in some way for regular Python
>>> b[0] = 0
>>> b
view(a[::2] -> [0, 3, 5])
>>> a
[0, 2, 3, 4, 5]

Seems somewhat arbitrary/undesirable.
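
For reference, the behaviour being asked for can be approximated with a small wrapper. This is only a sketch: `ListView` is a hypothetical class made up for illustration, not something Python or numpy provides, and it handles integer indexing only.

```python
class ListView:
    """Hypothetical write-through view over a slice of a list (illustration only)."""

    def __init__(self, base, key):
        self._base = base
        # Resolve the slice once, against the current length of the base list.
        self._indices = range(*key.indices(len(base)))

    def __len__(self):
        return len(self._indices)

    def __getitem__(self, i):
        # Raises IndexError past the end, which also ends iteration.
        return self._base[self._indices[i]]

    def __setitem__(self, i, value):
        self._base[self._indices[i]] = value

    def __repr__(self):
        return f"view({list(self)!r})"


a = [1, 2, 3, 4, 5]
b = ListView(a, slice(None, None, 2))
b[0] = 0
print(b)  # view([0, 3, 5])
print(a)  # [0, 2, 3, 4, 5]
```

Even this sketch has an awkward case: the indices are fixed when the view is created, so inserting into or deleting from `a` afterwards leaves the view pointing at the wrong elements.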

I'm aware of http://www.python.org/dev/peps/pep-3099/ and the part where it says "Slices and extended slices won't go away (even if the __getslice__ and __setslice__ APIs may be replaced) nor will they return views for the standard object types.", but the linked discussion provides no mention of why the decision about slicing with views was made; in fact, the majority of the comments on that specific suggestion out of the suggestions listed in the original post seemed to be positive.

What prevented something like this from being implemented in Python 3.0, which was specifically designed to not be strictly backwards-compatible with Python 2.x and thus would have been the best time to implement such a change in design, and is there anything that may prevent it in future versions of Python?

JAB
  • I suspect the answer is likely that views would not work for technical reasons involving the low-level implementation of Python lists (which are optimised for speed). I will try to have a look around in the source code to back this up. – Katriel Aug 01 '11 at 18:06
  • But then, we need an alternative for my_list[:], doing "from copy import shallowcopy" would be awful. – utdemir Aug 01 '11 at 18:06
  • @utdemir One alternative is `list(my_list)`. It's not as pretty, but it's still fairly concise. – Ponkadoodle Aug 01 '11 at 18:14
  • @Wallacoloo: It also allows for intermixing of lists and iterators/generators/etc. (On the other hand, it would convert any tuples to lists, which may not be desirable if you want to keep the immutability.) – JAB Aug 01 '11 at 18:24
  • Actually, I suspect this would also cause a lot of subtle bugs in porting to Python 3, even if 2to3 did handle it. – Katriel Aug 01 '11 at 18:51
  • @ktrielalex: I can see how there could be subtle bugs in such cases as wrapping all `__getitem__(slice_object)` calls with calls to `list()`, due to how any type could have `__getitem__` defined, in which case you'd have to do a replacement like `myObjCopy = myObj[:]` -> `myObjCopy = type(myObj)(myObj)` or `myObj[::2]` -> `type(myObj)(myObj[::2])`. All built-in types already accept instances of themselves as valid arguments to their initializers and I think always return a copy of the original, so there's no more need to worry about that than if you were doing a copy of a non-builtin type. – JAB Aug 02 '11 at 14:26
  • (There's also the possibility of simply inserting `import copy` at the beginning of a file when such syntax is encountered and wrapping the `__getitem__` access with `copy.copy()`.) Of course, things would get difficult with all this if 2to3 were then applied to a script involving numpy, so there'd have to be consideration for that. (Or a note could simply be made to set that specific option to be ignored when 2to3 is applied to a numpy script, but then you'd get problems if numpy slicing ended up being mixed with built-in slicing.) – JAB Aug 02 '11 at 14:31
  • Ultimately, though, 2to3 isn't perfect even as it is now, and quite often after running 2to3 on a Python 2 script I've found myself having to go in and edit certain things manually to get it working in Python 3. – JAB Aug 02 '11 at 14:32
  • I was also hoping that P3K would fix [the treatment of negative indices](http://stackoverflow.com/questions/399067/extended-slice-that-goes-to-beginning-of-sequence-with-negative-stride) that effectively prevents using anything other than hard coded literals for negative indices. – max Apr 11 '12 at 20:13

2 Answers


"As well, the fact that you can use assignment to slices to modify the original list, but slices are themselves copies and not views."

Hmm, that's not quite right, although I can see how you might think that. In other languages, a slice assignment such as:

a[b:c] = d

is equivalent to

tmp = a.operator[](slice(b, c)) # which returns some sort of reference
tmp.operator=(d)        # which has a special meaning for the reference type.

But in Python, the first statement is actually converted to this:

a.__setitem__(slice(b, c), d)

Which is to say that item assignment is specially recognized in Python, with its own meaning separate from item lookup followed by assignment; the two operations may be unrelated. This is consistent with Python as a whole, because Python has no concept like the "lvalues" found in C/C++. There's no way to overload the assignment operator itself, only the specific cases where the left side of the assignment is not a plain identifier.
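
A quick way to see this is with a toy class that records which special method each statement triggers. `Recorder` is a made-up name for illustration only:

```python
class Recorder:
    """Toy container that records which special method each statement triggers."""

    def __init__(self):
        self.calls = []

    def __getitem__(self, key):
        self.calls.append(("__getitem__", key))
        return [0, 0, 0]  # a fresh list, unrelated to this object

    def __setitem__(self, key, value):
        self.calls.append(("__setitem__", key, value))


r = Recorder()
r[1:3] = "d"      # calls __setitem__ only; __getitem__ is never involved
r[1:3][0] = "x"   # calls __getitem__, then assigns into the returned list
print(r.calls)
```

The first statement never touches `__getitem__`, and the second never touches `__setitem__` on `r`, which is why slice assignment working says nothing about what a slice lookup returns.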

Suppose lists did have views, and you tried to use one:

myView = myList[1:10]
yourList = [1, 2, 3, 4]
myView = yourList

In languages other than Python, there might be a way to shove yourList into myList, but in Python, since the name myView appears as a bare identifier, it can only mean a variable assignment; the view is lost.
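
Lists have no views, so as a stand-in this sketch uses the standard library's `memoryview`, which does give slice views over buffer objects such as `bytearray`. It shows the same distinction between writing through a view and rebinding the name that holds it:

```python
data = bytearray(b"abcdef")
view = memoryview(data)[1:3]   # slicing a memoryview yields a view, not a copy

view[:] = b"XY"                # slice assignment writes through to `data`
after_write = bytes(data)      # b'aXYdef'

view = b"ZZ"                   # plain assignment only rebinds the name `view`
after_rebind = bytes(data)     # still b'aXYdef'; the view object is simply dropped
```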

SingleNegationElimination
  • Numpy, at least, accounts for the bare-identifier bit by using `myView[:] = yourList` for that. The same thing could be done in regular Python. `a[:]`, syntactic sugar for `a.__getitem__(slice(None))`, would return a view of all items in `a` (or of the items in the underlying object referenced by `a` if `a` is itself a view) rather than a shallow copy of `a` by default. That in itself may actually be slightly unpythonic, but it doesn't seem like it'd be any more so than what's already done. – JAB Aug 01 '11 at 19:43
  • If you like numpy so much.. why not just use numpy? `np.array((), np.object_)` – SingleNegationElimination Aug 04 '11 at 17:15
  • Because the original intent was for something that would mesh better with Python's builtin types. And as shown in my own answer, while using numpy does indeed seem like the best solution in this situation, it's still not optimal, unless a comparatively simple method that I'm unaware of exists for expanding an ndarray via slicing such that the object remains the same. – JAB Aug 04 '11 at 17:32

Well it seems I found a lot of the reasoning behind the views decision, going by the thread starting with http://mail.python.org/pipermail/python-3000/2006-August/003224.html (it's primarily about slicing strings, but at least one e-mail in the thread mentions mutable objects like lists), and also some things from:

http://mail.python.org/pipermail/python-3000/2007-February/005739.html
http://mail.python.org/pipermail/python-dev/2008-May/079692.html and following e-mails in the thread

Looks like the advantages of switching to this style for base Python would be vastly outweighed by the induced complexity and various undesirable edge cases. Oh well.

...And as I then started wondering about the possibility of replacing the current handling of slice objects with an iterable form a la itertools.islice, just as zip, map, etc. all return iterators instead of lists in Python 3, I started realizing all the unexpected behavior and possible problems that could come out of that. Looks like this might be a dead end for now.
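
As a rough illustration of the kind of surprises involved, here is a sketch of two ways `itertools.islice` differs from ordinary slicing: it rejects a negative step, and it consumes the iterator it is given. The variable names are illustrative.

```python
from itertools import islice

a = [1, 2, 3, 4, 5]

forward = list(islice(a, 1, None, 2))   # [2, 4], same as a[1::2]

# Unlike a[::-1], a negative step is rejected outright.
try:
    islice(a, None, None, -1)
    negative_step_ok = True
except ValueError:
    negative_step_ok = False

# islice advances the iterator it is given, so later consumers see less.
it = iter(a)
head = list(islice(it, 2))   # [1, 2]
rest = list(it)              # [3, 4, 5]
```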

On the plus side, numpy's arrays are fairly flexible, so in situations where this sort of thing might be necessary, it wouldn't be too hard to use one-dimensional ndarrays instead of lists. However, it seems ndarrays don't support using slicing to insert additional items within arrays, as happens with Python lists:

>>> a = [0, 0]
>>> a[:1] = [2, 3]
>>> a
[2, 3, 0]

I think the numpy equivalent would instead be something like this:

>>> a = np.array([0, 0])  # or a = np.zeros([2]), but that's not important here
>>> a = np.hstack(([2, 3], a[1:]))
>>> a
array([2, 3, 0])

A slightly more complicated case:

>>> a = [1, 2, 3, 4]
>>> a[1:3] = [0, 0, 0]
>>> a
[1, 0, 0, 0, 4]

versus

>>> a = np.array([1, 2, 3, 4])
>>> a = np.hstack((a[:1], [0, 0, 0], a[3:]))
>>> a
array([1, 0, 0, 0, 4])

And, of course, the above numpy examples don't store the result in the original array as happens with the regular Python list expansion.
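
The difference can be checked with plain lists alone. In this sketch, list concatenation stands in for the `np.hstack` approach (an assumption made to keep the example free of third-party imports); both build a new object rather than growing the existing one:

```python
# Slice assignment grows the list in place, so other references see the change.
a = [1, 2, 3, 4]
alias = a
a[1:3] = [0, 0, 0]
same_object = alias is a           # True; alias is now [1, 0, 0, 0, 4]

# Rebuilding from pieces creates a new list and rebinds the name.
b = [1, 2, 3, 4]
alias_b = b
b = b[:1] + [0, 0, 0] + b[3:]
rebuilt_same_object = alias_b is b  # False; alias_b is still [1, 2, 3, 4]
```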

JAB
  • Could you give one example of why doing this with builtin Python slice would create undesirable edge cases? – max Apr 11 '12 at 19:51