77

So I just came across what seems to me like a strange Python feature and wanted some clarification about it.

The following array manipulation somewhat makes sense:

p = [1, 2, 3]
p[3:] = [4]
# p is now [1, 2, 3, 4]

I imagine it is actually just appending this value to the end, correct?
Why can I do this, however?

p[20:22] = [5, 6]
# p is now [1, 2, 3, 4, 5, 6]

And even more so this:

p[20:100] = [7, 8]
# p is now [1, 2, 3, 4, 5, 6, 7, 8]

This just seems like wrong logic. It seems like this should throw an error!

Any explanation?
-Is it just a weird thing Python does?
-Is there a purpose to it?
-Or am I thinking about this the wrong way?

Akaisteph7
  • 2
    In other languages I always end up writing this kind of thing all over the place: `if n > sequence.length(): return sequence.slice(0, sequence.length()) else: return sequence.slice(0, n)`. This is exactly the same as just using `sequence[:n]` in Python, which saves you an if statement and 2 calls to `length`. – Bakuriu Feb 10 '19 at 22:14
  • 4
    BTW, you can look at slices as "sets". So `p[20:22]` is a sequence of all elements with indices between 20 and 22. The empty set is a valid set. That is very different from saying `p[20]`, which asserts the existence of an element with index 20. Hence the difference in range-checking between looking up an element and looking up a slice reflects the two different meanings. – Bakuriu Feb 10 '19 at 22:18
  • 2
    I think this is a broader question about why assigning sequences to slices of a different length is allowed in Python and what its benefits are. The other question does not address the assignment part of this question at all. It just talks about the slicing. – Akaisteph7 Feb 11 '19 at 06:03

2 Answers

81

Part of question regarding out-of-range indices

Slice logic automatically clips the indices to the length of the sequence.

Allowing slice indices to extend past end points was done for convenience. It would be a pain to have to range check every expression and then adjust the limits manually, so Python does it for you.

Consider the use case of wanting to display no more than the first 50 characters of a text message.

The easy way (what Python does now):

preview = msg[:50]

Or the hard way (do the limit checks yourself):

n = len(msg)
preview = msg[:50] if n > 50 else msg

Manually implementing that logic for adjustment of end points would be easy to forget, would be easy to get wrong (updating the 50 in two places), would be wordy, and would be slow. Python moves that logic to its internals where it is succinct, automatic, fast, and correct. This is one of the reasons I love Python :-)
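To see the clipping directly, here is a minimal sketch using the built-in slice.indices() method, which reports the bounds a slice would use for a sequence of a given length. The helper name clipped_bounds and the sample message are made up for illustration.

```python
# Hypothetical helper (not part of Python): report the bounds a slice would really use.
def clipped_bounds(start, stop, length):
    """Return the (start, stop, step) tuple for a sequence of the given length."""
    return slice(start, stop).indices(length)

msg = "short message"                         # 13 characters
bounds = clipped_bounds(None, 50, len(msg))
print(bounds)                                 # (0, 13, 1)
print(msg[:50] == msg)                        # True

p = [1, 2, 3, 4]
print(clipped_bounds(20, 100, len(p)))        # (4, 4, 1): an empty slice at the end
print(p[20:100])                              # []
```

Both out-of-range bounds collapse to len(p), so the slice is empty rather than an error.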

Part of question regarding assignments whose length differs from the target slice

The OP also wanted to know the rationale for allowing assignments such as p[20:100] = [7,8] where the assignment target has a different length (80) than the replacement data length (2).

It's easiest to see the motivation by analogy with strings. Consider "five little monkeys".replace("little", "humongous"). Note that the target "little" has only six letters while "humongous" has nine. We can do the same with lists:

>>> s = list("five little monkeys")
>>> i = s.index('l')
>>> n = len('little')
>>> s[i : i+n ] = list("humongous")
>>> ''.join(s)
'five humongous monkeys'

This all comes down to convenience.
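As a rough sketch of the same idea on a list of words (the variable names here are only for illustration), a slice assignment can grow, shrink, or delete part of a list depending on how long the replacement is:

```python
words = ["five", "little", "monkeys"]

# Replace one item with three: the list grows.
words[1:2] = ["very", "very", "humongous"]
grown = list(words)
print(grown)      # ['five', 'very', 'very', 'humongous', 'monkeys']

# Replace three items with one: the list shrinks.
words[1:4] = ["tiny"]
shrunk = list(words)
print(shrunk)     # ['five', 'tiny', 'monkeys']

# Replace one item with nothing: the item is deleted.
words[1:2] = []
print(words)      # ['five', 'monkeys']
```

This applies to ordinary slices like the ones in the question, where the target and the replacement may have different lengths.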

Prior to the introduction of the copy() and clear() methods, these used to be popular idioms:

s[:] = []           # clear a list
t = u[:]            # copy a list

Even now, we use this to update lists when filtering:

s[:] = [x for x in s if not math.isnan(x)]   # filter-out NaN values

Hope these practical examples give a good perspective on why slicing works as it does.
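For the filtering idiom, the practical difference between s[:] = ... and s = ... is mutation versus rebinding. The sketch below uses two made-up helper functions and sample data to show that only the slice assignment is visible through other names that refer to the same list:

```python
import math

def drop_nans_in_place(values):
    # Slice assignment mutates the list object the caller passed in.
    values[:] = [x for x in values if not math.isnan(x)]

def drop_nans_rebind(values):
    # Plain assignment only rebinds the local name; the caller's list is untouched.
    values = [x for x in values if not math.isnan(x)]
    return values

readings = [1.0, float("nan"), 2.0]
alias = readings                     # a second name for the same list

cleaned = drop_nans_rebind(readings)
print(len(alias))                    # 3, the original list still holds the NaN

drop_nans_in_place(readings)
print(alias)                         # [1.0, 2.0]
print(alias is readings)             # True
```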

iacob
Raymond Hettinger
  • 2
    "Even now, we use this to update lists when filtering _[example using `s[:]`]_" — Could you expand on why you'd use `s[:] =` there, instead of just `s =`? I've never seen anyone use `s[:] =` in the context of a line such as what you wrote there. Good answer otherwise! – Quuxplusone Feb 10 '19 at 21:44
  • 10
    @Quuxplusone: Slice assignment *mutates* the list already referenced by `s`; using `s =` *re-binds* `s` to refer to a new list. If the list can be reached via multiple names, and you want the mutation to be visible to all the names, slice assignment is what you want. Also, if `s` were global, reassigning `s` would require a `global` declaration, but slice assignment would have a similar effect even without the `global` statement. – Daniel Pryden Feb 11 '19 at 00:29
25

The documentation has your answer:

s[i:j]: slice of s from i to j (note (4))

(4) The slice of s from i to j is defined as the sequence of items with index k such that i <= k < j. If i or j is greater than len(s), use len(s). If i is omitted or None, use 0. If j is omitted or None, use len(s). If i is greater than or equal to j, the slice is empty.

The documentation of IndexError confirms this behavior:

exception IndexError

Raised when a sequence subscript is out of range. (Slice indices are silently truncated to fall in the allowed range; if an index is not an integer, TypeError is raised.)

Essentially, if len(p) < 20, an expression like p[20:100] is reduced to p[len(p):len(p)]. That is an empty slice at the end of the list, and assigning a list to it inserts that list's items at the end. Thus, it works like extending the original list.

This behavior is the same as what happens when you assign a list to an empty slice anywhere in the original list. For example:

In [1]: p = [1, 2, 3, 4]

In [2]: p[2:2] = [42, 42, 42]

In [3]: p
Out[3]: [1, 2, 42, 42, 42, 3, 4]
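As a small sketch of the points above (the variable names are illustrative), assigning to an out-of-range slice gives the same result as list.extend(), while assigning to an out-of-range index raises IndexError:

```python
p = [1, 2, 3, 4]
q = [1, 2, 3, 4]

p[20:100] = [7, 8]     # both bounds clip to len(p): an empty slice at the end
q.extend([7, 8])
print(p == q)          # True

try:
    p[20] = 9          # indexing refers to one specific item, so nothing is clipped
except IndexError as exc:
    error_message = str(exc)
    print(error_message)
```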
Akaisteph7
iz_
  • 4
    I don't think OP is asking how slicing works, he's asking for the rationale behind the design choice. – Primusa Feb 10 '19 at 06:10
  • 3
    @Primusa - I believe they're asking _both_. This explains the _how_, which is good to know because it explains why the behavior isn't broken. The _why_ is probably buried in the depths of one of the mailing lists somewhere. – g.d.d.c Feb 10 '19 at 06:14
  • Good answer but this doesn't explain why the new numbers get appended to the end of the list. – Atirag Feb 10 '19 at 06:15
  • 1
    @Atirag I added a small blurb about it for completeness. – iz_ Feb 10 '19 at 06:22
  • It is a bit confusing though that p[len(p):len(p)] is empty but p[len(p)] is out of range. Following the logic from the former I would assume p[len(p)] =[c,d] would also append the values but it won't of course. – Atirag Feb 10 '19 at 06:29
  • 1
    @Atirag Indexing is very different from slicing; indexing always refers to values. – iz_ Feb 10 '19 at 06:30
  • Yeah I guess the definition is clear about it just the syntax is confusing. – Atirag Feb 10 '19 at 06:31
  • Thank you for the documentation. It made it easier to understand the how. – Akaisteph7 Feb 10 '19 at 08:00
  • @Programmer What else needs to be explained? Please add some more details as to what you think is wrong. – iz_ Apr 05 '19 at 23:13