114

Why doesn't 'example'[999:9999] result in an error? Since 'example'[9] does, what is the motivation behind it?

From this behavior I can assume that 'example'[3] is, essentially/internally, not the same as 'example'[3:4], even though both result in the same 'm' string.

martineau
ijverig
  • 26
    `[999:9999]` isn't an index, it's a slice, and has different semantics. From the python intro: "Degenerate slice indices are handled gracefully: an index that is too large is replaced by the string size, an upper bound smaller than the lower bound returns an empty string." – Wooble Feb 28 '12 at 21:35
  • 2
    @Wooble that is the actual answer – jondavidjohn Feb 28 '12 at 21:38
  • 2
    @Wooble And do you know why it’s this way? Thank you for your clarification. – ijverig Feb 28 '12 at 21:48
  • Why? You'd have to ask Guido, but I think it's elegant to be able to assume a slice is always the same type of sequence as the original sequence, myself. – Wooble Feb 28 '12 at 23:37
  • @Wooble Yep, true. I think I’ve misunderstood [a:b] as if it was a [a..b] kind of thing… – ijverig Feb 28 '12 at 23:51
  • Has someone actually used this behavior in real code? Slices represent a subset of a list's indexes, and by definition a subset is included in the set. For me, if foo = [0,1,2,3] and I slice foo[-42:1337], then it looks more like a bug than an intended slice. The thing is, we have wonderful objects to signal a bug to the user: exceptions. Maybe this idea was borrowed from PHP... – Lapinot Feb 03 '16 at 20:40
  • 1
    @Lapinot yes I've written code that depends on this behavior. Unfortunately I can't remember the exact code so I can't tell you why. Probably had to do with substrings; getting an empty string can be exactly what you want at times. – Mark Ransom Nov 16 '18 at 03:44
  • @Lapinot I think subset isn't quite right. "Intersection" would be more correct, and according to that idiom this behavior makes a lot of sense. I've used this frequently in cases where I want to iterate over a subsequence that may or may not exist. If an empty list is returned, the for block just doesn't execute at all. This saves at least one and often two or three explicit if statements, and avoids the overhead of exception handling in tight loops. – senderle Dec 14 '18 at 19:06

3 Answers

81

You're correct! 'example'[3:4] and 'example'[3] are fundamentally different, and slicing outside the bounds of a sequence (at least for built-ins) doesn't cause an error.

It might be surprising at first, but it makes sense when you think about it. Indexing returns a single item, but slicing returns a subsequence of items. So when you try to index a nonexistent value, there's nothing to return. But when you slice a sequence outside of bounds, you can still return an empty sequence.
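A minimal sketch of that contrast, using the string from the question (the variable names are only illustrative):

```python
s = 'example'  # 7 characters, valid indices 0..6

# Indexing must produce exactly one item, so an out-of-range index raises.
try:
    s[9]
    index_failed = False
except IndexError:
    index_failed = True

# Slicing produces a (possibly empty) subsequence, so it has no need to raise.
empty = s[999:9999]
partial = s[5:9999]

print(index_failed)  # True
print(repr(empty))   # ''
print(partial)       # le
```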

Part of what's confusing here is that strings behave a little differently from lists. Look what happens when you do the same thing to a list:

>>> [0, 1, 2, 3, 4, 5][3]
3
>>> [0, 1, 2, 3, 4, 5][3:4]
[3]

Here the difference is obvious. In the case of strings, the results appear to be identical because in Python, there's no such thing as an individual character outside of a string. A single character is just a 1-character string.
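The same point can be checked by comparing the two results directly. This is only a sketch; `items` is an illustrative name for the list used above:

```python
s = 'example'
items = [0, 1, 2, 3, 4, 5]

# For a string, an item and a one-item slice are both str and compare equal.
print(type(s[3]) is type(s[3:4]), s[3] == s[3:4])  # True True

# For a list, the item is an int while the slice is a list.
print(items[3] == items[3:4])  # False
```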

(For the exact semantics of slicing outside the range of a sequence, see mgilson's answer.)

senderle
  • 1
    An index out of range could have returned `None` instead of erroring out - that's the usual Python convention when you have nothing to return. – Mark Ransom Feb 28 '12 at 22:30
  • 11
    @MarkRansom, that's true; but returning `None` in this case would make it harder to tell between an out-of-bounds index and a `None` value inside a list. But even if there were a workaround for that, it remains clear to me that returning an empty sequence is the right thing to do when given an out-of-bounds slice. It's analogous to performing the union of two disjoint sets. – senderle Feb 28 '12 at 22:49
  • Just to be clear, I didn't say you were wrong. I see your point about `None` values in a list. – Mark Ransom Feb 28 '12 at 22:53
  • 1
    @MarkRansom, I know -- sorry if I sounded defensive. Really I just wanted an excuse to refer to set theory :). – senderle Feb 28 '12 at 23:02
  • 5
    Aw, except I said "union" instead of "intersection." – senderle Apr 09 '14 at 14:34
  • I understand that reading slices does not do bounds checking, but I don't fully understand why writing slices doesn't. It's strange that the following does not raise an error: `x = [1, 2, 3]; x[100:105] = [4]` – speedplane Nov 14 '16 at 23:14
  • @speedplane, I'd never seen that example before, and it does seem a little odd. Within a list it makes sense to me: `x = [1, 2, 3, 4, 5]; x[1:4] = [-1]; x == [1, -1, 5]` looks reasonable. "Replace the selected portion of the list with this new list," it seems to say. So the question is, why does `x[100:105]` select the empty list at the end of `x`. And I think the answer must be that there really isn't a less surprising option. Extend the list with padding? _Very_ surprising! No error for reading but error for writing? Somewhat surprising. Appending the value? Just a little surprising. – senderle Nov 15 '16 at 00:58
  • With the benefit of a few years of hindsight, the `None` in a list argument doesn't hold. There's a definite difference between `None` and `[None]`. I still don't disagree with your answer though. – Mark Ransom Nov 16 '18 at 03:47
  • It's been a while since I thought about it, but I guess I was worried about telling between "None, because that is the value in the list at i" and "None, because the list isn't long enough to have an element at i, and we're avoiding throwing an exception." Is there a way for the None vs [None] distinction to help in that case? – senderle Dec 14 '18 at 18:24
  • I wish Python took it a bit further. x = [1,2,3,4,5], x[2:5] is like [x[i] for i in 2:5 if i in 0:len(x)]. In Matlab, you can pass any array of indices. So equivalently: x[[0,1,1,0]] would return [1,2,2,1]. Of course this can be done with list comprehension, but so could conventional slices – Jacob Lee Aug 02 '20 at 18:57
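The slice-assignment case raised in the comments above can be checked directly. A short sketch using the same values as those comments (`y` is just a second illustrative list):

```python
x = [1, 2, 3]
x[100:105] = [4]  # the slice selects the empty range at the end of x
print(x)          # [1, 2, 3, 4]

y = [1, 2, 3, 4, 5]
y[1:4] = [-1]     # replaces three items with one
print(y)          # [1, -1, 5]
```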
42

For the sake of adding an answer that points to a robust section in the documentation:

Given a slice expression like s[i:j:k],

The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k. In other words, the indices are i, i+k, i+2*k, i+3*k and so on, stopping when j is reached (but never including j). When k is positive, i and j are reduced to len(s) if they are greater.

If you write s[999:9999], Python returns s[len(s):len(s)], since len(s) < 999 and your step is positive (1, the default).
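One way to see this adjustment is the built-in `slice.indices(length)` method, which returns the `(start, stop, step)` values that a slice resolves to for a sequence of the given length. A small sketch, assuming the `'example'` string from the question:

```python
s = 'example'

# slice.indices(length) reports the bounds used for a sequence of that length.
start, stop, step = slice(999, 9999).indices(len(s))
print(start, stop, step)  # 7 7 1

# The out-of-range slice and the adjusted slice give the same (empty) result.
print(s[999:9999] == s[start:stop:step] == '')  # True

# With a positive step, out-of-range negative bounds resolve to 0.
print(slice(-100, 2).indices(len('bac')))  # (0, 2, 1)
```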

wjandrea
mgilson
  • Presumably when `k` is positive, `i` and `j` are also increased to `-len(s)` when they are lesser? e.g. `s = 'bac'; s[-100:2] == s[-len(s):2]` – Chris_Rands Jul 04 '17 at 12:56
  • @Chris_Rands When `k` is positive, Python will scale `i` and `j` so that they fit the bounds of the sequence. In your example, `s[-100:2] == s[0:2]` (`== s[-len(s):2]`, by the way). Similarly, `s[-100:100] == s[0:3]`. – tylerc0816 Aug 18 '17 at 13:26
  • Nice, thanks. This is a better response to @speedplane's comment above. – senderle Dec 14 '18 at 18:51
7

Slicing is not bounds-checked by the built-in types. And although both of your examples appear to have the same result, they work differently; try them with a list instead.

Ignacio Vazquez-Abrams