
Let me preface this by saying I'm new to Python, come from Ruby, and I don't have much specific knowledge about how Python works.

For one of my current projects, I'm creating a new feature in a computational chemistry Django application that reads in PDBs and then does calculations on them. After adding my code, I got an error saying that Python couldn't convert a string to a float, so I looked at the library that parses the PDBs.

I was quickly confused by how Python's slice notation works. For example:

str = 'Hello this is Josh'
str[0:2] #=> 'He'
str[2] #=> 'l'

I expected str[0:2] to return 'Hel', not 'He', since indices 0 through 2 cover 3 characters.

Is there a reason it works this way? Why does str[m:n] return the characters from index m to n-1, rather than from m to n?

josh

1 Answer


It's so that:

str[0:2] + str[2:4] == str[0:4]

And

str[0:len(str)] == str

In general, it's conventional to define ranges of numbers this way: inclusive of the first listed number and exclusive of the second (a half-open interval).

Edsger Dijkstra wrote up a fairly well-known argument for both this convention and the convention of starting array indices at 0.
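To see the two identities above in action, here is a minimal sketch. The variable name `s` and the split-point loop are my own additions for illustration (using `s` rather than `str` avoids shadowing Python's built-in `str` type); the sample string is the one from the question.

```python
# Sample string from the question; `s` is a hypothetical name chosen
# so that the built-in `str` type is not shadowed.
s = 'Hello this is Josh'

# Adjacent half-open slices share a boundary index and join with no
# overlap and no gap.
assert s[0:2] + s[2:4] == s[0:4]

# The same holds for every split point k, including both ends.
for k in range(len(s) + 1):
    assert s[:k] + s[k:] == s

# Slicing from 0 to len(s) gives back the whole string.
assert s[0:len(s)] == s

# To get the first three characters, the stop index is 3, not 2.
first_three = s[0:3]
print(first_three)
```

With an inclusive upper bound, the first identity would need `s[0:1] + s[2:3]`, which is easier to get wrong by one.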

Brian Campbell
  • 2
    Ah I see. So the slice notation doesn't treat the two numbers as an inclusive `[m,n]` range as Ruby does, but rather a `[m,n)` range, which was my major confusion. – josh Apr 14 '14 at 04:42
  • 3
    Additionally, while you may have intuitively known (thought) that `[0:2]` was length 3, what about `[137:418]`? The advantage of Python's way, as the Dijkstra link notes, is that `n - m` is the length of the subsequence. (e.g., `[0:2]` is length `2-0=2` and `[137:418]` is length `418-137=281`) – Two-Bit Alchemist Apr 14 '14 at 04:48
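The length rule from the last comment can be checked directly. This is only a sketch: `slice_length` is a hypothetical helper, not anything built into Python, and it assumes `0 <= m <= n <= len(seq)` (slices that run past the end of the sequence are clipped, so the rule doesn't apply to them as-is).

```python
def slice_length(m, n):
    """Length of seq[m:n], assuming 0 <= m <= n <= len(seq).

    Hypothetical helper for illustration only.
    """
    return n - m


# A filler string long enough that both slices stay in bounds.
filler = 'x' * 500

assert len(filler[0:2]) == slice_length(0, 2) == 2
assert len(filler[137:418]) == slice_length(137, 418) == 281
```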